### JavaScript port of Markus Kuhn's wcwidth() implementation

The following explanation comes from the original C implementation:

This is an implementation of wcwidth() and wcswidth() (defined in
IEEE Std 1003.1-2001) for Unicode.

http://www.opengroup.org/onlinepubs/007904975/functions/wcwidth.html
http://www.opengroup.org/onlinepubs/007904975/functions/wcswidth.html

In fixed-width output devices, Latin characters all occupy a single
"cell" position of equal width, whereas ideographic CJK characters
occupy two such cells. Interoperability between terminal-line
applications and (teletype-style) character terminals using the
UTF-8 encoding requires agreement on which character should advance
the cursor by how many cell positions. No established formal
standards exist at present on which Unicode character shall occupy
how many cell positions on character terminals. These routines are
a first attempt at defining such behavior based on simple rules
applied to data provided by the Unicode Consortium.

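As a rough illustration of the idea above, the following sketch computes the display width of a string by summing per-character cell widths. The names `WIDE_RANGES`, `sketchCharWidth`, and `sketchStringWidth` are assumptions made for this example and are not this library's API. The range table is a tiny subset of the real Unicode data, and control and combining characters, which the original C code treats specially, are ignored here.

```javascript
// Hypothetical, heavily abbreviated table of double-width ranges.
// The real implementation covers many more ranges than these three.
const WIDE_RANGES = [
  [0x4e00, 0x9fff], // CJK Unified Ideographs
  [0xac00, 0xd7a3], // Hangul Syllables
  [0xff00, 0xff60], // Fullwidth Forms
];

// Width of one code point in cells: 2 inside the ranges above, 1 otherwise.
function sketchCharWidth(codePoint) {
  for (const [first, last] of WIDE_RANGES) {
    if (codePoint >= first && codePoint <= last) return 2;
  }
  return 1;
}

// Width of a whole string: the sum of the widths of its code points.
function sketchStringWidth(text) {
  let cells = 0;
  // for...of iterates by code point, so surrogate pairs are not split.
  for (const ch of text) {
    cells += sketchCharWidth(ch.codePointAt(0));
  }
  return cells;
}
```

With this table, `sketchStringWidth("abc")` yields 3 cells and `sketchStringWidth("한글")` yields 4, since both Hangul syllables fall in a double-width range.
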
For some graphical characters, the Unicode standard explicitly
defines a character-cell width via the definition of the East Asian
Fullwidth (F), Wide (W), Halfwidth (H), and Narrow (Na) classes.
In all these cases, there is no ambiguity about which width a
terminal shall use. For characters in the East Asian Ambiguous (A)
class, the width choice depends purely on a preference of backward
compatibility with either historic CJK or Western practice.
Choosing single-width for these characters is easy to justify as
the appropriate long-term solution, as the CJK practice of
displaying these characters as double-width comes from historic
implementation simplicity (8-bit encoded characters were displayed
single-width and 16-bit ones double-width, even for Greek,
Cyrillic, etc.) and not from any typographic considerations.

Much less clear is the choice of width for the Not East Asian
(Neutral) class. Existing practice does not dictate a width for any
of these characters. It would nevertheless make sense
typographically to allocate two character cells to characters such
as EM SPACE or VOLUME INTEGRAL, which cannot be
represented adequately with a single-width glyph. The following
routines at present merely assign a single-cell width to all
neutral characters, in the interest of simplicity. This is not
entirely satisfactory and should be reconsidered before
establishing a formal standard in this area. At the moment, the
question of which Not East Asian (Neutral) characters should be
represented by double-width glyphs cannot yet be answered by
applying a simple rule from the Unicode database content. Setting
up a proper standard for the behavior of UTF-8 character terminals
will require a careful analysis not only of each Unicode character,
but also of each presentation form, something the author of these
routines has so far avoided doing.

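The class-based rules described above can be summarized as a lookup from East Asian Width class to cell count. This is a minimal sketch, not the port's actual code. The function name `cellsForClass` and the `ambiguousWide` flag are hypothetical; the flag merely models the choice between Western (single-width) and historic CJK (double-width) treatment of the Ambiguous class.

```javascript
// Map an East Asian Width class abbreviation to a cell count.
// `ambiguousWide` is a hypothetical option: false follows Western practice
// (single-width), true follows historic CJK practice (double-width).
function cellsForClass(eawClass, ambiguousWide = false) {
  switch (eawClass) {
    case "F": // Fullwidth
    case "W": // Wide
      return 2;
    case "H": // Halfwidth
    case "Na": // Narrow
      return 1;
    case "A": // Ambiguous: depends on the compatibility preference
      return ambiguousWide ? 2 : 1;
    case "N": // Neutral: single cell, chosen for simplicity
      return 1;
    default:
      throw new RangeError(`unknown East Asian Width class: ${eawClass}`);
  }
}
```

Keeping the Ambiguous case behind a single flag leaves every other class unaffected, which mirrors the text: only class A has a width that depends on a compatibility preference.
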
http://www.unicode.org/unicode/reports/tr11/

Markus Kuhn -- 2007-05-26 (Unicode 5.0)

Permission to use, copy, modify, and distribute this software
for any purpose and without fee is hereby granted. The author
disclaims all warranties with regard to this software.

Latest version: http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c