This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.
Date/Time: Tue Mar 1 09:45:40 CST 2016
Name: Ryusei Yamaguchi
Report Type: Public Review Issue
Opt Subject: Review for PRI #318 Proposed Update UAX #11, East Asian Width
1. The proposal notes that 774 neutral characters and 25 ambiguous characters would be changed to wide characters. How does this relate to the existing rule? Ideally, characters unified with non-East Asian legacy characters should be ambiguous under the current rule. For example, U+263A WHITE SMILING FACE is not on the list of characters changed to wide. Should we change WHITE SMILING FACE to ambiguous? I think we shouldn't, but the current rule may say so.

2. The ambiguous characters are problematic. Character width is ambiguous in the same way as emoji style. We introduced variation selectors for characters for which we cannot determine whether they are emoji or text. Heuristic solutions are often unsatisfying. Why don't we introduce a variation selector (VS) for character width? Basically, terminal emulators should respect the character width of the font they use, and a variation selector would instruct the emulator which width to use. That would be the best strategy.

3. What about characters wider than 1 em? There are the 2-em dash and the 3-em dash. This may be out of scope for the UAX, but it should be considered. Ambiguity of character width is a general problem rather than a specifically East Asian one.
Date/Time: Sat Mar 19 10:43:51 CDT 2016
Name: Ken Lunde (Editor, UAX #11)
Report Type: Public Review Issue
Opt Subject: Reply to Ryusei Yamaguchi's 2016-03-01 "Review for PRI #318 Proposed Update UAX #11, East Asian Width"
Yamaguchi-san,

Thank you for submitting feedback for PRI #318:

http://www.unicode.org/review/pri318/

Your comments will be discussed during UTC #147 in May, but as the editor of this UAX, I felt that you deserved a response prior to the UTC's discussion. Note that my comments below will be added to this PRI, and will also be discussed at UTC #147.

> 1. The proposal notes that 774 neutral characters and 25 ambiguous
> characters would be changed to wide characters. How does this relate to
> the existing rule? Ideally, characters unified with non-East Asian legacy
> characters should be ambiguous under the current rule. For example, U+263A
> WHITE SMILING FACE is not on the list of characters changed to wide.
> Should we change WHITE SMILING FACE to ambiguous? I think we shouldn't, but
> the current rule may say so.

U+263A is currently set to N (East Asian Neutral), and nothing in PRI #318 suggests that it be changed to A (East Asian Ambiguous). The added recommendation to treat "emoji style" standardized variation sequences as though they were set to W would thus treat <263A FE0F> as East Asian Wide, which covers this and other similar cases.

> 2. The ambiguous characters are problematic. Character width is
> ambiguous in the same way as emoji style. We introduced variation selectors
> for characters for which we cannot determine whether they are emoji or text.
> Heuristic solutions are often unsatisfying. Why don't we introduce a
> variation selector (VS) for character width? Basically, terminal emulators
> should respect the character width of the font they use, and a variation
> selector would instruct the emulator which width to use. That would be the
> best strategy.

The characters that correspond to A (East Asian Ambiguous) are problematic in other ways beyond the scope of UAX #11, and it is up to implementations to resolve the ambiguity in their own way.
I once proposed (in L2/14-006) the use of Standardized Variants to distinguish between Western and CJK use for a small set of characters, which seems somewhat related to what you are proposing, but the UTC rejected it. See:

http://www.unicode.org/L2/L2014/14006-sv-western-vs-cjk.pdf

Keep in mind that the primary purpose of UAX #11 is to guide developers toward a solution that eventually resolves a character's width into one of two possibilities: half-width or full-width. When a character is A, there is a good chance that it will resolve to W because it is generally treated as W in an East Asian context, but some circumstances may suggest that it resolve to H (some terminals, such as the Terminal app of OS X, have an option to force East Asian Ambiguous characters to be treated as East Asian Wide). The same is true of characters that are N: the tendency is for such characters to resolve to H, but some conditions, such as "emoji style," may cause them to be treated as W.

> 3. What about characters wider than 1 em? There are the 2-em dash and the
> 3-em dash. This may be out of scope for the UAX, but it should be
> considered. Ambiguity of character width is a general problem rather than a
> specifically East Asian one.

Such characters are simply out of the scope of UAX #11, and also out of scope of fixed-width column implementations, such as terminals. For implementations that do not need to force characters into one of two possible widths, UAX #11 serves no purpose, and instead the advance widths of the glyphs for each character, as specified in the selected font, should be used.

Best...

-- Ken
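The two-way resolution described in the reply above (every character ends up occupying either one or two columns) can be illustrated with a minimal Python sketch. It is not taken from UAX #11 or from any terminal implementation. The function name `column_width` and the `ambiguous_wide` flag are hypothetical, and treating any character followed by U+FE0F as wide is a simplifying assumption, since the recommendation only concerns standardized "emoji style" variation sequences. Combining marks and control characters are not handled.

```python
import unicodedata

VS16 = "\uFE0F"  # VARIATION SELECTOR-16, requests "emoji style"


def column_width(text, ambiguous_wide=False):
    """Estimate the number of fixed-width columns `text` occupies.

    Hypothetical helper: `ambiguous_wide` plays the role of a terminal
    option that forces East Asian Ambiguous characters to be wide.
    """
    total = 0
    i = 0
    while i < len(text):
        ch = text[i]
        # Assumption: a character followed by VS16 is treated as wide,
        # without checking that the pair is a standardized sequence.
        if i + 1 < len(text) and text[i + 1] == VS16:
            total += 2
            i += 2
            continue
        eaw = unicodedata.east_asian_width(ch)
        if eaw in ("W", "F"):
            total += 2  # Wide and Fullwidth take two columns
        elif eaw == "A":
            total += 2 if ambiguous_wide else 1  # left to the implementation
        else:
            total += 1  # N, Na, and H take one column
        i += 1
    return total
```

Calling `column_width(s)` models a non-East Asian context, while `column_width(s, ambiguous_wide=True)` models a context in which ambiguous characters are forced to be wide. The results depend on the version of the Unicode Character Database bundled with the Python interpreter.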