Grapheme clusters and east asian width
daniel.buenzli at erratique.ch
Tue Sep 15 20:45:27 CDT 2015
Is there any guidance on how to combine the information given by grapheme clusters and the east asian width property to do fixed-width layouts in terminal emulators?
For example, if we have:
U+AC01 ( 각 ) HANGUL SYLLABLE GAG
This delimits a single grapheme cluster with east asian width W, and hence takes 2 columns in a tty. However, if we have it as the sequence:
U+1100 ( ᄀ ) HANGUL CHOSEONG KIYEOK
U+1161 ( ᅡ ) HANGUL JUNGSEONG A
U+11A8 ( ᆨ ) HANGUL JONGSEONG KIYEOK
This also delimits a single grapheme cluster, but if I add up the columns implied by the east asian widths of its scalar values (W, N, N, i.e. 2 + 1 + 1), the result is 4 columns.
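To make the mismatch concrete, here is a minimal Python sketch of the per-scalar-value summation described above. It relies on the standard `unicodedata.east_asian_width` lookup; the helper names `column_width` and `summed_width`, and the convention "2 columns for F or W, 1 otherwise", are assumptions made for illustration only.

```python
import unicodedata

def column_width(scalar: str) -> int:
    # Assumed convention: Fullwidth (F) and Wide (W) take 2 columns,
    # every other east asian width value takes 1.
    return 2 if unicodedata.east_asian_width(scalar) in ("F", "W") else 1

def summed_width(cluster: str) -> int:
    # Naively add up the width of every scalar value in the cluster.
    return sum(column_width(c) for c in cluster)

precomposed = "\uAC01"            # HANGUL SYLLABLE GAG
decomposed = "\u1100\u1161\u11A8" # CHOSEONG KIYEOK, JUNGSEONG A, JONGSEONG KIYEOK

print(summed_width(precomposed))  # one W scalar value
print(summed_width(decomposed))   # W + N + N
```

With the widths quoted above, the two canonically equivalent strings get different column counts under this summation.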
Does something naïve work, like looking up only the east asian width of the first scalar value in the grapheme cluster and using 2 columns for the cluster if it is F or W and 1 column otherwise? Or are there counterexamples where this breaks? Or is there anything more clever that can be done?
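For reference, the first-scalar-value heuristic in question could be sketched in Python as follows. The function name `first_scalar_width` is hypothetical, and the sketch assumes the input has already been segmented into a single grapheme cluster by some other means (the Python standard library does not provide grapheme cluster segmentation). It only illustrates the rule being asked about and does not settle whether counterexamples exist.

```python
import unicodedata

def first_scalar_width(cluster: str) -> int:
    # Assumes `cluster` is exactly one grapheme cluster, segmented elsewhere.
    if not cluster:
        raise ValueError("empty grapheme cluster")
    # Only the first scalar value decides the width of the whole cluster:
    # 2 columns for Fullwidth (F) or Wide (W), 1 column otherwise.
    first = cluster[0]
    return 2 if unicodedata.east_asian_width(first) in ("F", "W") else 1

# Both spellings of the syllable from the example start with a W scalar value.
print(first_scalar_width("\uAC01"))
print(first_scalar_width("\u1100\u1161\u11A8"))
```

Under this rule the precomposed and decomposed forms of the example receive the same width, since both begin with a scalar value whose east asian width is W.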