Indic Syllabic Categories
richard.wordingham at ntlworld.com
Fri May 9 17:57:52 CDT 2014
Is the provisional property 'Indic_Syllabic_Category' defined by
anything deeper than the UCD file IndicSyllabicCategory itself? I
started to review the current assignments and realised I couldn't
explain the division between Vowel_Independent and Consonant. Is the
division meant to be rational or, although possibly consistent within a
script, is it arbitrary?
For example, why are U+17A2 KHMER LETTER QA and U+0E24 THAI CHARACTER
RU of category Consonant, while U+1021 MYANMAR LETTER A and U+1A50 TAI
THAM LETTER UU are of category Vowel_Independent? Is this difference
intended to matter? Both U+17A2 and U+1021 combine freely with
dependent vowels, while U+0E24 and U+1A50 combine with very few
dependent vowels (I think just U+0E45 THAI CHARACTER LAKKHANGYAO and
U+1A63 TAI THAM VOWEL SIGN AA respectively).
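For anyone who wants to check such assignments mechanically, here is a minimal Python sketch that reads lines in the usual UCD "code point(s) ; value # comment" layout and builds a lookup table. The names parse_property_lines and SAMPLE are hypothetical, and the sample lines are not a verbatim excerpt of the data file: they simply restate the four category values quoted above.

```python
# Illustrative lines in the usual UCD layout: "code point(s) ; value # comment".
# Assumption: the values below are the ones quoted in this message,
# not a verbatim excerpt of the IndicSyllabicCategory data file.
SAMPLE = """\
0E24          ; Consonant # THAI CHARACTER RU
1021          ; Vowel_Independent # MYANMAR LETTER A
17A2          ; Consonant # KHMER LETTER QA
1A50          ; Vowel_Independent # TAI THAM LETTER UU
"""


def parse_property_lines(text):
    """Map each code point to its property value (hypothetical helper)."""
    table = {}
    for line in text.splitlines():
        # Drop the trailing comment, then skip blank or comment-only lines.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        code_range, value = (field.strip() for field in data.split(";", 1))
        # A range is written "XXXX..YYYY"; a single code point has no "..".
        first, _, last = code_range.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        for cp in range(start, end + 1):
            table[cp] = value
    return table


categories = parse_property_lines(SAMPLE)
for cp in sorted(categories):
    print(f"U+{cp:04X} {categories[cp]}")
```

Pointing the same function at the text of the real data file, rather than at SAMPLE, would let one list every character of category Consonant or Vowel_Independent in a given script and compare them side by side.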
Another puzzle is that U+1038 MYANMAR SIGN VISARGA is of category Visarga
while U+19B0 NEW TAI LUE VOWEL SIGN VOWEL SHORTENER is of category
Vowel_Dependent. Both may follow a final consonant (one of category
Consonant_Final in the case of Tai Lue!) without implying a new
Is the property meant to be tailorable? For example, there are
encoded characters in the Khmer script that serve as tone marks when it
is used to write Thai.