Combining Class of Thai Nonspacing_Marks

From: Gerriet M. Denkmann <gerrietm_at_icloud.com>
Date: Mon, 3 Apr 2017 14:12:51 +0700

The Combining Class is used for normalisation of strings.
Normalisation of strings is important for filenames in filesystems.

As far as I know, a Thai consonant (Lo, Other_Letter) can have several Nonspacing_Marks.
This cluster of nonspacing marks can contain at most one top/bottom vowel and at most one tone/other mark.
There is no syntactic meaning in the order of these nonspacing marks.

So: all top/bottom vowels should have Combining Class 103, and all tone/other marks should have Combining Class 107.
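
A rough sketch of what this proposal would do, written in Python. The PROPOSED table and the helper names are hypothetical and only mirror the values suggested above; they are not actual Unicode data. The function models only the canonical reordering step (a stable sort of each run of non-starters by Combining Class), not full normalisation.

```python
import unicodedata

# Hypothetical overrides mirroring the proposal in this post.
# These are NOT the values in the Unicode Character Database.
PROPOSED = {
    0x0E31: 103, 0x0E34: 103, 0x0E35: 103, 0x0E36: 103, 0x0E37: 103,  # top vowels
    0x0E47: 107, 0x0E4C: 107, 0x0E4D: 107, 0x0E4E: 107,               # other-marks
}

def proposed_ccc(ch):
    """Combining Class under the proposal; falls back to the real value."""
    return PROPOSED.get(ord(ch), unicodedata.combining(ch))

def reorder(text, ccc=proposed_ccc):
    """Stable-sort every run of non-starters (class != 0) by Combining Class."""
    out, run = [], []
    for ch in text:
        if ccc(ch) == 0:
            out.extend(sorted(run, key=ccc))  # sorted() is stable
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=ccc))
    return "".join(out)

# KO KAI + MAI EK + SARA I versus KO KAI + SARA I + MAI EK
vowel_last = "\u0E01\u0E48\u0E34"
vowel_first = "\u0E01\u0E34\u0E48"
same_under_proposal = reorder(vowel_last) == reorder(vowel_first)
```

Under these assumed values the two orders end up as the same sequence, with the vowel before the tone mark.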

Is there a reason for having top vowels or other-marks with Combining Class 0, Not_Reordered?

With the current Combining Class values, both consonant + mark + top vowel and consonant + top vowel + mark are already in normalised form, so neither is reordered into the other. One can therefore have two files with these identical-looking but different names, which is rather confusing.
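
The effect can be checked with a small Python sketch, assuming the standard unicodedata module; the constant and variable names are only illustrative. It compares a top vowel (Combining Class 0) with a bottom vowel (Combining Class 103), each combined with a tone mark (Combining Class 107) in both orders.

```python
import unicodedata

KO_KAI = "\u0E01"  # THAI CHARACTER KO KAI, a consonant
SARA_I = "\u0E34"  # top vowel, Combining Class 0
SARA_U = "\u0E38"  # bottom vowel, Combining Class 103
MAI_EK = "\u0E48"  # tone mark, Combining Class 107

def nfc(s):
    """Normalise a string to Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", s)

# Top vowel: neither order is changed, so the two strings stay different.
top_a = KO_KAI + SARA_I + MAI_EK
top_b = KO_KAI + MAI_EK + SARA_I
top_stays_distinct = nfc(top_a) != nfc(top_b)

# Bottom vowel: the tone mark is moved after the vowel, so both orders agree.
bottom_a = KO_KAI + SARA_U + MAI_EK
bottom_b = KO_KAI + MAI_EK + SARA_U
bottom_is_unified = nfc(bottom_a) == nfc(bottom_b)

print("top vowel orders stay distinct:", top_stays_distinct)
print("bottom vowel orders are unified:", bottom_is_unified)
```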

Here is a list of all nonspacing marks in the Thai script:

top vowels (Combining Class 0, Not_Reordered): ← this seems to be wrong; should be 103
THAI CHARACTER MAI HAN-AKAT ั
THAI CHARACTER SARA I ิ
THAI CHARACTER SARA II ี
THAI CHARACTER SARA UE ึ
THAI CHARACTER SARA UEE ื

bottom vowels (Combining Class 103):
THAI CHARACTER SARA U ุ
THAI CHARACTER SARA UU ู

tone-marks (Combining Class 107):
THAI CHARACTER MAI EK ่
THAI CHARACTER MAI THO ้
THAI CHARACTER MAI TRI ๊
THAI CHARACTER MAI CHATTAWA ๋

other-marks (Combining Class 0, Not_Reordered): ← this seems to be wrong, should be 107
THAI CHARACTER MAITAIKHU ็
THAI CHARACTER THANTHAKHAT ์
THAI CHARACTER NIKHAHIT ํ
THAI CHARACTER YAMAKKAN ๎

other-marks (Combining Class 9, Virama):
THAI CHARACTER PHINTHU ฺ
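
The list above can be regenerated with a short Python sketch, assuming the standard unicodedata module (whose values depend on the Unicode version bundled with the interpreter). The function name is hypothetical; it scans the Thai block U+0E00..U+0E7F for characters of General Category Mn and reports each one's Combining Class.

```python
import unicodedata

def thai_nonspacing_marks():
    """Return {code point: (name, Combining Class)} for Mn characters in the Thai block."""
    marks = {}
    for cp in range(0x0E00, 0x0E80):
        ch = chr(cp)
        # Unassigned code points have category Cn and are skipped here.
        if unicodedata.category(ch) == "Mn":
            marks[cp] = (unicodedata.name(ch), unicodedata.combining(ch))
    return marks

if __name__ == "__main__":
    for cp, (name, ccc) in thai_nonspacing_marks().items():
        print(f"U+{cp:04X}  ccc={ccc:3d}  {name}")
```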

Gerriet.
Received on Mon Apr 03 2017 - 10:56:34 CDT
