CLDR Ticket #10098 (new unknown)
Opened 4 weeks ago
Lao collation is not linguistically correct
Reported by: mark | Owned by: anybody
[Filed on behalf of Richard Wordingham]
I notice a very similar file lo.xml. When did Laos haul up the white
flag and more or less adopt the modern Thai collation order for Lao?
As there has been no answer to this question, I presume the surrender
has not happened. As my ticket submission was rejected as spam, would
someone kindly file a ticket along these lines:
==Lao collation is not linguistically correct==
The file collation/lo.xml contains the reckless falsehood "The root
collation order is valid for this language".
If phonetic Lao syllables were represented by single characters, Lao
collation would be a simple lexicographic order. Lao collation therefore
cannot use anything but primary weights.
A Lao syllable may be considered to be composed of onset + vowel + coda
+ tone; the onset and vowel may be interleaved (as in Thai), and the
tone is represented by a mark following the onset and no later than
immediately after the vowel. There are two basic schemes ordering for
The first is the one most commonly used; the second is closer to the
Unlike Thai, the vowel weighting for compound vowel symbols is not
composed from the individual vowels. For example, part of the ordering is:
ເກະ < ເກ < ໂກະ < ໂກ < ເກາະ
However, the current collation yields:
ເກ < ເກະ < ເກາະ < ໂກ < ໂກະ
This ordering is manifestly wrong.
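The discrepancy can be reproduced with a rough sketch in Python. Everything named here is an assumption made for illustration: root_like_key only approximates the root collation by swapping each prevowel (U+0EC0..U+0EC4) with the character after it and then comparing code points, and VOWEL_RANK is a hypothetical table built solely from the five syllables quoted in this ticket. It is not the CLDR implementation, and it is far smaller than the tables a real tailoring would need.

```python
# Prevowels are written before the onset consonant but collate after it.
PREVOWELS = {chr(cp) for cp in range(0x0EC0, 0x0EC5)}

# The five syllables from this ticket, in the order the ticket gives as correct.
TICKET_ORDER = ["ເກະ", "ເກ", "ໂກະ", "ໂກ", "ເກາະ"]


def root_like_key(word):
    """Approximate the root collation: swap each prevowel with the next
    character, then compare the result code point by code point."""
    chars = list(word)
    i = 0
    while i + 1 < len(chars):
        if chars[i] in PREVOWELS:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return chars  # lists of characters compare lexicographically


# Hypothetical vowel table: each compound vowel symbol is weighted as a
# whole, with "-" standing in for the onset consonant (U+0E81 in the examples).
VOWEL_RANK = {w.replace("\u0e81", "-"): i for i, w in enumerate(TICKET_ORDER)}


def syllable_key(syllable):
    """Sort by onset first, then by the rank of the whole vowel pattern.
    Assumes one consonant onset, no coda and no tone mark."""
    onset = next(ch for ch in syllable if 0x0E81 <= ord(ch) <= 0x0EAE)
    pattern = syllable.replace(onset, "-", 1)
    return (onset, VOWEL_RANK[pattern])


current_order = sorted(TICKET_ORDER, key=root_like_key)
tailored_order = sorted(TICKET_ORDER, key=syllable_key)

print(" < ".join(current_order))
print(" < ".join(tailored_order))
```

The first line printed follows the order the current collation yields, and the second follows the order given above as correct. Because the compound vowel symbol is looked up as a unit rather than character by character, covering every onset, vowel, coda and tone combination this way is what makes the tables large.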
I suggest that the reckless comment be amended to something like, "The
root collation is of some utility in sorting this language; accurate
collation appears to require large tables".