[Unicode] Common Locale Data Repository : Bug Tracking

CLDR Ticket #10098 (new unknown)

Opened 6 months ago

Lao collation is not linguistically correct

Reported by: mark
Owned by: anybody
Component: unknown
Data Locale:
Phase: dsub
Review:
Weeks:
Data Xpath:
Xref:

Description

[Filed on behalf of Richard Wordingham]

I notice a very similar file lo.xml. When did Laos haul up the white
flag and more or less adopt the modern Thai collation order for Lao?

As there has been no answer to this question, I presume the surrender
has not happened. As my ticket submission was rejected as spam, would
someone kindly file a ticket along these lines:

==Lao collation is not linguistically correct==

The file collation/lo.xml contains the reckless falsehood "The root
collation order is valid for this language".

If phonetic Lao syllables were represented by single characters, Lao
collation would be a simple lexicographic order. Lao collation is
therefore unable to use anything but primary weights.

A Lao syllable may be considered to be composed of onset + vowel + coda
+ tone; the onset and vowel may be interleaved (as in Thai), and the
tone is represented by a mark following the onset and no later than
immediately after the vowel. There are two basic ordering schemes for
single syllables:

1) <onset-weight><coda-weight><vowel-weight><tone-weight>
2) <onset-weight><vowel-weight><coda-weight><tone-weight>

The first is the one most commonly used; the second is closer to the
CLDR default.
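
The difference between the two schemes can be sketched with tuple sort
keys. This is only an illustration: the Syllable type, the integer
weights and the two sample syllables are hypothetical and do not come
from any CLDR data.

```python
from typing import NamedTuple


class Syllable(NamedTuple):
    """A syllable reduced to four primary weights (hypothetical values)."""
    onset: int
    vowel: int
    coda: int
    tone: int


def key_scheme_1(s: Syllable) -> tuple:
    # Scheme 1: onset, then coda, then vowel, then tone.
    return (s.onset, s.coda, s.vowel, s.tone)


def key_scheme_2(s: Syllable) -> tuple:
    # Scheme 2: onset, then vowel, then coda, then tone.
    return (s.onset, s.vowel, s.coda, s.tone)


# Two made-up syllables with the same onset and tone:
# a has the heavier vowel, b has the heavier coda.
a = Syllable(onset=1, vowel=2, coda=1, tone=0)
b = Syllable(onset=1, vowel=1, coda=2, tone=0)

scheme_1_order = sorted([a, b], key=key_scheme_1)  # coda decides first
scheme_2_order = sorted([a, b], key=key_scheme_2)  # vowel decides first
```

Under scheme 1 the coda is compared before the vowel, so a sorts before
b; under scheme 2 the vowel is compared first, so b sorts before a.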

Unlike Thai, the vowel weighting for compound vowel symbols is not
composed from the individual vowels. For example, part of the ordering
is:

ເກະ < ເກ < ໂກະ < ໂກ < ເກາະ

However, the current collation yields:
ເກ < ເກະ < ເກາະ < ໂກ < ໂກະ

This ordering is manifestly wrong.
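
The two orderings above can be reproduced with a small sketch. It makes
two assumptions. First, per_character_key is only a rough stand-in for
the root behaviour: it moves a prevowel after the following consonant
and then compares code points, on the assumption that code point order
matches the root weights for the five characters involved. Second,
VOWEL_UNIT_WEIGHT is a hypothetical table whose ranks are taken solely
from the fragment of the ordering quoted above.

```python
# Code points used in the ticket's example (all from the Lao block).
KO = "\u0e81"  # LAO LETTER KO
A = "\u0eb0"   # LAO VOWEL SIGN A
AA = "\u0eb2"  # LAO VOWEL SIGN AA
E = "\u0ec0"   # LAO VOWEL SIGN E (written before the consonant)
O = "\u0ec2"   # LAO VOWEL SIGN O (written before the consonant)

PREVOWELS = {E, O}  # only the two prevowels used in this example

# The five words from the ticket, in no particular order.
WORDS = [E + KO, E + KO + A, E + KO + AA + A, O + KO, O + KO + A]


def per_character_key(word):
    """Rough stand-in for per-character collation: swap a leading
    prevowel with the consonant after it, then compare code points."""
    chars = list(word)
    if len(chars) >= 2 and chars[0] in PREVOWELS:
        chars[0], chars[1] = chars[1], chars[0]
    return [ord(c) for c in chars]


# Hypothetical table: each compound vowel symbol, identified by its
# (prevowel, following vowel signs) pair, gets ONE weight as a unit.
VOWEL_UNIT_WEIGHT = {
    (E, A): 1,
    (E, ""): 2,
    (O, A): 3,
    (O, ""): 4,
    (E, AA + A): 5,
}


def unit_vowel_key(word):
    """Key = (onset, weight of the whole vowel symbol).
    Assumes the shape prevowel + one consonant + optional vowel signs."""
    pre, onset, post = word[0], word[1], word[2:]
    return (ord(onset), VOWEL_UNIT_WEIGHT[(pre, post)])


current = sorted(WORDS, key=per_character_key)
wanted = sorted(WORDS, key=unit_vowel_key)
print(" < ".join(current))
print(" < ".join(wanted))
```

The first printed line follows the order the ticket reports for the
current collation; the second follows the order the ticket gives as
correct. The point of the sketch is that the second order cannot be
obtained by weighting the vowel signs one character at a time, which is
why the compound symbol is looked up as a whole.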

I suggest that the reckless comment be amended to something like, "The
root collation is of some utility in sorting this language; accurate
collation appears to require large tables".

Yours faithfully,

Richard Wordingham.
