From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Apr 22 2005 - 16:24:19 CST
From: "Rick McGowan" <rick@unicode.org>
> The 1.3 version of CLDR is now at beta status, and available via
> <http://unicode.org/cldr/version/1.3.html>.
I just downloaded it, and I wonder why, in the "fr" collation file, we only
see these tailoring rules:
<rules>
<reset>ae</reset>
<s>æ</s>
<t>Æ</t>
<!--
<reset>A</reset>
<x><s>Æ</s><extend>E</extend></x>
<reset>a</reset>
<x><s>æ</s><extend>e</extend></x>
-->
</rules>
The rules for ae and AE French ligatures are correct, meaning that the ae or
AE ligatures have a secondary difference with the non-ligated vowels, no
primary difference with them, and then there's a tertiary difference between
the ae and AE ligatures.
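To make the three levels concrete, here is a minimal Python sketch of how such sort keys compare. The weights and the names ELEMENTS, sort_key and first_difference are made up for illustration; they are not real UCA or CLDR values, and French backwards secondary ordering is ignored.

```python
# Toy three-level collation; all weights below are hypothetical.
# Each character maps to a list of (primary, secondary, tertiary) elements.
A, E = 1, 2            # primary weights for the base letters
PLAIN, LIGATURE = 0, 1  # secondary weights
LOWER, UPPER = 0, 1     # tertiary weights (case)

ELEMENTS = {
    "a": [(A, PLAIN, LOWER)],
    "e": [(E, PLAIN, LOWER)],
    "A": [(A, PLAIN, UPPER)],
    "E": [(E, PLAIN, UPPER)],
    # The ligatures share the primaries of "ae" and carry a secondary mark.
    "æ": [(A, LIGATURE, LOWER), (E, PLAIN, LOWER)],
    "Æ": [(A, LIGATURE, UPPER), (E, PLAIN, UPPER)],
}


def sort_key(text):
    """Return (primaries, secondaries, tertiaries) for a string."""
    elements = [ce for ch in text for ce in ELEMENTS[ch]]
    return tuple(tuple(ce[level] for ce in elements) for level in range(3))


def first_difference(x, y):
    """Name the first level at which x and y differ, or None if equal."""
    names = ("primary", "secondary", "tertiary")
    for name, kx, ky in zip(names, sort_key(x), sort_key(y)):
        if kx != ky:
            return name
    return None
```

With these toy weights, first_difference("ae", "æ") is "secondary" and first_difference("æ", "Æ") is "tertiary", which is the relationship the rules above express.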
But why isn't there something similar for the much more common oe and OE
French ligatures? For example:
<rules>
<reset>ae</reset>
<s>æ</s>
<t>Æ</t>
<reset>oe</reset>
<s>œ</s>
<t>Œ</t>
<!--
<reset>A</reset>
<x><s>Æ</s><extend>E</extend></x>
<reset>a</reset>
<x><s>æ</s><extend>e</extend></x>
<reset>O</reset>
<x><s>Œ</s><extend>E</extend></x>
<reset>o</reset>
<x><s>œ</s><extend>e</extend></x>
-->
</rules>
Aren't they missing? I can't see them in the "root" collation file. Are the
oe/OE ligatures already in the Default Unicode Collation Element Table
(unlike the ae and AE ligatures, which I know sort as separate letters in
other languages like Dutch, and for which tailoring is justified here)?
Also, isn't the commented form preferable, since the uncommented form resets
on a character pair, which a DFA-based collator engine would inevitably have
to convert into the second form? Is UCA now recommending NFA-based collator
engines? I think the deterministic form for French is small enough to be
used instead of the non-deterministic form (note: any DFA is also an NFA;
the reverse is of course false).
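As a rough illustration of the conversion I have in mind, here is a small Python sketch that mechanically rewrites a rule that resets on a character pair into a reset on the first letter plus an <extend> for the rest. The function name to_expansion_form is hypothetical, and the sketch simply assumes the two forms are interchangeable; it does not check that.

```python
def to_expansion_form(reset, level, target):
    """Rewrite a rule that resets on a multi-letter string.

    reset  -- the reset string, e.g. "oe"
    level  -- the relation tag, e.g. "s" (secondary) or "t" (tertiary)
    target -- the character being tailored, e.g. "œ"

    Returns LDML-style rule text that resets on the first letter only
    and puts the remaining letters in an <extend> element.
    """
    head, tail = reset[0], reset[1:]
    if not tail:
        # Single-letter reset: nothing to extend.
        return f"<reset>{head}</reset>\n<{level}>{target}</{level}>"
    return (
        f"<reset>{head}</reset>\n"
        f"<x><{level}>{target}</{level}><extend>{tail}</extend></x>"
    )
```

For instance, to_expansion_form("oe", "s", "œ") yields the same two lines as the commented rule for œ above.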