From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Jan 09 2006 - 18:38:25 CST
Asmus noted:
> (For example, I assume,
> but have not verified, that i+j and ij in fact sort the same in the DUCET).
0049 ; [.103C.0020.0008.0049] # LATIN CAPITAL LETTER I
004A ; [.1054.0020.0008.004A] # LATIN CAPITAL LETTER J
0069 ; [.103C.0020.0002.0069] # LATIN SMALL LETTER I
006A ; [.1054.0020.0002.006A] # LATIN SMALL LETTER J
0132 ; [.103C.0020.000A.0132][.1054.0020.000A.0132] # LATIN CAPITAL LIGATURE IJ; QQKN
0133 ; [.103C.0020.0004.0133][.1054.0020.0004.0133] # LATIN SMALL LIGATURE IJ; QQKN
<0069, 006A> --> 103C.1054.0020.0020.0002.0002
<0133>       --> 103C.1054.0020.0020.0004.0004
<0049, 004A> --> 103C.1054.0020.0020.0008.0008
<0132>       --> 103C.1054.0020.0020.000A.000A
                 ^^^^^^^^^ ^^^^^^^^^ ^^^^^^^^^
                  primary  secondary tertiary
The difference is at the tertiary level (and defined in such a way that
the "ligatures" will in fact interleave with lowercase and uppercase sequences
of the same letters). The differences between IJ and I+J will be swamped by
all primary letter differences in a string, as well as by all accentual
differences.
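
The key construction and the resulting order can be checked with a small sketch. This is only an illustration, not a full collation implementation: the weights are copied from the DUCET lines quoted in this message, the fourth field of each collation element is left out, and the names DUCET, sort_key and dotted are made up for the example. Comparing the three levels as separate tuples stands in for the level separators of a real sort key.

```python
# Weights (primary, secondary, tertiary) copied from the DUCET lines quoted
# in the message; the fourth field of each collation element is left out.
DUCET = {
    "\u0049": [(0x103C, 0x0020, 0x0008)],  # LATIN CAPITAL LETTER I
    "\u004A": [(0x1054, 0x0020, 0x0008)],  # LATIN CAPITAL LETTER J
    "\u0069": [(0x103C, 0x0020, 0x0002)],  # LATIN SMALL LETTER I
    "\u006A": [(0x1054, 0x0020, 0x0002)],  # LATIN SMALL LETTER J
    "\u0132": [(0x103C, 0x0020, 0x000A), (0x1054, 0x0020, 0x000A)],  # CAPITAL LIGATURE IJ
    "\u0133": [(0x103C, 0x0020, 0x0004), (0x1054, 0x0020, 0x0004)],  # SMALL LIGATURE IJ
}


def sort_key(text):
    """Return (primaries, secondaries, tertiaries) for text (illustrative helper)."""
    elements = [ce for ch in text for ce in DUCET[ch]]
    return tuple(tuple(ce[level] for ce in elements) for level in range(3))


def dotted(key):
    """Format a key in the dotted hex notation used in the message."""
    return ".".join(f"{w:04X}" for level in key for w in level)


# The four keys listed in the message.
for s in ("\u0069\u006A", "\u0133", "\u0049\u004A", "\u0132"):
    print(dotted(sort_key(s)))

# Tertiary interleaving: i+j < ij ligature < I+J < IJ ligature.
forms = ["\u0132", "\u0049\u004A", "\u0133", "\u0069\u006A"]
ordered = sorted(forms, key=sort_key)

# A primary difference later in the string outweighs the tertiary one:
# ligature + "i" sorts before "ij" + "j", even though the ligature alone
# sorts after "ij".
swamped = sort_key("\u0133\u0069") < sort_key("\u0069\u006A\u006A")
```

Here ordered comes out as i+j, small ligature, I+J, capital ligature, and swamped is True because the third primary weight (103C versus 1054) decides the comparison before the tertiary level is ever consulted.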
That is the default behavior; one could, of course, tailor Dutch to make the IJ ligature
compatibility characters sort *exactly* like the sequence of I+J.
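
As a rough sketch of what such a tailoring amounts to, the ligature entries can be overridden so that each one carries the collation elements of its letter sequence. The table names DEFAULT and TAILORED and the helper sort_key are hypothetical, the weights are again the ones quoted in this message, and a real tailoring would be written in whatever rule syntax the collation implementation provides, which this sketch does not try to reproduce.

```python
# Default weights (primary, secondary, tertiary) copied from the DUCET lines
# quoted in the message; the fourth field is left out.
DEFAULT = {
    "\u0049": [(0x103C, 0x0020, 0x0008)],  # I
    "\u004A": [(0x1054, 0x0020, 0x0008)],  # J
    "\u0069": [(0x103C, 0x0020, 0x0002)],  # i
    "\u006A": [(0x1054, 0x0020, 0x0002)],  # j
    "\u0132": [(0x103C, 0x0020, 0x000A), (0x1054, 0x0020, 0x000A)],  # IJ ligature
    "\u0133": [(0x103C, 0x0020, 0x0004), (0x1054, 0x0020, 0x0004)],  # ij ligature
}

# Hypothetical Dutch tailoring: each ligature gets exactly the collation
# elements of the corresponding two-letter sequence.
TAILORED = dict(DEFAULT)
TAILORED["\u0132"] = DEFAULT["\u0049"] + DEFAULT["\u004A"]
TAILORED["\u0133"] = DEFAULT["\u0069"] + DEFAULT["\u006A"]


def sort_key(text, table):
    """Return (primaries, secondaries, tertiaries) for text under table."""
    elements = [ce for ch in text for ce in table[ch]]
    return tuple(tuple(ce[level] for ce in elements) for level in range(3))


# Under the default table the ligature and the sequence differ (tertiary only).
default_differs = sort_key("\u0133", DEFAULT) != sort_key("\u0069\u006A", DEFAULT)

# Under the tailored table the keys are identical at every level shown.
small_same = sort_key("\u0133", TAILORED) == sort_key("\u0069\u006A", TAILORED)
capital_same = sort_key("\u0132", TAILORED) == sort_key("\u0049\u004A", TAILORED)
```

With identical keys, the ligature and the two-letter sequence can only be told apart by whatever tie-breaker the implementation applies after the weighted levels.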
--Ken