Re: Collation - last character?

From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Mar 20 2002 - 14:37:36 EST


David Hopwood said:

> > At 09:01 AM 3/19/02 -0800, Yves Arrouye wrote:
> > >TUS does not prevent anyone to put noncharacter code points in Unicode
> > >strings. As a matter of fact, p. 23 of TUS 3.0 reads "U+FFFF is reserved
> > >for private program use as a sentinel or other signal." ....

> >
> > But it is *not* available to *users* to put into lists to make certain
> > elements sort at the end.
>
> No, but U+1FFFD is.

Make that U+10FFFD, of course.
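
For anyone who wants to check the distinction being drawn here, the following is a minimal sketch in Python using the standard unicodedata module. The helper names is_noncharacter and is_private_use are hypothetical, introduced only for illustration. It assumes the current definition of noncharacters (the last two code points of every plane, plus U+FDD0..U+FDEF) and treats General_Category Co as "private use".

```python
import unicodedata

def is_noncharacter(cp):
    """True for the last two code points of any plane, or U+FDD0..U+FDEF."""
    return (cp & 0xFFFF) in (0xFFFE, 0xFFFF) or 0xFDD0 <= cp <= 0xFDEF

def is_private_use(cp):
    """True when the code point has General_Category Co (private use)."""
    return unicodedata.category(chr(cp)) == "Co"

# The three code points mentioned in this thread.
for cp in (0xFFFF, 0x1FFFD, 0x10FFFD):
    print(f"U+{cp:04X}",
          "noncharacter" if is_noncharacter(cp) else "-",
          "private use" if is_private_use(cp) else "-")
```

U+FFFF is a noncharacter, U+1FFFD is neither a noncharacter nor private use, and U+10FFFD is the last private use code point in plane 16. Note that this only speaks to code point status; where a private use character actually sorts depends on the collation table in use.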

Incidentally, in case anyone is interested, in the default table for
the Unicode Collation Algorithm, the character with the lowest primary
weight (other than zero, or variables set to be ignorable) is:

02D0 ; [.081F.0020.0002.02D0] # MODIFIER LETTER TRIANGULAR COLON

That is the value in the current table (inclusive of the Unicode 3.0.1
repertoire). In the table which matches the current table under ballot
for ISO 14651, extending the repertoire to Unicode 3.1.0, the same
entry still has the lowest primary weight, but the absolute value
has changed to:

02D0 ; [.09D3.0020.0002.02D0] # MODIFIER LETTER TRIANGULAR COLON
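
To read the two entries above mechanically, here is a rough sketch of a parser for lines in this layout (code points, a semicolon, one or more bracketed collation elements with four weights each, then a comment). The names parse_entry and lowest_primary are hypothetical, and the regular expressions assume only the format shown in this message, not the full table syntax. A leading "." is taken to mean a non-variable element and "*" a variable one.

```python
import re

ENTRY = re.compile(
    r"^([0-9A-F ]+?)\s*;\s*((?:\[[.*][0-9A-F.]+\])+)\s*#\s*(.*)$")
ELEMENT = re.compile(
    r"\[([.*])([0-9A-F]{4})\.([0-9A-F]{4})\.([0-9A-F]{4})\.([0-9A-F]{4,6})\]")

def parse_entry(line):
    """Return (code_points, elements, name) for one table line.

    Each element is (is_variable, (primary, secondary, tertiary, fourth)).
    """
    m = ENTRY.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized table line: {line!r}")
    code_points = [int(cp, 16) for cp in m.group(1).split()]
    elements = []
    for flag, *weights in ELEMENT.findall(m.group(2)):
        elements.append((flag == "*", tuple(int(w, 16) for w in weights)))
    return code_points, elements, m.group(3)

def lowest_primary(entries):
    """Entry with the smallest primary weight, skipping zero primaries
    and entries whose first element is variable."""
    candidates = [e for e in entries
                  if not e[1][0][0] and e[1][0][1][0] != 0]
    return min(candidates, key=lambda e: e[1][0][1][0])

# The two lines quoted above, one from each version of the table.
old = parse_entry("02D0 ; [.081F.0020.0002.02D0] # MODIFIER LETTER TRIANGULAR COLON")
new = parse_entry("02D0 ; [.09D3.0020.0002.02D0] # MODIFIER LETTER TRIANGULAR COLON")
print(hex(old[1][0][1][0]), hex(new[1][0][1][0]))
```

Running lowest_primary over all parsed lines of a single table would be one way to reproduce the lookup described above; the two sample lines here come from different tables, so they are only parsed and compared.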

--Ken


