Re: Collation - last character?

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Fri Mar 15 2002 - 17:23:52 EST


How about U+10ffff?
It is a noncharacter, which gives it a high (unassigned-character) weight in the UCA. It is also the highest code point, i.e. "the last character".
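
To make the "high (unassigned character) weight" concrete, here is a rough sketch of the implicit primary weight pair that UTS #10 describes for code points without an explicit entry. It assumes the base value 0xFBC0 for unassigned code points and no tailoring. The name `implicit_primary` is made up for this example, and an implementation such as ICU may lay out its weights differently, so this only pictures the default ordering.

```python
# Assumption: UTS #10 implicit-weight base for unassigned code points.
UNASSIGNED_BASE = 0xFBC0

def implicit_primary(cp: int) -> tuple[int, int]:
    """Hypothetical helper: (AAAA, BBBB) implicit primary weights for cp."""
    aaaa = UNASSIGNED_BASE + (cp >> 15)   # high bits select the lead weight
    bbbb = (cp & 0x7FFF) | 0x8000         # low 15 bits, with the top bit set
    return aaaa, bbbb

last = implicit_primary(0x10FFFF)
print([hex(w) for w in last])  # ['0xfbe1', '0xffff']
```

Under these assumptions no other code point that gets an implicit weight from this base can produce a larger pair.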

It cannot be a Private-Use character, so few people will be tempted to tailor it to something other than its default UCA weight.
It also sorts highest in a code point order comparison (a strcmp in Unicode code point order).
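
A small sketch of the code point order claim, using Python only because its `str` comparison is defined on code points. The sample strings are arbitrary. It also shows why the "code point order" qualifier matters: a plain byte-wise comparison of UTF-16 puts U+FFFF after U+10FFFF, while UTF-8 bytes keep code point order.

```python
import sys

LAST = chr(0x10FFFF)  # highest Unicode code point, a noncharacter

# Arbitrary samples: ASCII, Latin-1, a BMP noncharacter, an emoji,
# a plane-16 private-use character, and U+10FFFF itself.
samples = ["a", "\u00e9", "\uffff", "\U0001F600", "\U0010FFFD", LAST]

# Python compares str values code point by code point.
by_code_point = sorted(samples)

# Binary comparison of the encoded forms.
by_utf8 = sorted(samples, key=lambda s: s.encode("utf-8"))
by_utf16 = sorted(samples, key=lambda s: s.encode("utf-16-be"))

print(hex(ord(by_code_point[-1])))  # 0x10ffff
print(hex(ord(by_utf16[-1])))       # 0xffff
```

So U+10ffff is only "last" for a comparison that really works on code points (or on UTF-8 or UTF-32), not for a raw UTF-16 code unit strcmp.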

I think that, at least in the ICU implementation of the UCA, U+10ffff will give you the highest weight unless you tailor it.

markus


