Size of Weights in Unicode Collation Algorithm

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Wed, 13 Mar 2013 18:38:28 +0000

One of the changes from Version 6.1.0 to 6.2.0 of the UCA (UTS#10)
was to change weights from being 16 bits to just being general
non-negative integers. Was this just to accommodate the 4th weight in
DUCET (scheduled for deletion in Version 6.3.0), or is it intended to do
away with the inconvenient concept of 'large weights'?

Previously, each of the four weights could be accommodated in 16, 16,
16 and 24 bits. How many bits may be needed for a DUCET collation
element now? Are we threatened with having to accommodate 36-bit
weights?
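
To make the size question concrete, here is a minimal sketch of a fixed-width layout using the 16, 16, 16 and 24 bit fields mentioned above. It is an illustration only and is not taken from UTS#10; the names FIELD_BITS and pack_element are hypothetical.

```python
# Hypothetical fixed-width layout for one collation element:
# 16-bit primary, 16-bit secondary, 16-bit tertiary, 24-bit fourth weight.
FIELD_BITS = (16, 16, 16, 24)

# Total storage needed per collation element under this layout.
TOTAL_BITS = sum(FIELD_BITS)


def pack_element(weights):
    """Pack four non-negative weights into one integer, first weight most significant."""
    if len(weights) != len(FIELD_BITS):
        raise ValueError("expected one weight per field")
    packed = 0
    for weight, bits in zip(weights, FIELD_BITS):
        # A weight that is merely a "general non-negative integer" may not fit.
        if not 0 <= weight < (1 << bits):
            raise OverflowError(f"weight {weight:#x} does not fit in {bits} bits")
        packed = (packed << bits) | weight
    return packed
```

Under this assumed layout an element occupies 72 bits, and any primary, secondary or tertiary weight above FFFF is rejected rather than silently truncated.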

If it is not intended to do away with the 16-bit limit, then the
introduction to Section 3.0 should revert to describing the weights as
16 bits. Otherwise, there is a good deal of text that is wrong or in
need of overhaul. For example, a value FFFF will not function as
intended if the smallest explicit positive primary weight is 100,000.
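
The FFFF point can be sketched as follows, assuming FFFF is being used as a value meant to sort after every real weight. The weight lists are made-up illustrations, not values from DUCET, and exceeds_all is a hypothetical helper.

```python
# Value assumed to exceed every real weight under a 16-bit limit.
OLD_MAX = 0xFFFF


def exceeds_all(sentinel, primaries):
    """Return True if the sentinel sorts after every primary weight given."""
    return all(sentinel > p for p in primaries)


# Illustrative primaries that fit in 16 bits (made-up values).
sixteen_bit_primaries = [0x0200, 0x15D0, 0xFB40]

# Hypothetical primaries starting at 100,000, as in the example above.
large_primaries = [100_000, 100_001]

works_with_16_bit = exceeds_all(OLD_MAX, sixteen_bit_primaries)
works_with_large = exceeds_all(OLD_MAX, large_primaries)
```

Here works_with_16_bit is true and works_with_large is false, since FFFF is 65,535 and so sorts before 100,000.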

I've not submitted this through formal feedback yet, as my feedback will
depend on what is intended.

Richard.
Received on Wed Mar 13 2013 - 13:43:21 CDT