As a by-product of our recent work on collation, we developed a method of
Unicode compression that is similar to SCSU in its results: text in small
alphabets takes about one byte per character, and text in large alphabets
takes about two bytes per character.
The main difference from SCSU is that this method preserves binary order:
comparing two compressed strings byte by byte gives the same result as
comparing the original strings by code point. As this is a hot topic right
now, I thought it might be of interest. The latest draft description is at
http://oss.software.ibm.com/icu/develop/bocu.htm.
Comments are welcome.
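
To make "preserves binary order" concrete, here is a minimal Python sketch of
the property itself. It does not implement the method from the draft; it only
checks whether an encoding sorts strings the same way as their code points,
using UTF-8 (which does) and UTF-16BE (which does not, once supplementary
characters are involved). The helper name preserves_binary_order and the
sample strings are assumptions chosen for illustration.

```python
def preserves_binary_order(encode, strings):
    """Return True if sorting by encoded bytes matches sorting by code points.

    `encode` is any function mapping a str to bytes (hypothetical helper,
    not part of the draft description).
    """
    by_code_point = sorted(strings, key=lambda s: [ord(c) for c in s])
    by_bytes = sorted(strings, key=encode)
    return by_code_point == by_bytes


# Sample strings: ASCII, Greek, a BMP character above the surrogate range,
# and a supplementary character (U+10000).
samples = ["a", "b", "\u03b1", "\ufffd", "\U00010000"]

# UTF-8 keeps code point order under bytewise comparison.
utf8_ok = preserves_binary_order(lambda s: s.encode("utf-8"), samples)

# UTF-16BE does not: U+10000 is encoded as D800 DC00, which sorts
# before FFFD even though its code point is larger.
utf16_ok = preserves_binary_order(lambda s: s.encode("utf-16-be"), samples)

print("UTF-8 preserves order:   ", utf8_ok)
print("UTF-16BE preserves order:", utf16_ok)
```

A compression scheme with this property can be sorted or indexed in its
compressed form without being decoded first.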
Mark
—————
πάντων μέτρον ἄνθρωπος — Πρωταγόρας
This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:17:18 EDT