From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Nov 30 2003 - 07:24:35 EST
Doug Ewell writes:
> Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
>
> > I've tried to experiment a collation algorithm to implement UCA by the
> > same system as used in UCD decompositions, but with added (and
> > sometimes modified) decompositions. This system creates new "code
> > points" needed to represent only <font> compatibility differences,
> > ligatures, or alternate forms, as a decomposition of the existing
> > compatibility character, into more basic characters exposed with
> > primary differences in UCA, plus these new characters given "variable"
> > collation weights, which may be ignorable in applications which ignore
> > extra levels. This encoding uses a 31 bit code space, which is still
> > highly compressible, but still representable with the UTF-8 TES (but
> > they are not containing Unicode code points) or similar ad-hoc
> > representation.
>
> Please don't use UTF-8 to encode anything other than Unicode code
> points.
As long as I use it only internally, for intermediate processing, I can do
what I want. For now it is just a convenient way to represent variable-size
integers of up to 31 bits (in fact I use it to represent 32-bit signed
integers whose two highest bits are equal).
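A minimal sketch of that internal use, in Python, assuming the original
1-to-6-byte UTF-8 bit layout (RFC 2279), which reaches 31 bits. The names
encode_u31, decode_u31, pack_signed and unpack_signed are hypothetical.
Dropping the redundant top bit is only one reading of "the two highest bits
are equal", not necessarily the exact scheme described above.

```python
# (upper bound, lead-byte marker, continuation bytes) for the multi-byte forms
_FORMS = (
    (0x800, 0xC0, 1),
    (0x10000, 0xE0, 2),
    (0x200000, 0xF0, 3),
    (0x4000000, 0xF8, 4),
    (0x80000000, 0xFC, 5),
)

# (lead-byte mask, lead-byte marker, continuation bytes) for decoding
_LEADS = (
    (0x80, 0x00, 0),
    (0xE0, 0xC0, 1),
    (0xF0, 0xE0, 2),
    (0xF8, 0xF0, 3),
    (0xFC, 0xF8, 4),
    (0xFE, 0xFC, 5),
)


def encode_u31(value: int) -> bytes:
    """Encode 0 <= value < 2**31 with the UTF-8 bit layout (not as text)."""
    if not 0 <= value < (1 << 31):
        raise ValueError("value does not fit in 31 bits")
    if value < 0x80:
        return bytes([value])
    for limit, lead, ncont in _FORMS:
        if value < limit:
            break
    out = [lead | (value >> (6 * ncont))]
    for shift in range(6 * (ncont - 1), -1, -6):
        out.append(0x80 | ((value >> shift) & 0x3F))
    return bytes(out)


def decode_u31(data: bytes) -> int:
    """Decode exactly one sequence; overlong forms are not rejected here."""
    if not data:
        raise ValueError("empty input")
    first = data[0]
    for mask, lead, ncont in _LEADS:
        if (first & mask) == lead:
            break
    else:
        raise ValueError("invalid lead byte")
    if len(data) != ncont + 1:
        raise ValueError("wrong sequence length")
    value = first & ~mask & 0xFF
    for byte in data[1:]:
        if (byte & 0xC0) != 0x80:
            raise ValueError("invalid continuation byte")
        value = (value << 6) | (byte & 0x3F)
    return value


def pack_signed(n: int) -> int:
    """Assumed mapping: a 32-bit signed value whose two top bits agree
    lies in [-2**30, 2**30), so the top bit can be dropped."""
    if not -(1 << 30) <= n < (1 << 30):
        raise ValueError("two highest bits would differ")
    return n & 0x7FFFFFFF


def unpack_signed(u: int) -> int:
    """Restore the sign by extending bit 30 of the 31-bit value."""
    return u - (1 << 31) if u & (1 << 30) else u
```

For example, decode_u31(encode_u31(pack_signed(-1))) gives back the 31-bit
value 0x7FFFFFFF, which unpack_signed turns into -1 again. The resulting
bytes are not Unicode text and should not be labeled as UTF-8.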
Of course, if I do end up using it to represent something other than code
points in published data or text, I will rename it and won't keep the
same charset label. But it is quite likely that this would not be the
most efficient representation (because of its byte-oriented splitting), and
a more compact or easier-to-process serialization could require a
different encoding scheme (or transfer syntax).
__________________________________________________________________
<< ella for Spam Control >> has removed Spam messages and set aside
Newsletters for me
You can use it too - and it's FREE! http://www.ellaforspam.com
This archive was generated by hypermail 2.1.5 : Sun Nov 30 2003 - 07:51:51 EST