Re: UTF-12!

From: Doug Ewell (doug@ewellic.org)
Date: Mon Feb 28 2011 - 11:12:55 CST

Next message: Doug Ewell: "Re: UTF-12!"

Previous message: Andrey V. Lukyanov: "Re: UTF-12!"
Maybe in reply to: vanisaac@boil.afraid.org: "Re: UTF-12!"
Next in thread: Doug Ewell: "Re: UTF-12!"

Petr Tomasek <tomasek at etf dot cuni dot cz> wrote:

> Hm, what about UTF-64? Almost everyone has 64-bit machines today...

Marco Cimarosti, a former co-offender in creating experimental
encodings, described UTF-64 in May 2001. It used 63 bits to encode a
block of either (a) nine 7-bit Basic Latin characters or (b) three
21-bit characters, one of which was presumably not Basic Latin, together
with a 64th bit to indicate the type of block.
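
To make the arithmetic concrete (9 x 7 = 63 and 3 x 21 = 63, plus one type
bit), here is a rough sketch of such a block packer. The layout is my own
assumption, not Marco's actual design: I put the type bit in the top bit
(0 = nine Basic Latin, 1 = three 21-bit) and pack characters from the
high end down. The names pack_block and unpack_block are made up for
illustration.

```python
def pack_block(code_points):
    """Pack nine 7-bit or three 21-bit code points into one 64-bit integer.

    Assumed layout (not from the original UTF-64 description):
    bit 63 is the type flag, the remaining 63 bits hold the characters,
    first character in the most significant position.
    """
    if len(code_points) == 9 and all(0 <= c < 0x80 for c in code_points):
        flag, width = 0, 7
    elif len(code_points) == 3 and all(0 <= c <= 0x10FFFF for c in code_points):
        flag, width = 1, 21
    else:
        raise ValueError("need nine Basic Latin or three arbitrary code points")
    value = flag
    for c in code_points:
        value = (value << width) | c
    return value


def unpack_block(block):
    """Recover the list of code points from a 64-bit block."""
    flag = block >> 63
    width, count = (21, 3) if flag else (7, 9)
    mask = (1 << width) - 1
    return [(block >> (width * i)) & mask for i in reversed(range(count))]


latin = [ord(ch) for ch in "UTF-64 !!"]   # nine Basic Latin characters
mixed = [0x48, 0x20AC, 0x1F600]           # three code points, not all Basic Latin
latin_block = pack_block(latin)
mixed_block = pack_block(mixed)
```

Note that this sketch says nothing about padding text whose length is not
a multiple of nine or three; how to handle that is left open here.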

Van's sarcastic algorithm brings up a few additional goals to add to my
list:

• code units align with machine boundaries (8, 16, 32 bits)
• unique encoded form for each character
• unique encoded form for each character in context, or for each text
• minimize or avoid state

Remember that one point of this list is to demonstrate that not all
goals can be met by a single encoding.

Speaking of goals, Thomas' claim that UTF-c "avoids over-long forms of
characters" turns out not to be true, since characters belonging to the
selected 64-block can still be encoded using the long form. Encouraging
users to use the shortest form (as UTF-8 does) is not the same as having
a syntax that provides no non-shortest form at all (as with UTF-16 and
UTF-32).
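
As an illustration of that distinction in UTF-8, the two bytes C0 AF fit
the two-byte bit pattern and a naive decoder would read them as U+002F,
but a conforming decoder has to reject them as an over-long form. The
sketch below uses Python's built-in UTF-8 codec; naive_decode_2byte is a
hypothetical helper that deliberately skips the shortest-form check.

```python
def naive_decode_2byte(data):
    """Decode a two-byte UTF-8 pattern WITHOUT checking for over-long forms."""
    lead, trail = data
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


overlong_slash = bytes([0xC0, 0xAF])

# The bit pattern can syntactically express U+002F in two bytes...
naive_result = naive_decode_2byte(overlong_slash)

# ...but a conforming decoder refuses it.
try:
    overlong_slash.decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True
```

In UTF-16 and UTF-32 there is no analogous check to make, because no
second spelling of the same character exists in the first place.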

--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ is dot gd slash 2kf0s

This archive was generated by hypermail 2.1.5 : Mon Feb 28 2011 - 11:17:39 CST