RE: Fwd: Wired 4.09 p. 130: Lost in Translation

From: Murray Sargent (murrays@microsoft.com)
Date: Wed Aug 28 1996 - 16:23:42 EDT


Please note that UTF-8 is a 31-bit standard (not just a 24-bit standard),
so it offers a variable-length byte encoding of all ISO 10646 characters,
let alone all Unicode characters.
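
To make the 31-bit point concrete, here is a minimal sketch of the UTF-8 byte layout as it was defined at the time of this thread (sequences of 1 to 6 bytes covering 0x00000000-0x7FFFFFFF). The function names are made up for illustration. Note that later revisions of UTF-8 restrict the range to U+10FFFF and at most 4 bytes, so Python's built-in codec only agrees with this sketch inside that smaller range.

```python
def utf8_length_31bit(code_point: int) -> int:
    """Bytes needed under the original 1-to-6-byte UTF-8 layout (hypothetical helper)."""
    if not 0 <= code_point <= 0x7FFFFFFF:
        raise ValueError("code point outside the 31-bit range")
    # Payload bits available in 1-, 2-, 3-, 4-, 5- and 6-byte sequences.
    for length, bits in enumerate((7, 11, 16, 21, 26, 31), start=1):
        if code_point < (1 << bits):
            return length
    raise AssertionError("unreachable: the range check above guarantees a match")


def encode_utf8_31bit(code_point: int) -> bytes:
    """Encode one code point with the original 31-bit layout (hypothetical helper)."""
    n = utf8_length_31bit(code_point)
    if n == 1:
        return bytes([code_point])
    # Lead byte starts with n one-bits followed by a zero bit.
    lead_prefix = (0xFF << (8 - n)) & 0xFF
    out = []
    for _ in range(n - 1):
        # Each continuation byte is 10xxxxxx and carries 6 payload bits.
        out.append(0x80 | (code_point & 0x3F))
        code_point >>= 6
    out.append(lead_prefix | code_point)
    return bytes(reversed(out))


if __name__ == "__main__":
    for cp in (0x41, 0x7FF, 0xFFFF, 0x10000, 0x7FFFFFFF):
        print(hex(cp), utf8_length_31bit(cp), encode_utf8_31bit(cp).hex())
```

Everything in 0x0000-0xFFFF fits in at most 3 bytes, which is where the "24-bit" figure in the quoted message comes from; the 4-, 5- and 6-byte forms extend the same scheme to the rest of the 31-bit space.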

Murray
>-----Original Message-----
>From: unicode@Unicode.ORG [SMTP:unicode@Unicode.ORG]
>Sent: Wednesday, August 28, 1996 12:31 PM
>To: unicode@Unicode.ORG
>Subject: Re: Fwd: Wired 4.09 p. 130: Lost in Translation
>
> David> Interesting 16-bit vs. 32-bit issue for characters. (I guess
> David> nobody seriously considered 24-bit characters?)
>
> David> Anyway, I have an even more radical idea. Could Unicode support
> David> variable-length characters, so that one or more Unicode values
> David> would mean "shift"? This would allow quite a number of Chinese
> David> (etc.) characters to be represented in the second Unicode
> David> byte-pair.
>
>Literally speaking, the UTF8 form of Unicode (for the range 0x0000-0xFFFF) is
>a variable length (up to) 24-bit encoding, but does not exhibit the "shift"
>property in the sense you intended.
>
>I can say from experience that handling variable length encodings is as much
>of a pain as handling multiple character sets. Maintenance and debugging are
>annoyingly involved, not to mention other problems like font mapping and
>database issues.
>
>Unicode's answer to the space limitation is UTF16, which basically provides
>an "escape" into a much larger plane.
>
> David> Or am I being way too whimsical?
>
>Reasonable questions, I think. The last 10-15 years have seen numerous
>"shift" and "escape" schemes attempting to solve some of the representation
>problems. Few have survived. Those that survived will be used until a
>clearly superior successor appears on the scene. My personal opinion is that
>Unicode is on the right path to becoming that "clearly superior successor."
>-----------------------------------------------------------------------------
>mleisher@crl.nmsu.edu
>Mark Leisher "A designer knows he has achieved perfection
>Computing Research Lab not when there is nothing left to add, but
>New Mexico State University when there is nothing left to take away."
>Box 30001, Dept. 3CRL -- Antoine de Saint-Exupéry
>Las Cruces, NM 88003
>
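
For reference, the UTF16 "escape" mentioned in the quoted message can be sketched as follows: a code point above 0xFFFF is split into two 16-bit values drawn from reserved ranges (0xD800-0xDBFF and 0xDC00-0xDFFF). The function name is an assumption for illustration, and the upper limit of 0x10FFFF is the range reachable by this mechanism, not something stated in the thread.

```python
def to_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a code point above 0xFFFF into two 16-bit values (hypothetical helper)."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only 0x10000-0x10FFFF need a surrogate pair")
    offset = code_point - 0x10000      # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits go into the first unit
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits go into the second unit
    return high, low


if __name__ == "__main__":
    high, low = to_surrogate_pair(0x10000)
    print(hex(high), hex(low))
```

Because the two reserved ranges do not overlap with each other or with ordinary characters, a reader can tell from any single 16-bit value whether it is a first half, a second half, or a complete character, which is the sense in which this differs from a stateful "shift" code.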


