UTF-16 vs UTF-32 (was IBM AIX 5 and GB18030)

From: Carl W. Brown (cbrown@xnetinc.com)
Date: Thu Nov 14 2002 - 21:03:04 EST

Next message: Doug Ewell: "Re: UTF-16 vs UTF-32 (was IBM AIX 5 and GB18030"

Previous message: Carl W. Brown: "RE: IBM AIX 5 and GB18030"
In reply to: Markus Scherer: "Re: IBM AIX 5 and GB18030"
Next in thread: Doug Ewell: "Re: UTF-16 vs UTF-32 (was IBM AIX 5 and GB18030"
Reply: Doug Ewell: "Re: UTF-16 vs UTF-32 (was IBM AIX 5 and GB18030"
Maybe reply: John McConnell: "RE: UTF-16 vs UTF-32 (was IBM AIX 5 and GB18030"

Markus,

> You seem to suggest that there is a problem with 16-bit Unicode. It does
> take some effort to adapt UCS-2-designed functions for UTF-16, but it's
> not "rocket science" and works very well thanks to the Unicode allocation
> practice (common characters in the BMP). Making UTF-8/32 functions work
> with supplementary code points when they had assumed BMP-only operation
> probably took some work too.

Converting from UCS-2 to UTF-16 is just like converting from SBCS to DBCS.
For folks who already think in DBCS terms it is no problem. Those who went
from DBCS to Unicode to simplify their lives are, I am sure, not happy.
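
To illustrate the comparison: just as DBCS code has to check for a lead byte before it knows how wide a character is, UTF-16 code has to check for a high surrogate. The sketch below is only an illustration, not anything from the thread; the function name `code_points` and the sample data are assumptions, and unpaired surrogates are simply passed through.

```python
def code_points(units):
    """Yield code points from a sequence of UTF-16 code units (integers)."""
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        # A high surrogate followed by a low surrogate forms one character,
        # much like a DBCS lead byte followed by a trail byte.
        if (0xD800 <= u <= 0xDBFF and i + 1 < n
                and 0xDC00 <= units[i + 1] <= 0xDFFF):
            yield 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            # BMP code unit (or an unpaired surrogate, passed through as-is).
            yield u
            i += 1


# 'A', U+10348 (as the surrogate pair D800 DF48), U+4E2D
sample_units = [0x0041, 0xD800, 0xDF48, 0x4E2D]
decoded = list(code_points(sample_units))
```

Four code units decode to three characters here, which is exactly the "count units, not characters" trap that UCS-2-era code falls into.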

I think the worst problem is that many systems still sort in binary order,
not code point order. Then you get Oracle and the like wanting to set up a
UTF-8 variant that encodes each surrogate rather than the character.
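
A small sketch of why the two orders differ, and why a surrogate-by-surrogate UTF-8 variant preserves the UTF-16 binary order. The helper names `utf16_units` and `encode_per_surrogate` are hypothetical; the second one is my reading of the kind of variant described above (the scheme later documented as CESU-8), not Oracle's actual code.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def encode_per_surrogate(s):
    """Encode each UTF-16 code unit separately with the UTF-8 bit layout,
    so a supplementary character becomes two 3-byte sequences."""
    out = bytearray()
    for u in utf16_units(s):
        if u < 0x80:
            out.append(u)
        elif u < 0x800:
            out += bytes([0xC0 | (u >> 6), 0x80 | (u & 0x3F)])
        else:
            out += bytes([0xE0 | (u >> 12),
                          0x80 | ((u >> 6) & 0x3F),
                          0x80 | (u & 0x3F)])
    return bytes(out)


bmp_high = "\uFF5E"           # BMP character above the surrogate range
supplementary = "\U00010348"  # encoded in UTF-16 as D800 DF48
pair = [supplementary, bmp_high]

by_code_point = sorted(pair, key=ord)
by_utf16_binary = sorted(pair, key=utf16_units)
by_utf8_binary = sorted(pair, key=lambda s: s.encode("utf-8"))
by_variant_binary = sorted(pair, key=encode_per_surrogate)
```

Under these assumptions, real UTF-8 sorts the same way as code point order, while both UTF-16 binary order and the per-surrogate variant put the supplementary character first, because D800 compares lower than FF5E.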

However, 16 bit characters were a hard enough sell in the good old days. If
we had started out with 32 bit characters we would still be dreaming about
Unicode.

Carl


This archive was generated by hypermail 2.1.5 : Thu Nov 14 2002 - 21:49:42 EST