Whoops! I made a significant mistake. I wrote:
>> ISO 10646 is 31 bits. All possible values should be allowed.
>> I do not know why Unicode have decided to grow their bits to
>> more than 16 bits, but not to all 31 bits of ISO 10646.
>> But that is no reason to not allow full 31 bits in UTF-8 encoded
>> text.
>
> There IS a reason: to allow all of Unicode to be expressed in UTF-8.
which may have been what caused Dan to reply:
> Yes, UTF-16 was done right. Unfortunately UTF-8 was done wrongly.
> UTF-8 should just like UTF-16 is compatible with code in the 16-bit
> space, been compatible with the first characters of 8 bits.
Of course, I should have said "to allow all of Unicode to be expressed
in UTF-16." UTF-8, at least in its original RFC 2279 incarnation, does
indeed allow the encoding of 31-bit ISO 10646. My bad.
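
To make the difference concrete, here is a rough Python sketch of the two limits. The names utf8_rfc2279_encode and UTF16_MAX are made up for illustration and come from no library. The encoder follows the 1-to-6-byte layout described in RFC 2279 and, to keep it short, makes no attempt to reject surrogate values.

```python
# Highest value reachable with a UTF-16 surrogate pair:
# 0x10000 + (high surrogate offset << 10) + low surrogate offset.
UTF16_MAX = 0x10000 + ((0xDBFF - 0xD800) << 10) + (0xDFFF - 0xDC00)


def utf8_rfc2279_encode(cp: int) -> bytes:
    """Encode one value with the original 1-to-6-byte UTF-8 scheme (illustrative)."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("value does not fit in 31 bits")
    if cp < 0x80:
        return bytes([cp])  # plain ASCII, one byte
    # (exclusive upper limit, lead-byte marker, number of continuation bytes)
    for limit, lead, n in (
        (0x800, 0xC0, 1),
        (0x10000, 0xE0, 2),
        (0x200000, 0xF0, 3),
        (0x4000000, 0xF8, 4),
        (0x80000000, 0xFC, 5),
    ):
        if cp < limit:
            out = [lead | (cp >> (6 * n))]
            # Each continuation byte carries 6 payload bits, high bits first.
            for i in range(n - 1, -1, -1):
                out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
            return bytes(out)
    raise AssertionError("unreachable: range was checked above")


if __name__ == "__main__":
    print(hex(UTF16_MAX))
    print(utf8_rfc2279_encode(UTF16_MAX).hex())
    print(utf8_rfc2279_encode(0x7FFFFFFF).hex())
```

Up to the UTF-16 limit the output should agree with Python's own UTF-8 encoder; beyond it, only the longer 5- and 6-byte forms apply, which is the 31-bit range I should have been talking about.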
-Doug Ewell
Fullerton, California
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:58 EDT