From: Clark Cox (clarkcox3@mac.com)
Date: Fri May 07 2004 - 09:49:28 CDT
On May 07, 2004, at 08:08, Philippe Verdy wrote:
> From: "Chan Fook Sheng" <chanfooksheng@pacific.net.sg>
>> I am looking for Unicode/UTF-8 conversion tools for the Windows platform,
>> but can't find any on the web.
>>
>> can anyone direct me to some links?
>>
>> for example: the "/" character is 47 in decimal, 2F in hex.
>>
>> it can be represented in UTF-8 format as:
>> 1 byte: still 2F
>> 2 bytes: C0 AF (illegal)
>> 3 bytes: E0 80 AF (illegal)
>
> Thanks for keeping the indication that the last two are illegal in UTF-8.
> But it would have been better never to list them (even if some legacy
> converters still accept them, no one should generate them). Note also that
> UTF-8 encoded sequences can be up to 5 bytes long...
How is that possible? I was under the impression that a UTF-8 sequence
could never be more than 4 bytes (i.e. U+10FFFF becomes F4 8F BF BF).
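
For anyone who wants to check the byte sequences in this thread, here is a
minimal sketch using Python's built-in UTF-8 codec. The helper names
utf8_hex and is_valid_utf8 are made up for illustration, and the results
reflect a strict decoder (Python 3.8 or later is assumed), not necessarily
the behaviour of the legacy converters mentioned above.

```python
# Sketch only: uses Python's standard "utf-8" codec as a strict reference.
# utf8_hex and is_valid_utf8 are hypothetical helper names.

def utf8_hex(code_point: int) -> str:
    """Return the UTF-8 encoding of a code point as space-separated hex bytes."""
    return chr(code_point).encode("utf-8").hex(" ").upper()


def is_valid_utf8(data: bytes) -> bool:
    """Return True if a strict UTF-8 decoder accepts the byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(utf8_hex(0x2F))        # 2F
print(utf8_hex(0x10FFFF))    # F4 8F BF BF (the highest code point, 4 bytes)

# The overlong forms of "/" quoted above are rejected.
print(is_valid_utf8(bytes.fromhex("C0 AF")))      # False
print(is_valid_utf8(bytes.fromhex("E0 80 AF")))   # False

# A 5-byte lead byte (F8) is rejected as well.
print(is_valid_utf8(bytes.fromhex("F8 88 80 80 80")))  # False
```

With this decoder, at least, nothing longer than 4 bytes is accepted.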
--
Clark S. Cox III
clarkcox3@mac.com
http://homepage.mac.com/clarkcox3/
http://homepage.mac.com/clarkcox3/blog/B1196589870/index.html