RE: How is UTF8, UTF16 and UTF32 encoded?

From: Theodore H. Smith (delete@softhome.net)
Date: Thu May 30 2002 - 08:04:53 EDT

Previous message: Theodore H. Smith: "Re: Why isnt the posting address on the list?"
Maybe in reply to: Theodore H. Smith: "How is UTF8, UTF16 and UTF32 encoded?"
Next in thread: i18nGuy Tex Texin: "Re: How is UTF8, UTF16 and UTF32 encoded?"
Next in thread: Suzanne M. Topping: "RE: How is UTF8, UTF16 and UTF32 encoded?"
Reply: i18nGuy Tex Texin: "Re: How is UTF8, UTF16 and UTF32 encoded?"

> Many of the explanations of UTF-8 discuss encoding of code points on
> Code Planes 1-16 using the intermediate concept of surrogates as in
> UTF-16. I believe that this is both unnecessary and misleading, as
> UTF-8 is fundamentally a direct 21-bit encoding scheme, as may be seen
> in the attached document. So, I believe that the concept of surrogates
> is not relevant for UTF-8 encoding on Code Planes above the BMP.
>
> This is a slightly different explanation of how UTF-8 works, written
> by me for the Ultracode(r) bar code spec (Ultracode encodes all of
> Unicode 3 directly). If any Unicodotti find any errors in it... please
> let me know!
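
To make the quoted point concrete, here is a rough sketch in Python (the
function names utf8_four_bytes and utf16_surrogates are made up for this
illustration, not taken from the attached document). It packs a code point
from Planes 1-16 into four UTF-8 bytes straight from its 21 bits, and
separately computes the UTF-16 surrogate pair, which the UTF-8 path never
needs.

```python
def utf8_four_bytes(cp):
    """Pack a code point from Planes 1-16 directly into 4 UTF-8 bytes."""
    assert 0x10000 <= cp <= 0x10FFFF
    return bytes([
        0xF0 | (cp >> 18),            # lead byte 11110xxx: top 3 of 21 bits
        0x80 | ((cp >> 12) & 0x3F),   # trail byte 10xxxxxx: next 6 bits
        0x80 | ((cp >> 6) & 0x3F),    # trail byte 10xxxxxx: next 6 bits
        0x80 | (cp & 0x3F),           # trail byte 10xxxxxx: last 6 bits
    ])


def utf16_surrogates(cp):
    """Split the same code point into a UTF-16 high/low surrogate pair."""
    assert 0x10000 <= cp <= 0x10FFFF
    v = cp - 0x10000                  # 20-bit offset above the BMP
    return 0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)


cp = 0x10400
print(utf8_four_bytes(cp).hex(" "))                 # f0 90 90 80
print(["%04X" % u for u in utf16_surrogates(cp)])   # ['D801', 'DC00']
```

The UTF-8 bytes come only from shifts and masks on the code point itself;
the subtraction of 0x10000 and the D800/DC00 ranges appear only on the
UTF-16 side.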

You sent me a file that explains things, but it's in Word format (I
think; it's a .doc) and I don't have MS Word. I have very few MS things,
fortunately. Just MSIE is all.

Thanks anyhow. This whole bit encoding is kind of technical, and I guess
I could do my own calculations and stuff to get some kind of feel for
what the conversion code does to a character, but I was hoping more for
some illustrative examples. Like, let's say we take character XX, and so
first we see how many trailing chars it has like this, and so on, giving
a step-by-step example... Almost like code, but with the intermediate
values listed and explained.
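
Something along these lines is the kind of walk-through meant here: a
minimal sketch in Python that encodes one code point to UTF-8 and records
each intermediate value. The function name utf8_encode_steps and the
wording of the step messages are invented for this example; only the bit
layout comes from the UTF-8 definition.

```python
def utf8_encode_steps(cp):
    """Encode one code point to UTF-8; return (bytes, list of step notes)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %r" % (cp,))

    # Step 1: how many trailing bytes does this code point need?
    if cp < 0x80:
        return bytes([cp]), ["U+%04X fits in 7 bits: one byte, no trailing bytes" % cp]
    if cp < 0x800:
        trailing, lead_mark = 1, 0xC0     # 110xxxxx
    elif cp < 0x10000:
        trailing, lead_mark = 2, 0xE0     # 1110xxxx
    else:
        trailing, lead_mark = 3, 0xF0     # 11110xxx
    steps = ["U+%04X needs %d trailing byte(s)" % (cp, trailing)]

    # Step 2: the lead byte holds the marker plus the topmost bits.
    top = cp >> (6 * trailing)
    lead = lead_mark | top
    steps.append("lead byte: marker %02X | top bits %02X = %02X" % (lead_mark, top, lead))
    out = [lead]

    # Step 3: each trailing byte is 10xxxxxx with the next 6 bits.
    for i in range(trailing - 1, -1, -1):
        six = (cp >> (6 * i)) & 0x3F
        byte = 0x80 | six
        steps.append("trailing byte: 80 | bits %02X = %02X" % (six, byte))
        out.append(byte)
    return bytes(out), steps


encoded, notes = utf8_encode_steps(0x20AC)   # EURO SIGN
for line in notes:
    print(line)
print(encoded.hex(" "))                      # e2 82 ac
```

For U+20AC the notes show two trailing bytes, a lead byte of E2, then
trailing bytes 82 and AC, so the result is E2 82 AC.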

(Once again I almost sent this to ecartis)

--
     Theodore H. Smith - Macintosh Consultant / Contractor.
     My website: <www.elfdata.com/>


This archive was generated by hypermail 2.1.2 : Thu May 30 2002 - 12:25:43 EDT