RE: UTF-8 and UTF-16

From: Marco.Cimarosti@icl.com
Date: Fri Oct 06 2000 - 04:20:36 EDT

Next message: Piotr Trzcionkowski: "Re: UTF-8 and UTF-16"
Previous message: Torsten Mohrin: "Re: FON fonts i18n"
Maybe in reply to: George Zeigler: "UTF-8 and UTF-16"
Next in thread: Piotr Trzcionkowski: "Re: UTF-8 and UTF-16"

I muttered this incomprehensible paragraph:
> - UTF-16 has 16-bit units ("words") and uses 1 or 2 units per
> character. Characters 000000 to 00FFFF use the corresponding
> word; higher values use a pair of "surrogates", the first one
> ("high") being in . It too exists in the same 3 variants as
> bove: little-endian, high-endian, and BOM-marked.

(The passage above demonstrates that even the FAQ of FAQs may be puzzling
if you cut random chunks out of it. ;-) Sorry, I'm under a little bit of
pressure; this is what I meant:

- UTF-16 has 16-bit units ("words") and uses 1 or 2 units per character.
Characters 000000 to 00FFFF use the corresponding word; higher values
(010000 to 10FFFF) use a pair of "surrogates", the first one ("high") being
in the range D800 to DBFF, the second one ("low") in the range DC00 to DFFF.
It too exists in the same 3 variants as above: little-endian, big-endian,
and BOM-marked.
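
For anyone who prefers code to prose, here is a rough Python sketch of the
rule described above. The helper name utf16_units is made up for this
illustration and is not part of any library; the arithmetic (subtract 10000
hex, split the remaining 20 bits into two 10-bit halves) is the usual way the
surrogate ranges are filled, stated here as my own summary rather than a
quotation from the standard.

```python
from typing import List


def utf16_units(code_point: int) -> List[int]:
    """Return the 16-bit units encoding one code point (hypothetical helper)."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate values are not characters")
    if code_point <= 0xFFFF:
        # 000000 to 00FFFF: the unit is the code point itself.
        return [code_point]
    # Higher values: take the 20-bit offset above 10000 hex and split it.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits, lands in D800..DBFF
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits, lands in DC00..DFFF
    return [high, low]


# The three variants, using Python's built-in codecs on U+10400.
text = "\U00010400"
little_endian = text.encode("utf-16-le")  # low byte of each unit first
big_endian = text.encode("utf-16-be")     # high byte of each unit first
bom_marked = text.encode("utf-16")        # BOM first, then native byte order
```

With this sketch, utf16_units(0x10400) gives the pair D801, DC00, and the
little-endian and big-endian byte strings hold the same two units with the
bytes inside each unit swapped.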

_ Marco

