From: Yung-Fong Tang (ftang@netscape.com)
Date: Tue Feb 25 2003 - 17:45:10 EST
So the UTF-8 sequences that represent U+FFFE, U+FFFF, and
U+{1-10}FFF{E,F} (plane numbers in hex) are considered legal in
Unicode 4.0.
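
One quick way to check this is the Python sketch below. It assumes that CPython's strict "utf-8" codec follows the same well-formedness rules; the helper name noncharacters is made up for illustration and is not Mozilla code.

```python
def noncharacters():
    """Yield U+FFFE, U+FFFF and U+nFFFE, U+nFFFF for planes 1..0x10."""
    for plane in range(0x11):
        base = plane << 16
        yield base | 0xFFFE
        yield base | 0xFFFF


for cp in noncharacters():
    encoded = chr(cp).encode("utf-8")          # strict encoder accepts these
    assert encoded.decode("utf-8") == chr(cp)  # strict decoder round-trips them

# Surrogate code points, by contrast, are rejected by the strict encoder.
try:
    chr(0xD800).encode("utf-8")
    surrogate_ok = True
except UnicodeEncodeError:
    surrogate_ok = False
```

The loop only shows that the byte sequences are well formed; whether an application should pass noncharacters through is a separate policy question.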
Kenneth Whistler wrote:
>Frank Tang asked:
>
>>I am working on updating the Mozilla UTF-8 code to incorporate the
>>changes to the UTF-8 definition in Unicode 3.1 (which made non-shortest
>>forms illegal and made 5- and 6-octet sequences illegal) and Unicode 3.2
>>(which made irregular forms illegal). Are there any changes to the UTF-8
>>definition from Unicode 3.2 to Unicode 4.0? If so, I would like to know
>>about them early.
>
>And the answer is no, there is no further change in the definition
>of UTF-8 from Unicode 3.2 to Unicode 4.0.
>
>There is considerable change to the text of the normative
>part of the standard, to systematically incorporate the changes
>to UTF-8, and to put UTF-8, UTF-16, and UTF-32 on an equal
>footing in the text, but there is no further substantive
>change in the definition of UTF-8 past the changes documented
>in UAX #28 for Unicode 3.2.
>
>In particular, all the legal UTF-8 byte sequences documented
>in Table 3.1B in Unicode 3.2 are incorporated exactly the same
>way in the corresponding table in Unicode 4.0.
>
>--Ken
>
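
For reference, the rules discussed above (no non-shortest forms, no 5- or 6-octet sequences, no encoded surrogates) can be sketched as a table-driven check in Python. The ROWS table is my own transcription of the legal byte ranges and should be checked against the table in the standard; the names ROWS and is_well_formed_utf8 are illustrative only.

```python
# Each row: (lead byte low, lead byte high, range of 2nd byte, trailing bytes).
# Third and fourth bytes, when present, are always in 80..BF.
ROWS = [
    (0x00, 0x7F, None, 0),
    (0xC2, 0xDF, (0x80, 0xBF), 1),
    (0xE0, 0xE0, (0xA0, 0xBF), 2),  # excludes non-shortest 3-byte forms
    (0xE1, 0xEC, (0x80, 0xBF), 2),
    (0xED, 0xED, (0x80, 0x9F), 2),  # excludes surrogates D800..DFFF
    (0xEE, 0xEF, (0x80, 0xBF), 2),
    (0xF0, 0xF0, (0x90, 0xBF), 3),  # excludes non-shortest 4-byte forms
    (0xF1, 0xF3, (0x80, 0xBF), 3),
    (0xF4, 0xF4, (0x80, 0x8F), 3),  # excludes values above 10FFFF
]


def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if every sequence in data matches one row of ROWS."""
    i, n = 0, len(data)
    while i < n:
        lead = data[i]
        for lo, hi, second, trail in ROWS:
            if lo <= lead <= hi:
                break
        else:
            return False  # 80..C1 and F5..FF never start a sequence
        if i + trail >= n:
            return False  # truncated sequence
        for k in range(1, trail + 1):
            lo_k, hi_k = second if k == 1 else (0x80, 0xBF)
            if not lo_k <= data[i + k] <= hi_k:
                return False
        i += trail + 1
    return True
```

With this table, b"\xef\xbf\xbe" (U+FFFE) is accepted, while b"\xc0\x80" (non-shortest form), b"\xed\xa0\x80" (an encoded surrogate), and any sequence starting with F8 or above are rejected.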