Re: PDUTR #26 posted

From: DougEwell2@cs.com
Date: Wed Sep 19 2001 - 01:12:49 EDT


David Hopwood and Carl Brown graciously corrected me:

>> I don't agree that irregular UTF-8 sequences in general can only decode to
>> characters above 0xFFFF.
>
> That's why I specifically referred to irregular sequences as defined by
> Unicode 3.1 (i.e. UAX #27).

I stand corrected. That's what I get for not having a copy of UAX #27 handy.

Non-shortest sequences, of course, used to be considered irregular (not
invalid) in Unicode 3.0, before the Technical Committee wisely tightened up
the definition of UTF-8.
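
To make the two cases in this thread concrete, here is a minimal Python sketch. It assumes Python 3's built-in UTF-8 codec, which follows the current, stricter definition and so rejects both kinds of sequence; the helper names (is_rejected, combine_surrogates) and the sample byte strings are illustrative choices, not anything taken from UAX #27 itself. The "surrogatepass" error handler is used only to show what a lenient decoder would recover from the six-byte form.

```python
def is_rejected(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder refuses the bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def combine_surrogates(high: int, low: int) -> int:
    """Combine a high/low surrogate pair into one supplementary code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# Non-shortest form: U+002F '/' written as two bytes instead of one.
non_shortest = b"\xc0\xaf"

# Six-byte form: the surrogates D800 and DC00, each encoded as three bytes.
# This is the kind of sequence the thread calls irregular under Unicode 3.1.
six_byte_pair = b"\xed\xa0\x80\xed\xb0\x80"

# A lenient decode lets the two surrogate halves through as separate code points.
halves = six_byte_pair.decode("utf-8", errors="surrogatepass")
code_point = combine_surrogates(ord(halves[0]), ord(halves[1]))

print(is_rejected(non_shortest))
print(is_rejected(six_byte_pair))
print(hex(code_point))
```

The six-byte form can only ever yield a code point of 0x10000 or higher, since surrogate pairs exist solely to represent characters above 0xFFFF. A non-shortest form, by contrast, can stand for any character, including ASCII.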

-Doug Ewell
 Fullerton, California


