Ian Clifton <ian dot clifton at chem dot ox dot ac dot uk> wrote:
>> Does anyone know why ill-form occurred on the UTF-8? besides it
>> doesn't follow the pattern of UTF-8 byte-sequences, i just
>> wondering how or why?
>
> There’s a lot about the conditions for the well-formedness of UTF-8
> sequences in Chapter 3 of the Standard:
>
> [...]
>
> Even if these conditions hold, however, a UTF-8 sequence might still
> be ill-formed; Table 3-7 exhaustively lists all the cases.
But the bottom line is, there's nothing ill-formed about James' original
example. It's perfectly good UTF-8. The visual similarity between the
digits in U+4E8C and the first and last bytes in <E4 BA 8C> is mostly
coincidental.
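
For anyone who wants to check the arithmetic, here is a minimal Python
sketch of the three-byte UTF-8 form. The helper name encode_3byte is my
own (hypothetical, not from the Standard), and it only covers
U+0800..U+FFFF excluding surrogates. It should show that U+4E8C encodes
to <E4 BA 8C>, and that a neighbor such as U+4E00 encodes to <E4 B8 80>,
where the last byte no longer resembles the last two digits.

```python
def encode_3byte(cp: int) -> bytes:
    """Encode a code point in U+0800..U+FFFF (no surrogates) as three UTF-8 bytes.

    Hypothetical helper for illustration only; real code should use str.encode.
    """
    if not 0x0800 <= cp <= 0xFFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not encodable in the three-byte form")
    return bytes([
        0xE0 | (cp >> 12),          # 1110xxxx: top 4 bits of the code point
        0x80 | ((cp >> 6) & 0x3F),  # 10xxxxxx: middle 6 bits
        0x80 | (cp & 0x3F),         # 10xxxxxx: low 6 bits
    ])

# U+4E8C gives E4 BA 8C, matching Python's built-in encoder.
assert encode_3byte(0x4E8C) == "\u4e8c".encode("utf-8") == b"\xe4\xba\x8c"
# The strict decoder accepts the sequence, so it is well-formed.
assert b"\xe4\xba\x8c".decode("utf-8") == "\u4e8c"
# U+4E00 gives E4 B8 80: the last byte is 80, not 00.
assert encode_3byte(0x4E00) == b"\xe4\xb8\x80"
print(encode_3byte(0x4E8C).hex(" ").upper())
```

The lead byte of a three-byte sequence is always E followed by the first
hex digit of the code point, so that part of the resemblance is
systematic; the last byte matches the last two digits only when those
digits fall in 80..BF.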
--
Doug Ewell | Thornton, Colorado, USA
http://www.ewellic.org | @DougEwell
Received on Tue Dec 11 2012 - 15:18:43 CST