From: Doug Ewell (dewell@roadrunner.com)
Date: Fri Jul 04 2008 - 13:02:05 CDT
Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
>> They are both UTF-16 code units and code points. They are not
>> Unicode scalar values.
>
> I don't think it is correct to say that U+20045 (both a character and
> a code point with scalar value hex 20045) will be created as U+D840
> U+DC45 with UTF-16; the latter indicates TWO separate
> codepoints instead of one, assigned to TWO distinct NON-characters...
I took this straight out of the definitions in the TUS 5.0 book.
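
For anyone who wants to check the arithmetic behind the D840 DC45 pair, here is a minimal sketch in Python. The function names are my own, made up for illustration, and are not from TUS. It assumes the usual UTF-16 surrogate calculation and treats "Unicode scalar value" as any code point outside the surrogate range D800..DFFF.

```python
def to_surrogates(cp):
    """Split a supplementary code point into its two UTF-16 code units."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only supplementary code points need a surrogate pair")
    v = cp - 0x10000                   # 20-bit offset into the supplementary range
    high = 0xD800 + (v >> 10)          # top 10 bits -> high (lead) surrogate
    low = 0xDC00 + (v & 0x3FF)         # bottom 10 bits -> low (trail) surrogate
    return high, low


def is_scalar_value(cp):
    """True for any code point that is not a surrogate code point."""
    return 0 <= cp <= 0x10FFFF and not 0xD800 <= cp <= 0xDFFF


high, low = to_surrogates(0x20045)
print(f"{high:04X} {low:04X}")         # the two UTF-16 code units for U+20045
print(is_scalar_value(0x20045), is_scalar_value(high), is_scalar_value(low))
```

Under these assumptions, D840 and DC45 come out as valid code units and as code points in the surrogate range, but neither passes the scalar-value test, while 20045 does.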
--
Doug Ewell * Arvada, Colorado, USA * RFC 4645 * UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages