Re: What does it mean to "not be a valid string in Unicode"?

From: Markus Scherer <markus.icu_at_gmail.com>
Date: Mon, 7 Jan 2013 11:29:42 -0800

On Mon, Jan 7, 2013 at 10:48 AM, Doug Ewell <doug_at_ewellic.org> wrote:

> Markus Scherer <markus dot icu at gmail dot com> wrote:
>
> > Also, we commonly read code points from 16-bit Unicode strings, and
> > unpaired surrogates are returned as themselves and treated as such
> > (e.g., in collation). That would not be well-formed UTF-16, but it's
> > generally harmless in text processing.
>
> But still non-conformant.
>

Not really. That is why the standard defines a 16-bit Unicode string: a
sequence of 16-bit code units that need not be well-formed UTF-16.
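
To make the behavior under discussion concrete, here is a minimal sketch of
reading code points from a 16-bit Unicode string. It is not taken from ICU or
any other library; the function names code_points and is_well_formed_utf16 are
hypothetical, and the input is assumed to be a Python list of integers that are
each already in the range 0..0xFFFF.

```python
def code_points(units):
    """Yield code points from a sequence of 16-bit code units.

    A lead surrogate immediately followed by a trail surrogate is combined
    into one supplementary code point. Any unpaired surrogate is yielded
    as itself rather than rejected or replaced.
    Assumption: every element of `units` is an int in 0..0xFFFF.
    """
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        i += 1
        if 0xD800 <= u <= 0xDBFF and i < n and 0xDC00 <= units[i] <= 0xDFFF:
            # Well-formed surrogate pair: combine lead and trail.
            u = 0x10000 + ((u - 0xD800) << 10) + (units[i] - 0xDC00)
            i += 1
        yield u


def is_well_formed_utf16(units):
    """Return True if the code unit sequence has no unpaired surrogates."""
    # After pairing, any code point still in the surrogate range was unpaired.
    return not any(0xD800 <= cp <= 0xDFFF for cp in code_points(units))


# 'A', a surrogate pair, a lone lead surrogate, 'B'
sample = [0x0041, 0xD83D, 0xDE00, 0xD800, 0x0042]
print([hex(cp) for cp in code_points(sample)])
print(is_well_formed_utf16(sample))
```

The lone 0xD800 comes back as the code point D800, so a caller such as a
collator can still assign it some weight and continue, while the second
function shows that the same sequence would not pass as well-formed UTF-16.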

markus
Received on Mon Jan 07 2013 - 13:32:14 CST
