From: Andrew Lipscomb (ewwa@chattanooga.net)
Date: Mon Jun 01 2009 - 09:01:59 CDT
> This quote says that which code points are invalid depends on how
> you read the standard; perhaps someone here can clarify :-):
> http://en.wikipedia.org/wiki/UTF-8#Invalid_code_points
>
> In particular, it would be great to know if the range
> U+0080..U+009F is invalid.
>
> Hans Aberg
Those code points (encoded properly) are valid. However, their
appearance may indicate that an error occurred in processing, as
the C1 controls would be rare in real Unicode text (and, with the
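exception of U+0085, are discouraged in XML). They most often
arise when Windows-1252 text is decoded as if it were ISO-8859-1
(Latin-1): Windows-1252 assigns printable characters such as curly
quotes to most of the bytes 0x80-0x9F, which ISO-8859-1 as commonly
implemented maps to the C1 controls.
In other words, not invalid, but suspicious.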
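
To illustrate the point, here is a minimal Python sketch of how the C1
controls typically sneak in, and how one might flag them. The helper
name find_c1_controls and the sample string are made up for this
example; only the standard "cp1252" and "latin-1" codecs are assumed.

```python
# Range of the C1 control code points, U+0080..U+009F.
C1_FIRST, C1_LAST = 0x80, 0x9F

def find_c1_controls(text):
    """Return (index, code point) pairs for every C1 control in text.

    Hypothetical helper for illustration: a hit does not mean the text
    is invalid, only that it deserves a second look.
    """
    return [(i, ord(ch)) for i, ch in enumerate(text)
            if C1_FIRST <= ord(ch) <= C1_LAST]

# Windows-1252 bytes for a word in curly quotes (0x93 and 0x94).
raw = "\u201cquoted\u201d".encode("cp1252")

wrong = raw.decode("latin-1")  # bytes mislabelled as ISO-8859-1
right = raw.decode("cp1252")   # bytes decoded with the correct charset

print(find_c1_controls(wrong))  # [(0, 147), (7, 148)]
print(find_c1_controls(right))  # []

# The C1 code points are still valid: they encode to well-formed UTF-8.
print(wrong.encode("utf-8"))    # b'\xc2\x93quoted\xc2\x94'
```

The mis-decoded string round-trips through UTF-8 without complaint,
which is why a check like this has to be a heuristic warning rather
than a validity test.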
------------------------------------------------------via webmail----
Andrew Lipscomb
ewwa@chattanooga.net