> Mark Davis <mark@macchiato.com> wrote:
>
> > - when one of the BOM-allowing UTFs starts with a BOM, you know the
> > encoding*, and you strip off the BOM when you get the content.
> >
> > *assuming that no UTF-16 file has U+0000 as the first character.
>
> In the real world, this is a pretty good assumption -- almost as good,
> in fact, as the one I've been stating for years: that no Unicode file
> will have a zero-width no-break space (intended as such) as the first
> character.

A simple test page of UTF-16 encoded U+0000...U+00FF comes to mind. But yes,
I'm being mean.

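To make the U+0000 caveat concrete, here is a minimal Python sketch of
BOM sniffing. It is only an illustration under my own assumptions, not
anyone's reference implementation: the function name sniff_bom and the
table _BOMS are hypothetical, and the BOM constants come from Python's
standard codecs module. The longer BOMs are checked first, because the
UTF-32LE BOM (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).

```python
import codecs

# Longest BOMs first: the UTF-32LE BOM (FF FE 00 00) starts with the
# UTF-16LE BOM (FF FE), so a UTF-16LE check must come after UTF-32LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding name or None, data with any leading BOM removed).

    Hypothetical helper: it assumes no UTF-16 text has U+0000 as its
    first character, which is exactly the assumption under discussion.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, data[len(bom):]
    return None, data
```

Feeding it the test page shows where the assumption bites: a UTF-16LE
file holding a BOM followed by U+0000...U+00FF starts with the bytes
FF FE 00 00, so the sniffer above reports it as UTF-32LE. The same page
in UTF-16BE (FE FF 00 00) is not affected, since the UTF-32BE BOM is
00 00 FE FF.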
This archive was generated by hypermail 2.1.2 : Thu Apr 11 2002 - 11:26:20 EDT