RE: MS/Unix BOM FAQ again (small fix)

From: jarkko.hietaniemi@nokia.com
Date: Thu Apr 11 2002 - 12:21:44 EDT


> Mark Davis <mark@macchiato.com> wrote:
>
> > - when one of the BOM-allowing UTFs starts with a BOM, you know the
> > encoding*, and you strip off the BOM when you get the content.
> >
> > *assuming that no UTF-16 file has U+0000 as the first character.
>
> In the real world, this is a pretty good assumption -- almost as good,

A simple test page of UTF-16 encoded U+0000..U+00FF comes to mind: in
little-endian form with a BOM it starts FF FE 00 00, which is also the
UTF-32LE BOM. But yes, I'm being mean.
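
For the record, a minimal sketch of the kind of BOM sniffing under
discussion, in Python. The names sniff_bom and BOMS are made up for
illustration; only the BOM constants from the standard codecs module are
real. It assumes the sniffer tests the longer UTF-32 signatures before the
UTF-16 ones, which is exactly where the footnoted assumption comes in.

```python
import codecs

# Longest signatures first: the UTF-32LE BOM (FF FE 00 00) begins with the
# UTF-16LE BOM (FF FE), so it has to be tested before UTF-16LE.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, content with the BOM stripped), or (None, data)."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding, data[len(bom):]
    return None, data


# The mean test page: BOM + U+0000..U+00FF, encoded as little-endian UTF-16.
page = codecs.BOM_UTF16_LE + "".join(map(chr, range(0x100))).encode("utf-16-le")

# An ordinary UTF-16LE file whose first character is not U+0000.
plain = codecs.BOM_UTF16_LE + "A".encode("utf-16-le")
```

With this ordering sniff_bom(plain) reports utf-16-le and strips two bytes,
while sniff_bom(page) reports utf-32-le and strips four, eating the U+0000.
Testing UTF-16 first would not help: real UTF-32LE files would then be
misread as UTF-16LE.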

> in fact, as the one I've been stating for years: that no Unicode file
> will have a zero-width no-break space (intended as such) as the first
> character.
>


