From: Doug Ewell (dewell@adelphia.net)
Date: Sat Feb 15 2003 - 19:38:28 EST
Tom Gewecke <tom at bluesky dot org> wrote:
> --U+FEFF can appear (presumably by accident) at the beginning of any
> web page, but aside from those two cases where it is necessary, it is
> a ZWNBS and not a BOM. (As Michael pointed out, Mac IE 5.2.2 displays
> a Euro symbol).
But as I wrote earlier, a zero-width no-break space at the start of a
Web page should not disrupt the content or layout of the page in any
way. It's a space. It's zero-width. A Euro symbol is non-conformant
and just plain wrong; the page starts with the bytes EF BB BF and is
clearly marked as being UTF-8.
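
To make the byte-level point concrete, here is a minimal Python sketch. The sample page bytes are invented for illustration; only the leading EF BB BF matters. It shows that those three bytes, decoded as UTF-8, are the single character U+FEFF, which a decoder can either pass through as an invisible space or drop as a signature.

```python
# Hypothetical page content; only the leading EF BB BF is of interest.
page_bytes = b"\xef\xbb\xbf<html><body>Hello</body></html>"

# Decoding as plain UTF-8 keeps the leading character in the text.
text = page_bytes.decode("utf-8")
first_char = text[0]
print(f"U+{ord(first_char):04X}")  # U+FEFF

# Python's "utf-8-sig" codec treats a leading U+FEFF as a signature and drops it.
stripped = page_bytes.decode("utf-8-sig")
print(stripped.startswith("<html>"))  # True
```

Either way, nothing in those bytes should turn into a visible glyph when the page is read as UTF-8.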
> Suppose a page has no charset/encoding specified in the markup. Does
> the presence of U+FEFF mean it should be presumed to be UTF-16? Some
> of my browsers behave this way.
Actually, it is the presence of the bytes FF FE or FE FF that matters. You can't tell
whether they mean U+FEFF until you've decided what the encoding is.
IMHO, and I believe the HTML spec agrees, initial FE FF or FF FE is a
*really* strong hint of UTF-16ness that should not be casually
overlooked.
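
A rough sketch of that kind of sniffing, in Python, might look like the following. The function name `sniff_bom` is hypothetical, and this is not how any particular browser does it. As a simplification it ignores UTF-32, whose little-endian signature FF FE 00 00 also begins with FF FE.

```python
from typing import Optional

def sniff_bom(data: bytes) -> Optional[str]:
    """Guess an encoding from the leading bytes alone, or return None.

    Hypothetical helper for illustration; UTF-32 is deliberately ignored.
    """
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"
    # No signature: fall back to whatever the markup or transport declares.
    return None

# "<" encoded as UTF-16LE, preceded by the bytes FF FE.
sample = b"\xff\xfe<\x00"
encoding = sniff_bom(sample)
print(encoding)  # utf-16-le

# Only after the encoding is chosen can the first character be called U+FEFF.
print(f"U+{ord(sample.decode(encoding)[0]):04X}")  # U+FEFF
```

The order of operations is the point: the bytes suggest an encoding first, and only then does it make sense to speak of U+FEFF.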
-Doug Ewell
Fullerton, California