From: Doug Ewell (dewell@adelphia.net)
Date: Sun Sep 24 2006 - 17:02:13 CST
Addison Phillips <addison at yahoo dash inc dot com> wrote:
> The BOM is often rendered in the page, throwing off other display
> elements. One common problem on Windows is the prevalence of editors
> (Notepad!!) that add the UTF-8 BOM to text files stored as "UTF-8".
> While one might expect this to act as a "no-op" character, in
> practice, it isn't.
It should, though. A process that claims to be able to "support
Unicode" should at least be able to follow the simple rule, "If the file
or stream starts with EF BB BF, throw them away and treat the remainder
of the file or stream as UTF-8."
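
A minimal sketch of that rule in Python, for illustration only. The function names strip_utf8_bom and decode_utf8 are hypothetical; codecs.BOM_UTF8 is the standard-library constant for the bytes EF BB BF.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop one leading EF BB BF signature, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def decode_utf8(data: bytes) -> str:
    """Discard the signature, then treat the remainder as UTF-8."""
    return strip_utf8_bom(data).decode("utf-8")
```

Python's built-in "utf-8-sig" codec behaves the same way when decoding, so data.decode("utf-8-sig") would be the shorter route in practice.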
Even the W3C FAQ says: "In some browsers, the presence of a UTF-8
signature will cause the browser to interpret the text as UTF-8
regardless of any character encoding declarations to the contrary."
That's exactly what it should do.
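
The precedence described in that FAQ sentence could be sketched roughly as follows. This is an assumption-laden illustration, not how any particular browser is implemented: choose_encoding, decode_page, and the "utf-8" fallback default are all hypothetical.

```python
import codecs
from typing import Optional


def choose_encoding(data: bytes, declared: Optional[str] = None,
                    default: str = "utf-8") -> str:
    """Pick an encoding; a UTF-8 signature wins over any declaration."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # No signature: fall back to the declared encoding, then to the
    # default (the default value here is an assumption).
    return declared if declared else default


def decode_page(data: bytes, declared: Optional[str] = None) -> str:
    """Decode using the chosen encoding, dropping the signature if present."""
    encoding = choose_encoding(data, declared)
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.decode(encoding)
```

With this ordering, a page that starts with EF BB BF but is declared as ISO-8859-1 is still decoded as UTF-8, and the signature never reaches the rendered text.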
The argument about accidentally throwing away a U+FEFF that was intended
as a ZERO WIDTH NO-BREAK SPACE (ZWNBSP) is becoming increasingly
irrelevant; U+2060 WORD JOINER has been recommended over ZWNBSP for over
4 years now, and few applications used ZWNBSP anyway.
--
Doug Ewell
Fullerton, California, USA
http://users.adelphia.net/~dewell/
RFC 4645 * UTN #14