Re: Names for UTF-8 with and without BOM

From: Doug Ewell (dewell@adelphia.net)
Date: Sat Nov 02 2002 - 16:27:17 EST

    Tex Texin <tex at i18nguy dot com> wrote:

    > I didn't think the XML standard allowed for utf-8 files to have a BOM.
    > The standard is quite clear about requiring 0xFEFF for utf-16.
    > I would have thought a proper parser would reject a non-utf-16 file
    > beginning with something other than "<".

    The standard explicitly allows UCS-4, UTF-16, and UTF-8 files to begin
    with a BOM. See Appendix F.1, "Detection Without External Encoding
    Information":

    http://www.w3.org/TR/REC-xml#sec-guessing
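
    As a rough illustration of the BOM cases that appendix lists, here is
    a minimal Python sketch. The function name, the return shape, and the
    labels are assumptions made for this example and are not taken from
    the XML Recommendation or from any library. It covers only inputs
    that start with a BOM and ignores the no-BOM rows of the table.

```python
# Hypothetical helper for illustration only: the name, labels, and
# return shape are assumptions, not part of the XML Recommendation.
_SIGNATURES = [
    # Four-byte UCS-4 signatures are tested before the two-byte UTF-16
    # ones, because FF FE 00 00 and FE FF 00 00 also begin with a
    # UTF-16 BOM.
    (b"\x00\x00\xfe\xff", "UCS-4 (big-endian, 1234)"),
    (b"\xff\xfe\x00\x00", "UCS-4 (little-endian, 4321)"),
    (b"\x00\x00\xff\xfe", "UCS-4 (unusual order, 2143)"),
    (b"\xfe\xff\x00\x00", "UCS-4 (unusual order, 3412)"),
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\xfe\xff", "UTF-16 (big-endian)"),
    (b"\xff\xfe", "UTF-16 (little-endian)"),
]


def detect_bom(data):
    """Return (label, bom_length) for a leading BOM, or (None, 0)."""
    for signature, label in _SIGNATURES:
        if data.startswith(signature):
            return label, len(signature)
    # No BOM found: a real parser would go on to inspect the bytes of
    # "<?xml" as the appendix describes; that step is omitted here.
    return None, 0
```

    With this sketch, detect_bom(b"\xef\xbb\xbf<?xml version='1.0'?>")
    reports UTF-8 with a three-byte signature, which is the case asked
    about above; input starting directly with "<" yields (None, 0).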

    -Doug Ewell
     Fullerton, California


