Re: Names for UTF-8 with and without BOM

From: Tex Texin (tex@i18nguy.com)
Date: Sat Nov 02 2002 - 14:08:57 EST


    "Michael (michka) Kaplan" wrote:
    > > .xml UTF-8N Some XML processors may not cope with BOM
    >
    > Maybe they need to upgrade? Since people often edit the files in notepad,
    > many files are going to have it. A parser that cannot accept this reality is
    > not going to make it very long.

    I didn't think the XML standard allowed UTF-8 files to have a BOM.
    The standard is quite clear about requiring 0xFEFF for UTF-16.
    I would have thought a proper parser would reject a non-UTF-16 file
    beginning with anything other than "<".

    (The fact that Notepad puts it there should be irrelevant.)

    Am I wrong about XML and the UTF-8 signature?
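
    To make the question concrete, here is a minimal Python sketch of the
    check being discussed: look at the first bytes of a file and report
    whether it starts with the UTF-8 signature, a UTF-16 BOM, or "<".
    The function name sniff_xml_start and its return labels are invented
    for illustration. UTF-32 is not considered, and the sketch takes no
    position on what the XML standard actually permits.

```python
import codecs


def sniff_xml_start(data: bytes) -> str:
    """Classify the leading bytes of a document (hypothetical helper)."""
    # UTF-8 signature: U+FEFF encoded as EF BB BF.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 signature"
    # UTF-16 byte order marks: FE FF (big-endian) or FF FE (little-endian).
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16 big-endian BOM"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16 little-endian BOM"
    # No BOM at all: the file begins directly with markup.
    if data.startswith(b"<"):
        return "no BOM, starts with '<'"
    return "other"


if __name__ == "__main__":
    # A file as Notepad would save it: signature first, then the declaration.
    print(sniff_xml_start(codecs.BOM_UTF8 + b"<?xml version='1.0'?>"))
    # The same file without the signature.
    print(sniff_xml_start(b"<?xml version='1.0'?>"))
```

    A parser that is strict in the way described above would accept only
    the UTF-16 cases and the "<" case; a lenient one would also skip the
    three signature bytes and carry on.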

    tex

    -- 
    -------------------------------------------------------------
    Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
    Xen Master                          http://www.i18nGuy.com

    XenCraft                            http://www.XenCraft.com
    Making e-Business Work Around the World
    -------------------------------------------------------------
    

