Re: Names for UTF-8 with and without BOM

From: Tex Texin (
Date: Sat Nov 02 2002 - 21:17:02 EST


    Doug Ewell wrote:
    > Tex Texin <tex at i18nguy dot com> wrote:
    > > However, I didn't realize that parsers were to allow for the
    > > possibility of different signatures.
    > > So a parser has to worry about scsu signatures, etc....
    > A parser only *has* to read UTF-8 without signature and UTF-16 with
    > signature.

    Yes, I thought so until I saw Michka's note. And I thought that gave me
    100% utf-8 coverage.
    Apparently I would be leaving out the thousands ;-) that edit xml with

    > It *may* read other encodings of its own choosing, including
    > ISO 8859-1, SCSU, JOECODE, or US-BSCII. (However, I can't find anything
    > that allows for SCSU with signature, which is a shame since UTS #6
    > encourages the signature.)

    Can I stand on the other side of the fence now and refer to market
    forces when it comes to ISO 8859 etc. ? ;-)

    Anyway, I think you understood the context of my whines: it was just a
    reaction to this silliness with open-ended signatures...
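
For illustration only, the minimum check discussed above (UTF-8 without a signature, UTF-16 with one) might be sketched as below. The function names `sniff_signature` and `decode_document` are hypothetical, and accepting a UTF-8 signature as well is an assumption made here for the Notepad-style case, not something the thread settles. Other signatures (SCSU, UTF-32, and so on) are deliberately not handled, which is exactly the open-ended part.

```python
import codecs

# Signatures (byte order marks) this sketch recognises.
# Assumption: a UTF-8 signature is tolerated and skipped.
SIGNATURES = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def sniff_signature(data: bytes):
    """Return (encoding, signature_length) based on the leading bytes."""
    for bom, name in SIGNATURES:
        if data.startswith(bom):
            return name, len(bom)
    # No recognised signature: fall back to UTF-8 without signature.
    return "utf-8", 0


def decode_document(data: bytes) -> str:
    """Decode a document, skipping any recognised signature."""
    encoding, skip = sniff_signature(data)
    return data[skip:].decode(encoding)
```

A caller would pass the raw bytes of the document to `decode_document`; anything beyond these three signatures would need its own entry in the table and a decoder to go with it.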


    Tex Texin   cell: +1 781 789 1898
    Xen Master                
    Making e-Business Work Around the World
