RE: Several BOMs in the same file

From: Kent Karlsson (kentk@md.chalmers.se)
Date: Tue Mar 25 2003 - 11:43:34 EST


    > In that case, removing the BOM that would end up somewhere in the
    > middle is the natural thing to do, just as removing the EOF marker
    > at the end of the first file is.

    There is no "EOF marker" at the end of a file, at least not in
    modern file systems. There is no NUL, CTRL-Z, CTRL-D, or anything
    similar signifying the end of a file. Such "characters" can be part
    of a file, though, including text files, and not just at the end
    but anywhere.
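
    To illustrate the point, here is a minimal sketch in Python (not
    part of the original exchange). It assumes an ordinary modern file
    system and binary-mode I/O; the helper names are hypothetical. It
    writes NUL, CTRL-Z and CTRL-D octets into a temporary file and
    reads them back, and the length comes from the file system, not
    from any marker octet.

```python
import os
import tempfile


def read_all_octets(path):
    """Read a file in binary mode, attaching no meaning to any octet."""
    with open(path, "rb") as f:
        return f.read()


def demo_no_eof_marker():
    """Write control octets into a file and read the whole file back.

    Returns (payload, size_on_disk, data_read) so the caller can compare.
    """
    # CTRL-Z (0x1A) and NUL (0x00) in the middle, CTRL-D (0x04) at the end.
    payload = b"first\x1asecond\x00third\x04"
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        # The size is recorded by the file system; no marker octet is added.
        size = os.path.getsize(path)
        data = read_all_octets(path)
    finally:
        os.remove(path)
    return payload, size, data
```

    If the file system appended or interpreted an end marker, the size
    or the octets read back would differ from what was written.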

    > I'm not going into the implementation part; just pointing out that
    > this issue is not something an operating system can ignore.

    "cat" and "cp" can and shall ignore it. They are octet-level
    file operations, attaching no semantics to the octets. For
    encoding-aware conversion, try "iconv".
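
    The distinction can be sketched in Python (an illustration only,
    not how "cat" or "iconv" are implemented; both function names are
    hypothetical). The first function joins octets the way an
    octet-level tool would, so every BOM survives. The second decodes
    each input as UTF-8 text first, which is the level at which a BOM
    has any meaning, and here chooses to emit no BOM at all.

```python
import codecs

BOM = codecs.BOM_UTF8  # the three octets EF BB BF


def cat_octets(*chunks):
    """Octet-level concatenation: no octet is interpreted or removed."""
    return b"".join(chunks)


def join_as_text(*chunks):
    """Encoding-aware concatenation, assuming every input is UTF-8.

    The "utf-8-sig" codec drops one leading BOM per input if present.
    Emitting the result without a BOM is a policy choice of this sketch.
    """
    text = "".join(chunk.decode("utf-8-sig") for chunk in chunks)
    return text.encode("utf-8")
```

    With two inputs that each start with a BOM, cat_octets() returns
    data containing both BOMs, one of them in the middle, while
    join_as_text() returns the text of both inputs with no BOM.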

                    /kent k

    > Pim Blokland


