Re: Several BOMs in the same file

From: Jungshik Shin (jshin@mailaps.org)
Date: Wed Mar 26 2003 - 02:41:28 EST

    Marco Cimarosti wrote:

    >Kent Karlsson wrote:
    >
    >
    >>>I'm not going into the implementation part; just pointing out that
    >>>this issue is not something an operating system can ignore.
    >>>
    >>>
    >>"cat" and "cp" can and shall ignore it. They are octet-level
    >>file operations, attaching no semantics to the octets. Try "iconv".
    >>
    >>
    >
    >This byte-level operation is just the default behavior. This basic
    >behavior should remain the default, of course.
    >
    >However, there are already a lot of options specific to text files that
    >*do* attach character semantics to octets, such as the "-n" option to number
    >output lines:
    >
    >
      There are, and BOM handling could arguably be within its reach, but
    expecting poor cat(1) to do codeset conversion goes squarely against
    the Unix philosophy of doing one thing well.
      Why bother when you can just use iconv(1) for the job, combined with
    other tools?
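
As a rough illustration of the "other tools" idea, here is a minimal sketch of a separate small filter that joins files and drops the leading BOM from each one, so that cat(1) itself stays a plain octet-level tool. It assumes the inputs are already UTF-8; the function name `concat_without_boms` and the script name `catbom.py` are made up for this example and are not existing utilities.

```python
import sys

# The UTF-8 encoding of U+FEFF, as it appears at the start of a file.
UTF8_BOM = b"\xef\xbb\xbf"


def concat_without_boms(chunks):
    """Join byte strings, dropping a leading UTF-8 BOM from each one.

    Only a BOM at the very start of a chunk is removed; the same three
    octets anywhere else are left untouched.
    """
    out = bytearray()
    for data in chunks:
        if data.startswith(UTF8_BOM):
            data = data[len(UTF8_BOM):]
        out += data
    return bytes(out)


if __name__ == "__main__":
    # Hypothetical usage: python catbom.py a.txt b.txt > joined.txt
    parts = []
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            parts.append(f.read())
    sys.stdout.buffer.write(concat_without_boms(parts))
```

Codeset conversion is deliberately left out of the sketch. Files in another encoding would first go through iconv(1), for instance `iconv -f UTF-16 -t UTF-8`, and only then be joined.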

    Jungshik


