Re: Names for UTF-8 with and without BOM

From: Doug Ewell (dewell@adelphia.net)
Date: Sun Nov 03 2002 - 16:20:36 EST

    Mark Davis <mark dot davis at jtcsv dot com> wrote:

    > Little probability that right double quote would appear at the start
    > of a document either. Doesn't mean that you are free to delete it
    > (*and* say that you are not modifying the contents).

    True, but right double quote:

    (a) has a visible glyph with a well-defined human-readable meaning,
    (b) isn't defined by Unicode as having a text-processing influence on
    adjoining characters (which leaves wide open the question of what to do
    when there are fewer than two adjoining characters),
    (c) doesn't have a second meaning as a signature that under certain
    conditions can be stripped.
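
    To make point (c) concrete, here is a minimal Python sketch of
    signature handling, assuming the common convention that only a
    leading EF BB BF sequence counts as a signature. The helper name
    strip_utf8_signature is hypothetical; codecs.BOM_UTF8 and the
    "utf-8-sig" codec come from the Python standard library. An interior
    U+FEFF is left untouched, since there it is content rather than a
    signature.

```python
import codecs


def strip_utf8_signature(data: bytes) -> bytes:
    """Drop a leading UTF-8 signature (EF BB BF) if present.

    Hypothetical helper: only the first three bytes are examined, so a
    U+FEFF that occurs later in the data is preserved as content.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# A document that starts with a signature, followed by a right double quote.
with_signature = codecs.BOM_UTF8 + "\u201dquoted".encode("utf-8")

# A document with U+FEFF in the middle, where it is not a signature.
interior = "a\ufeffb".encode("utf-8")

stripped = strip_utf8_signature(with_signature)
untouched = strip_utf8_signature(interior)
```

    The standard "utf-8-sig" codec follows the same convention when
    decoding: it skips a BOM at the start of the data and passes any
    later U+FEFF through unchanged.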

    > I agree that when the UTC decides that a BOM is *only* to be used as a
    > signature, and that it would be ok to delete it anywhere in a document
    > (like a non-character), then we are in much better shape. This was, as
    > a matter of fact proposed for 3.2, but not approved. If we did that
    > for 4.0, then there would be much less reason to distinguish UTF-8
    > 'withBOM' from UTF-8 'withoutBOM'.

    Every one of us will be grateful when that day comes.

    -Doug Ewell
     Fullerton, California


