Re: PRODUCING and DESCRIBING UTF-8 with and without BOM

From: Michael (michka) Kaplan (michka@trigeminal.com)
Date: Mon Nov 04 2002 - 11:07:47 EST

Next message: Michael Everson: "Re: Header Reply-To"

Previous message: Otto Stolz: "Re: `` ", ` '"
In reply to: Joseph Boyle: "RE: PRODUCING and DESCRIBING UTF-8 with and without BOM"
Next in thread: Edward H Trager: "Re: PRODUCING and DESCRIBING UTF-8 with and without BOM"
Reply: Edward H Trager: "Re: PRODUCING and DESCRIBING UTF-8 with and without BOM"

Joseph,

> Software currently under development could use the identifiers for choosing
> whether to require or emit BOM, like the file requirements checker I have to
> write, and ICU/uconv.

Let's separate that into the two issues it represents:

EMITTING: They could simply choose globally whether to emit the BOM or not.
If they wanted to get "fancy" they could have a command-line option that
says whether to emit the bytes or not. But that is optional.
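To make the emitting side concrete, here is a minimal sketch in Python. The names EMIT_BOM and encode_utf8 are made up for illustration and do not come from ICU/uconv or any other tool; only codecs.BOM_UTF8 is a real standard-library constant.

```python
import codecs
from typing import Optional

# Hypothetical program-wide default: the "global decision" on emitting a BOM.
EMIT_BOM = False


def encode_utf8(text: str, emit_bom: Optional[bool] = None) -> bytes:
    """Encode text as UTF-8, optionally prefixed with the three BOM bytes.

    emit_bom=None falls back to the global default; a command-line
    option could pass True or False here to override it.
    """
    if emit_bom is None:
        emit_bom = EMIT_BOM
    data = text.encode("utf-8")
    # codecs.BOM_UTF8 is the byte string EF BB BF.
    return codecs.BOM_UTF8 + data if emit_bom else data
```

A caller that never cares just calls encode_utf8(text) and gets whatever the global setting says; the "fancy" variant is a single extra argument.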

INCOMING TEXT: Trivial to simply check. I say (once again) it's THREE BYTES.
If they are there then there is a BOM. Simple.
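The check on incoming text can be sketched the same way, again in Python. The function names has_utf8_bom and strip_utf8_bom are hypothetical; the three bytes in question are EF BB BF, available in the standard library as codecs.BOM_UTF8.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the three-byte UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading BOM; BOM-less input is returned unchanged."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data
```

A checker that has to accept both forms can call strip_utf8_bom on everything it reads and decode the result as plain UTF-8; one that has to require or forbid the BOM only needs has_utf8_bom.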

> The inability to update to one standard all possible consuming software one
> might encounter (or for that matter human customers' opinions) is precisely
> why producing and checking software has to handle both possibilities.

But the "both possibilities" are trivial and it's by no means difficult to do.
Having a good program that refuses to do a little work to handle three bytes
is like someone who runs a 100 mile marathon and then refuses to cross the
finish line because the line is yellow instead of white.

> What would you mean by "the right thing" as far as emitting BOM? Should file
> conversion programs only allow output of non-BOM? (or with-BOM?) Or should
> they take the specification in an argument separate from the charset name?
> As said before this unnecessarily requires extra logic.

Already answered --- they can make a global decision, like Notepad or other
programs do. Especially if the programmer finds the idea of setting it a
huge hardship, they can skip that work and simply choose whether they want
it or not....

I plead with you -- keep it SIMPLE. :-)

MichKa


This archive was generated by hypermail 2.1.5 : Mon Nov 04 2002 - 11:53:13 EST