From: Hans Aberg (haberg@math.su.se)
Date: Thu Jan 20 2005 - 18:52:37 CST
On 2005/01/20 20:16, Richard T. Gillam at rgillam@las-inc.com wrote:
> Good grief. We seem to be going through another round of "night of the
> living thread."
Have you only found that out now? :-)
> This discussion has degenerated horribly from its original roots. I
> really don't think it's productive to get into "UTF-x is better than
> UTF-y" battles here, and I wish we could find a way to put a stop to it.
> It's like programming-language wars-- people go round and round and
> there's never any resolution. Suffice it to say that there are good,
> valid reasons to use each of the UTFs and good, valid reasons not to use
> each of the UTFs. Depending on your particular situation, any of the
> three might be the best fit. There's a reason all three exist.
At least for now. UTF-16 cannot be extended beyond the current range, but
UTF-8 and UTF-32 can both be extended to 2^32 values, the size of a natural
type. Even though UTF-16 has a distinct legacy advantage today, it likely
will not keep it in the long run. So deprecating it seems a distinct
possibility.
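To make the range limit concrete, here is a minimal sketch of the surrogate
arithmetic, assuming the standard surrogate ranges D800-DBFF and DC00-DFFF.
The function names are made up for illustration and are not from any
library:

```python
# Surrogate ranges as fixed by the UTF-16 encoding form.
HIGH_START, HIGH_END = 0xD800, 0xDBFF   # 1024 high (lead) surrogates
LOW_START, LOW_END = 0xDC00, 0xDFFF     # 1024 low (trail) surrogates


def utf16_max_code_point():
    """Largest code point reachable with one surrogate pair."""
    pairs = (HIGH_END - HIGH_START + 1) * (LOW_END - LOW_START + 1)
    # Pairs address the code points that start right after the BMP.
    return 0x10000 + pairs - 1


def encode_surrogates(cp):
    """Split a supplementary code point into (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not representable as a surrogate pair")
    v = cp - 0x10000                     # 20-bit offset
    return HIGH_START + (v >> 10), LOW_START + (v & 0x3FF)


print(hex(utf16_max_code_point()))       # 0x10ffff
```

The 20-bit offset is fully used up by the two 10-bit halves, so there is no
room left to grow without changing the encoding form itself.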
> Similarly, I don't think endless discussion of the BOM is productive. I
> think most people would agree that out-of-band methods of specifying the
> encoding scheme are preferable to using the BOM, but they're not always
> available. The BOM is what it is and it's not going away.
Well, in UTF-8 the requirement that processes ignore it has to go away:
either Unicode removes it from the standard, or one will see that people
just don't bother following the Unicode standard in that respect.
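As an illustration of how differently processes may treat a leading BOM in
UTF-8, here is a small sketch using Python's standard codecs. The sample
bytes are invented for the example; "utf-8" and "utf-8-sig" are the two
stock codec names:

```python
import codecs

# Hypothetical input: a UTF-8 stream that starts with the BOM (EF BB BF).
data = codecs.BOM_UTF8 + "hello".encode("utf-8")

# The plain codec keeps the BOM as an ordinary U+FEFF character.
plain = data.decode("utf-8")

# The "-sig" variant treats a leading BOM as a signature and drops it.
stripped = data.decode("utf-8-sig")

print(repr(plain))      # '\ufeffhello'
print(repr(stripped))   # 'hello'
```

A process that picks the first codec sees an extra character at the start
of the text; one that picks the second does not, which is exactly the kind
of divergence at issue here.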
> It would
> have been better if the BOM hadn't been overloaded with the "zero-width
> non-breaking space" semantic-- if it had just been considered a no-op--
This is a suggestion I once made: Unicode file/stream content markers that
have no other semantics. But that suggestion was turned down as contrary to
the Unicode spirit. The use of BOMs then just shows how dangerous it is for
Unicode to neglect users' needs and concerns: the needed features will
simply appear anyway, but outside Unicode. And if Unicode then tries to
patch things up, poor constructions, such as the BOM, will appear.
Hans Aberg