From: Francois Yergeau (FYergeau@alis.com)
Date: Mon Sep 29 2003 - 13:21:57 EDT
Jill Ramonsky wrote:
> First point - if no information is present, assume "us-ascii".
> Sounds extremely sensible to me.
Sounds very misguided to me.
> ASCII is the intersection of Latin-1, UTF-8, and various other
> commonly used encodings.
How does that make it more likely that guessing ASCII would be correct?
> Moreover, in order to even read the name of the encoding, the
> name of the encoding must have itself been encoded in something.
See Appendix F of the XML spec for how you can do much better than assuming
ASCII to read the encoding name.
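
As a rough illustration of the Appendix F idea, here is a minimal Python sketch that looks at the first few bytes of an entity to pick an encoding family before the declaration is read. The function name `guess_encoding_family` is hypothetical, the returned labels are Python codec names chosen for illustration (`cp037` merely stands in for "some flavor of EBCDIC"), and the unusual UCS-4 octet orders listed in the appendix are left out.

```python
def guess_encoding_family(data: bytes) -> str:
    """Guess an encoding family from the first bytes, loosely after XML 1.0 Appendix F.

    Simplified sketch: unusual UCS-4 octet orders (2143, 3412) are not handled.
    """
    head = data[:4]

    # Byte order marks. The 4-byte marks are tested first so that
    # FF FE 00 00 (UCS-4 little-endian) is not mistaken for UTF-16 LE.
    if head == b"\x00\x00\xfe\xff":
        return "utf-32-be"
    if head == b"\xff\xfe\x00\x00":
        return "utf-32-le"
    if head[:2] == b"\xfe\xff":
        return "utf-16-be"
    if head[:2] == b"\xff\xfe":
        return "utf-16-le"
    if head[:3] == b"\xef\xbb\xbf":
        return "utf-8"

    # No byte order mark: look at how the leading "<?" or "<?xm" is encoded.
    if head == b"\x00\x00\x00<":
        return "utf-32-be"
    if head == b"<\x00\x00\x00":
        return "utf-32-le"
    if head == b"\x00<\x00?":
        return "utf-16-be"
    if head == b"<\x00?\x00":
        return "utf-16-le"
    if head == b"\x4c\x6f\xa7\x94":
        # "<?xm" in EBCDIC; cp037 is an assumed stand-in for the exact flavor.
        return "cp037"

    # Anything else, including a plain "<?xm": an ASCII-compatible encoding.
    # With no declaration to say otherwise, XML takes this to be UTF-8.
    return "utf-8"
```

For ASCII-compatible input the caller would still go on to read the encoding declaration, if there is one, and switch to whatever it names; the sketch only covers the first step.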
> It makes sense to me to assume the absolute minimum. If you want
> more than the minimum, declare your encoding. This should not be
> a problem.
It makes much more sense to me to assume UTF-8, as XML does. If you want
*less* than that, declare your encoding. This is not a problem.
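
To make the point concrete, a small Python sketch (the helper `decode_undeclared` is hypothetical, not part of any specification): ASCII-only data decodes identically under either default, so assuming UTF-8 costs ASCII senders nothing, while UTF-8 data fails outright under an ASCII default.

```python
def decode_undeclared(data: bytes, default: str = "utf-8") -> str:
    """Decode data that carries no encoding declaration, using the given default."""
    return data.decode(default)

plain = b"plain ASCII text"
accented = "François".encode("utf-8")

# ASCII-only input decodes identically under either default.
assert decode_undeclared(plain, "utf-8") == decode_undeclared(plain, "ascii")

# Non-ASCII UTF-8 input survives only under the UTF-8 default.
assert decode_undeclared(accented) == "François"
try:
    decode_undeclared(accented, "ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
```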
-- François