From: John Cowan (cowan@mercury.ccil.org)
Date: Fri Feb 21 2003 - 21:59:54 EST
Markus Scherer scripsit:
> Yes. Any reasonable SCSU encoder will stay in the ASCII-compatible
> single-byte mode until it sees a character from beyond Latin-1. Thus
> the encoding declaration will be ASCII-readable.
Indeed, there is no requirement that the encoding declaration be
ASCII-readable. A parser can perfectly well handle EBCDIC or other
non-ASCII-compatible encodings, provided a proper declaration expressed
in that encoding is present.
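A minimal sketch of how a parser might bootstrap from a non-ASCII declaration, assuming an XML-style `<?xml ... encoding="..."?>` declaration at the start of the data. The function name `sniff_declared_encoding` is hypothetical, Python's `cp037` codec stands in for "some flavor of EBCDIC", and the 200-byte window is an arbitrary choice for illustration:

```python
import re

# "<?xm" as it appears in EBCDIC (assumption: cp037 is the flavor in use).
EBCDIC_PREFIX = bytes([0x4C, 0x6F, 0xA7, 0x94])

def sniff_declared_encoding(data: bytes):
    """Return the encoding label named in the declaration, or None.

    Hypothetical helper: it guesses the encoding family from the first
    four bytes, then decodes just enough to read the declaration.
    """
    if data.startswith(b"<?xm"):
        bootstrap = "ascii"      # ASCII-compatible family
    elif data.startswith(EBCDIC_PREFIX):
        bootstrap = "cp037"      # EBCDIC family
    else:
        return None              # no recognizable declaration
    # Decode only a short prefix; bytes the bootstrap codec cannot map
    # are replaced rather than raising an error.
    head = data[:200].decode(bootstrap, errors="replace")
    match = re.search(
        r'encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']', head
    )
    return match.group(1) if match else None

doc = '<?xml version="1.0" encoding="ebcdic-cp-us"?><greeting/>'
print(sniff_declared_encoding(doc.encode("cp037")))
```

The point is only that the declaration is read in the same encoding family as the rest of the document, so it need not be ASCII-readable.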
To be sure, some encodings, like US-BSCII, are problematic. US-BSCII is
the same as US-ASCII except that 0x41 is B and 0x42 is A; the trouble
being of course that the string "US-ASCII" encoded in US-ASCII uses the
same bytes as the string "US-BSCII" encoded in US-BSCII. But such a thing
is not likely to happen except through perversity such as this.
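The collision can be checked directly. This is a small sketch in which `bscii_encode` and `bscii_decode` are hypothetical helpers implementing US-BSCII exactly as described above (US-ASCII with the meanings of 0x41 and 0x42 swapped):

```python
# Swapping "A" and "B" before or after an ASCII step models US-BSCII,
# where byte 0x41 means "B" and byte 0x42 means "A".
_SWAP = str.maketrans("AB", "BA")

def bscii_encode(text: str) -> bytes:
    """Encode text in the (fictional) US-BSCII encoding."""
    return text.translate(_SWAP).encode("ascii")

def bscii_decode(data: bytes) -> str:
    """Decode US-BSCII bytes back to text."""
    return data.decode("ascii").translate(_SWAP)

as_ascii = "US-ASCII".encode("ascii")
as_bscii = bscii_encode("US-BSCII")

# Both labels produce the same byte sequence, so the declaration alone
# cannot tell a parser which of the two encodings is in use.
print(as_ascii == as_bscii)
print(bscii_decode(as_ascii))
```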
-- 
John Cowan  http://www.ccil.org/~cowan  cowan@ccil.org
To say that Bilbo's breath was taken away is no description at all.
There are no words left to express his staggerment, since Men changed
the language that they learned of elves in the days when all the world
was wonderful.  --_The Hobbit_