From: Tex Texin (tex@i18nguy.com)
Date: Sat Nov 02 2002 - 19:16:11 EST
John,
I understand the flexibility of XML to use different encodings.
However, I didn't realize that parsers were to allow for the possibility
of different signatures.
So a parser has to worry about SCSU signatures, etc.
Whereas XML is so fussy about which characters it accepts, I am
surprised at its flexibility for signatures.
So when the parser gets JOECODE, I can understand it ignoring the signature
and skipping autodetection, but exactly how does it find the first "<"?
It must have to try all of the encodings known to it... ugh.
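
To make the "try everything" worry concrete, here is a rough Python sketch of
what I imagine such a parser doing. It is only an illustration under my own
assumptions, not what the XML spec mandates: the names SIGNATURES,
KNOWN_ENCODINGS and guess_encoding are made up, and the list of encodings is
an arbitrary example of what one processor might know. It checks the Unicode
signatures first, then tests whether the document starts with "<?xml" as
encoded in each known encoding.

```python
import codecs

# Unicode signatures, checked first. The UTF-32 marks come before the
# UTF-16 ones because the UTF-32 LE mark begins with the UTF-16 LE mark.
SIGNATURES = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# Hypothetical list of encodings this particular processor understands.
# cp037 is an EBCDIC code page, included to show a non-ASCII layout.
KNOWN_ENCODINGS = [
    "utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le", "cp037",
]


def guess_encoding(data: bytes):
    """Return (encoding name, signature length), or (None, 0) if unknown."""
    # Step 1: look for a known signature at the start of the stream.
    for bom, name in SIGNATURES:
        if data.startswith(bom):
            return name, len(bom)
    # Step 2: no signature, so try "<?xml" in every known encoding.
    for name in KNOWN_ENCODINGS:
        if data.startswith("<?xml".encode(name)):
            return name, 0
    # Step 3: give up; the caller needs external information.
    return None, 0
```

A "utf-8" result without a signature really only means "some ASCII-compatible
encoding"; the parser would still have to read the encoding declaration to
learn which one. A vendor encoding like JOECODE would need its own entries in
both tables.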
tex
John Cowan wrote:
>
> Tex Texin scripsit:
>
> > However, that leaves open the question whether only the Unicode
> > transform signatures are acceptable or other signatures are also
> > allowed. So if a vendor defines a code page, and defines a signature
> > (perhaps mapping BOM/ZWNSP specifically to some code point or byte
> > string) does that then become acceptable?
>
> IMHO yes. XML documents are not *required* to be in one of the character
> sets that can be automatically detected by the methods of Appendix F.
> You can encode your documents in (hypothetical) JOECODE, which uses leading
> 00 as a signature (ignored by the XML parser) and then A=01, B=02, C=03, and so on.
> Autodetection will not work here, but it is perfectly conformant to have
> a processor that understands only UTF-8, UTF-16, and JOECODE.
>
> Of course some encodings, such as US-BSCII, which looks just like US-ASCII
> except that A=0x42, B=0x41, a=0x62, b=0x61, will cause problems for anybody.
> :-)
>
> I am a member of, but not speaking for, the XML Core WG.
>
> --
> John Cowan jcowan@reutershealth.com www.ccil.org/~cowan www.reutershealth.com
> "The competent programmer is fully aware of the strictly limited size of his own
> skull; therefore he approaches the programming task in full humility, and among
> other things he avoids clever tricks like the plague." --Edsger Dijkstra
--
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
Xen Master                          http://www.i18nGuy.com
XenCraft                            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------