In a message dated 2001-09-03 18:02:09 Pacific Daylight Time,
viranga@mds.rmit.edu.au writes:
> [XML], however, provides little information on existing CESs already
> in use for the interchange of Japanese characters. Such CESs are
> allowed as mere options among many others. Furthermore, [XML] says
> nothing about the appropriate CESs for each protocol (e.g. SMTP or
> HTTP) and those for information exchange files.
>
> The mapping between such existing CESs and [ISO/IEC10646]/[Unicode
> 3.0] is not specified either. Some mutually different conversions are
> in use, and thus different XML processors may emit different outputs.

I didn't think it was the purpose of the W3C or the XML specification to
define mapping tables between Unicode/10646 and other encodings. XML itself
supports the use of just about any ASCII- or EBCDIC-compatible encoding you
like, as long as you name it in the encoding declaration at the top of the
document. Whether it gets interpreted correctly, or at all, is up to the XML
processor. Not every processor will necessarily understand every encoding.

If there are two or more different mappings between Unicode/10646 and some
other encoding -- say, JIS X 0208 -- then different XML processors certainly
may emit different outputs. That is not XML's fault, and it is not Unicode's
fault either. Unicode provides mapping tables to a wide variety of
encodings. I would use those if it were up to me.
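
As one illustration of such divergent mappings, here is a small sketch using two codecs that ship with Python, "shift_jis" and "cp932". Both decode the Shift_JIS byte pair for the JIS X 0208 wave dash (row 1, cell 33), but I would expect them to land on different Unicode code points. Picking these two codecs is my own choice for the example; they stand in for any pair of conversion tables that disagree.

```python
import unicodedata

# Shift_JIS byte pair for JIS X 0208 row 1, cell 33 (the wave dash).
raw = b"\x81\x60"

# Decode the same bytes with two different conversion tables
# and report which Unicode character each one produces.
for codec in ("shift_jis", "cp932"):
    ch = raw.decode(codec)
    print(codec, f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Two XML processors built on these two tables would hand the application different characters for the same input bytes, which is exactly the situation described in the quoted text.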

-Doug Ewell
Fullerton, California