Alain stated:
> But I am happy that you are in favour of a practical and safe solution for
> 8 bit character sets. The ISO/IEC 8859-15 (Latin 0) is the safest one, the
> only that is safe, in fact, and very practical, infinitely more practical
> than any other solution. Any other one so far, even with tagging, is
> absolutely not reliable.
>
> Don't forget EBCDIC to Windows to ISO-8-bits to UNICODE and back,
> preserving data integrity at all steps. Only Latin 0 will be able to
> achieve this neatly and cleanly, in a standard way.
>
Unfortunately, this is *not* the case.

The most widely used EBCDIC code pages (CP037 and CP500) have already
had their repertoires carefully matched to ISO 8859-1. Round-tripping
to 8859-15 will result in EBCDIC characters that do not convert to
8859-15 and 8859-15 characters that do not convert to EBCDIC. Pushing
for widespread switching from 8859-1 to 8859-15 will *worsen* data
integrity for EBCDIC conversions. Furthermore, these 8859-1-converged
EBCDIC code pages (there are others besides CP037 and CP500) have
no code values open to add the Euro. They are in the exact same bind
that the 8859-x series is in: no provision for incremental expansion
to deal with something like the Euro. The Euro has the potential to
cause as much mischief in the EBCDIC world as ASCII bracket characters
used to.
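
The repertoire mismatch described above can be checked with a rough
sketch. It assumes that Python's bundled codecs (`cp037`, `cp500`,
`latin_1`, `iso8859_15`) are a fair stand-in for the code pages under
discussion; the helper name `repertoire` is made up for this
illustration and is not part of any library.

```python
def repertoire(codec):
    """Set of characters reachable by decoding each single byte with `codec`."""
    chars = set()
    for value in range(256):
        try:
            chars.add(bytes([value]).decode(codec))
        except UnicodeDecodeError:
            pass  # byte is unassigned in this code page
    return chars


latin1 = repertoire("latin_1")      # ISO 8859-1
latin9 = repertoire("iso8859_15")   # ISO 8859-15 ("Latin 0")
cp037 = repertoire("cp037")         # EBCDIC, US/Canada
cp500 = repertoire("cp500")         # EBCDIC, International

# Characters that cannot make the trip in one direction or the other.
ebcdic_only = sorted(cp037 - latin9)
latin9_only = sorted(latin9 - cp037)


def code_points(chars):
    """Format characters as U+XXXX so the output is console-safe."""
    return " ".join(f"U+{ord(c):04X}" for c in chars)


print("CP037 matches 8859-1:", cp037 == latin1)
print("CP500 matches 8859-1:", cp500 == latin1)
print("In CP037 but not in 8859-15:", code_points(ebcdic_only))
print("In 8859-15 but not in CP037:", code_points(latin9_only))
```

Comparing whole repertoires rather than spot-checking a few characters
shows both directions of loss at once: whatever lands in either
difference set has nowhere to go in the other encoding.
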
And Windows 1252 has additional characters that are not in 8859-15.
Yes, the EURO SIGN can be converted between the newest version of
Windows 1252 and 8859-15, but there are other characters in 1252
which will still not convert to either 8859-x or to common EBCDIC
code pages.
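
The same kind of check works for Windows 1252. This is only a sketch,
assuming Python's `cp1252`, `iso8859_15` and `cp037` codecs reflect the
tables in question (Python's `cp1252` includes the Euro at 0x80). The
helpers `repertoire` and `survives` are hypothetical names used for
illustration.

```python
def repertoire(codec):
    """Set of characters reachable by decoding each single byte with `codec`."""
    chars = set()
    for value in range(256):
        try:
            chars.add(bytes([value]).decode(codec))
        except UnicodeDecodeError:
            pass  # byte is unassigned in this code page (cp1252 has a few)
    return chars


def survives(text, codec):
    """True if `text` can be encoded with `codec` and decoded back unchanged."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeEncodeError:
        return False


cp1252 = repertoire("cp1252")
latin9 = repertoire("iso8859_15")
cp037 = repertoire("cp037")

# Windows 1252 characters with no home in 8859-15 or in CP037.
not_in_latin9 = sorted(cp1252 - latin9)
not_in_ebcdic = sorted(cp1252 - cp037)

euro = "\u20ac"                 # EURO SIGN
quoted = "\u201cquoted\u201d"   # curly double quotes from Windows 1252

print("1252 characters missing from 8859-15:", len(not_in_latin9))
print("1252 characters missing from CP037:", len(not_in_ebcdic))
print("Euro survives 8859-15:", survives(euro, "iso8859_15"))
print("Euro survives CP037:", survives(euro, "cp037"))
print("Curly quotes survive 8859-15:", survives(quoted, "iso8859_15"))
```

The `survives` check uses strict encoding on purpose: a converter that
silently substitutes `?` would hide exactly the data loss at issue
here.
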
This is hardly a recipe for neat and clean preservation of data
integrity.

I see the 8-bit standards world brewing up a world of hurt.

Unicode, anyone?

--Ken