From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Thu May 15 2003 - 10:19:19 EDT
Yael Aharon wrote:
> I see now why you thought the question was odd. I actually
> meant to ask about the various iso (e.g. 8859 variants) and
> windows character encodings.
OK, but those encodings do not "conform to Unicode specs": they are simply
different encodings, which can be *converted* to Unicode because Unicode
contains all the characters that they contain.
That said, the answer to your question is "yes" for all ISO 8859 and Windows
encodings. It is "no" for most DOS encodings (which are still
sometimes used in Windows) and for some Japanese encodings (also used in
Windows, e.g. on the Internet or in e-mail).
You can check this from the mapping files found here:
http://www.unicode.org/Public/MAPPINGS
Each data line in those files maps a character code in the 3rd-party
encoding (1st column) to a Unicode code point (2nd column):
...
0x41 0x0041 # LATIN CAPITAL LETTER A
...
0xC7 0x0627 # ARABIC LETTER ALEF
...
You could do a quick script to check whether any 3rd-party character in
range 0x00 to 0x7F maps to a different Unicode value.
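A minimal sketch of such a script, in Python, might look like the following. It assumes the simple two-column layout shown above (code in the 1st column, Unicode value in the 2nd, "#" starting a comment). Some files under that directory use other layouts, so treat this as a starting point rather than a general parser. The function name find_ascii_mismatches is made up for this example.

```python
def find_ascii_mismatches(lines):
    """Return (code, unicode_value) pairs for codes in 0x00-0x7F
    whose Unicode value differs from the code itself.

    Assumes two-column mapping lines such as:
        0x41    0x0041  # LATIN CAPITAL LETTER A
    """
    mismatches = []
    for line in lines:
        # Drop everything after '#', then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, comment-only line, or undefined entry
        try:
            code = int(fields[0], 16)           # accepts the "0x" prefix
            unicode_value = int(fields[1], 16)
        except ValueError:
            continue  # not a plain hex pair; skip it
        if code <= 0x7F and code != unicode_value:
            mismatches.append((code, unicode_value))
    return mismatches


# The two sample lines quoted above: neither is a mismatch in 0x00-0x7F,
# so nothing is printed for them.
sample = [
    "0x41\t0x0041\t# LATIN CAPITAL LETTER A",
    "0xC7\t0x0627\t# ARABIC LETTER ALEF",
]
for code, unicode_value in find_ascii_mismatches(sample):
    print(f"0x{code:02X} -> U+{unicode_value:04X}")
```

To check a real file, download it, open it in text mode and pass the file object to the function. An empty result means every code in 0x00 to 0x7F listed in that file maps to the same Unicode value.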
_ Marco