On Fri, 10 Dec 1999, Mark E. Davis wrote:
> 1. You give no values for the range 00 to 7F. As it stands, this means that
> these characters are undefined in the standard, and would map to FFFD. I
> suspect, on the other hand, from the text that these are really ASCII clones.
> If so, you should define them explicitly, as in
> ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-6.TXT.
Thank you very much for noting this. The standard defines those to be the
same as the ISO 646 characters, which is of course ASCII. I will correct
the tables.
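A rough sketch of how the explicit entries for 00 to 7F could be
generated, assuming the lower half really is identical to ASCII. The
function name and the tab-separated line layout are my own assumptions,
loosely modeled on the Unicode mapping files; they are not taken from the
ISIRI 3342 tables themselves.

```python
import unicodedata


def ascii_mapping_lines():
    """Return one mapping line per byte in 0x00..0x7F.

    Each byte is mapped to the Unicode code point with the same value.
    The line layout (byte, code point, comment) is an assumption.
    """
    lines = []
    for byte in range(0x80):
        # Control characters have no Unicode name, so fall back to a label.
        name = unicodedata.name(chr(byte), "<control>")
        lines.append(f"0x{byte:02X}\t0x{byte:04X}\t# {name}")
    return lines


if __name__ == "__main__":
    print("\n".join(ascii_mapping_lines()))
```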
> 2. However, if this is done, then the situation is still quite odd. The
> Persian standard would then not round-trip to and from Unicode since both 7F
> and FF would map to 007F. I am curious just as to how a Persian DELETE would
> differ from an ASCII one.
Yes, there are problems with round-tripping. Not only these: there are
also copies of SPACE, COLON, PARENTHESIS, etc. in the upper half
(0x80--0xFF). I agree that this is not good practice, but it was done
to simplify the BIDI algorithm by avoiding any Neutral character
type. I think this is against the guidelines of ISO 2022, which
the standard claims to be based on, but I'm not sure.
BTW, of course the ISIRI-3342 designers did not know they should design
their character set so that it round-trips with Unicode. ;)
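To make the round-trip problem concrete, here is a minimal sketch that
finds bytes sharing a Unicode code point. The table is a hypothetical
partial one: it contains only the ASCII half plus the 0xFF entry mentioned
above, since the positions of the other duplicates are not given here. The
function names are mine.

```python
from collections import defaultdict


def find_collisions(to_unicode):
    """Return {code_point: [bytes]} for code points reached from >1 byte."""
    by_code_point = defaultdict(list)
    for byte, code_point in sorted(to_unicode.items()):
        by_code_point[code_point].append(byte)
    return {cp: bs for cp, bs in by_code_point.items() if len(bs) > 1}


def round_trip(byte, to_unicode):
    """Convert a byte to Unicode and back, preferring the lowest byte."""
    from_unicode = {}
    for b, cp in sorted(to_unicode.items()):
        from_unicode.setdefault(cp, b)  # first (lowest) byte wins
    return from_unicode[to_unicode[byte]]


# Hypothetical partial table: ASCII half plus the second DELETE at 0xFF.
table = {b: b for b in range(0x80)}
table[0xFF] = 0x007F

collisions = find_collisions(table)
```

With this table, 0xFF comes back as 0x7F after a round trip, so the
original byte is lost unless the duplicates are mapped somewhere distinct.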
Any suggestions for round-tripping are welcome, but there are even more
problems. For correct conversion of ISIRI-3342 to Unicode, a BIDI and a
reverse-BIDI algorithm are also required.
--Roozbeh