Re: 28th IUC paper - Tamil Unicode New

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sat Aug 20 2005 - 05:38:10 CDT


    From: "Richard Wordingham" <richard.wordingham@ntlworld.com>

    > I can understand the gripes about 'level-2' v. 'level-1' implementation,
    > though. I find it distinctly irritating that the newly added Tamil
    > consonant SHA U+0BB6 won't combine with vowels in Windows XP, and seems
    > unlikely to unless one buys otherwise unneeded word processing packages.

    I understand that too. If one can demonstrate that Tamil is handled
    correctly by a level-1-only implementation (yes, this can be tested using
    PUAs), that will establish the correct processing rules for Tamil as it
    is currently encoded in Unicode.

    So it's up to Unicode to verify that processing based on the current
    standard encoding is consistent with a level-1 implementation based on
    PUAs. This could be tested by using a mapping table between the two
    representations and comparing the results of the level-2 implementation
    on standard Unicode with those of the level-1 implementation on the
    "New Tamil" PUA block...
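
    As a rough illustration of such a mapping-table test, here is a minimal
    Python sketch. The PUA code points (U+E000 and up) and the names
    STD_TO_PUA, to_pua, to_standard and round_trips are invented for the
    example; no actual "New Tamil" assignment is implied, and only a handful
    of syllables are covered. It checks just that the mapping round-trips,
    which would be a precondition for comparing the two implementations.

```python
# Hypothetical table: standard Unicode Tamil sequence -> one PUA code point.
# The PUA values are made up for this sketch; they are not a real proposal.
STD_TO_PUA = {
    "\u0B95": "\uE000",        # KA
    "\u0B95\u0BBE": "\uE001",  # KA + vowel sign AA
    "\u0B95\u0BBF": "\uE002",  # KA + vowel sign I
    "\u0B95\u0BCD": "\uE003",  # KA + virama (pulli)
    "\u0BB6": "\uE010",        # SHA
    "\u0BB6\u0BBE": "\uE011",  # SHA + vowel sign AA
}
PUA_TO_STD = {pua: std for std, pua in STD_TO_PUA.items()}
MAX_KEY = max(len(key) for key in STD_TO_PUA)


def to_pua(text):
    """Convert standard Unicode to the hypothetical PUA form (longest match first)."""
    out = []
    i = 0
    while i < len(text):
        for n in range(MAX_KEY, 0, -1):
            chunk = text[i:i + n]
            if chunk in STD_TO_PUA:
                out.append(STD_TO_PUA[chunk])
                i += len(chunk)
                break
        else:
            # Not in the table: pass the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


def to_standard(text):
    """Convert the hypothetical PUA form back to standard Unicode."""
    return "".join(PUA_TO_STD.get(ch, ch) for ch in text)


def round_trips(text):
    """True if standard -> PUA -> standard returns the original text."""
    return to_standard(to_pua(text)) == text
```

    A real test would run the same sample text through both rendering or
    collation paths and compare the results; the round-trip check above only
    guards against a lossy table.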

    But one must also verify that this is consistent with the Indian ISCII
    standard for Tamil (there may be a few quirks in exceptional cases that
    do not normally occur in human language, so they won't matter there).

    Another option would be to develop a "New Tamil" charset for testing and
    to establish a mapping table between it and ISCII (this would not require
    allocating PUAs). Once this works, one can define the correct mapping table
    between "New Tamil" and standard Unicode (without using PUAs!).

    Although I don't like the idea of publishing new 8-bit charset standards, it
    certainly helps when it reduces the number of cases to test and support in
    order to handle a script or language correctly.


