RE: Suggestions in Unicode Indic FAQ

From: Keyur Shroff (keyur_shroff@yahoo.com)
Date: Wed Jan 29 2003 - 09:29:09 EST


    --- Marco Cimarosti <marco.cimarosti@essetre.it> wrote:

    > Why not represent INV with a double ZWJ? E.g.:
    >
    > ISCII              Unicode
    > KA halant INV      KA virama ZWJ ZWJ
    > RA halant INV      RA virama ZWJ ZWJ (i.e., repha)
    > INV halant RA      ZWJ ZWJ virama RA (RAsub)
    >
    > This has the advantage that the most common sequences will work OK also
    > on old display engines implemented *before* the double-ZWJ convention is
    > introduced.
    >
    > E.g., sequence "KA virama ZWJ ZWJ" works well also on an old engine, for
    > the simple reason that the first ZWJ is enough to do the work, and the
    > second ZWJ is invisible.
    >
    > Of course, an old engine will still display a <RA[eyelash]> for <RA
    > virama ZWJ ZWJ>, but that is not worse than displaying <RA+virama>
    > followed by a white box, which is what would happen with your new INV
    > character.

    Certainly. This looks more promising, because even RAsub has two alternate
    forms: one is used with consonants such as KA, KHA and GHA, and the other
    with consonants such as TTA, TTHA, DDA and DDHA. With your ZWJ-based
    scheme we can insert as many ZWJs as we wish to produce all the possible
    alternate forms!
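
    As a rough illustration of the mapping quoted above, here is a minimal
    sketch. It works on symbolic token names ("KA", "halant", "INV") rather
    than real ISCII bytes, and the function name and the tiny consonant table
    are hypothetical, chosen only for this example.

```python
# Sketch of the proposed "INV -> double ZWJ" mapping (symbolic tokens only).
ZWJ = "\u200D"      # ZERO WIDTH JOINER
VIRAMA = "\u094D"   # DEVANAGARI SIGN VIRAMA
# Hypothetical, deliberately tiny consonant table for the example.
CONSONANTS = {"KA": "\u0915", "RA": "\u0930"}


def iscii_tokens_to_unicode(tokens):
    """Convert a list of symbolic ISCII tokens to a Unicode string."""
    out = []
    i = 0
    while i < len(tokens):
        window = tokens[i:i + 3]
        if (len(window) == 3 and window[0] in CONSONANTS
                and window[1] == "halant" and window[2] == "INV"):
            # consonant halant INV -> consonant virama ZWJ ZWJ
            out.append(CONSONANTS[window[0]] + VIRAMA + ZWJ + ZWJ)
            i += 3
        elif (len(window) == 3 and window[0] == "INV"
                and window[1] == "halant" and window[2] in CONSONANTS):
            # INV halant consonant -> ZWJ ZWJ virama consonant
            out.append(ZWJ + ZWJ + VIRAMA + CONSONANTS[window[2]])
            i += 3
        elif tokens[i] in CONSONANTS:
            out.append(CONSONANTS[tokens[i]])
            i += 1
        elif tokens[i] == "halant":
            out.append(VIRAMA)
            i += 1
        else:
            raise ValueError(f"unsupported token sequence at {i}: {tokens[i]!r}")
    return "".join(out)
```

    For instance, iscii_tokens_to_unicode(["RA", "halant", "INV"]) yields RA,
    virama and two ZWJs, which an engine unaware of the convention would
    still treat as RA virama ZWJ followed by an invisible character.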

    But sometimes a user may want to display these symbols in two different
    ways: with a dotted circle and without one. An example would be RAsup on
    top of a dotted circle versus RAsup on top of a space character. The
    current use of the space character to suppress the dotted circle is
    really painful and may create problems in determining language and
    syllable boundaries. The main problem is that, unlike ZWJ, ZWNJ and the
    dotted circle, the space character falls within the range of another
    important script, Latin. This can affect any text processing that uses
    Unicode characters to find language boundaries. A single INV character
    would solve all of these problems at once. We could put it in the
    "consonant" class, which would help text-processing applications.
    Moreover, it will be difficult to provide upward compatibility in every
    case, even though it is desirable. Implementations of Unicode will need
    to be upgraded with every introduction of new glyphs or rules; otherwise
    applications will have to explicitly declare the version of Unicode they
    implement.
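
    The boundary concern with the space character can be made concrete with
    a small sketch. It assumes a naive classifier that assigns scripts purely
    by code point range (Basic Latin versus the Devanagari block) and treats
    ZWJ, ZWNJ and the dotted circle as neutral. The function names are
    hypothetical, and a segmenter built on the Unicode Script property would
    classify these characters differently.

```python
# Naive, range-based script segmentation (an assumption for illustration).
NEUTRAL = {0x200C, 0x200D, 0x25CC}  # ZWNJ, ZWJ, DOTTED CIRCLE


def naive_script(ch):
    """Classify a character by code point range only."""
    cp = ord(ch)
    if cp in NEUTRAL:
        return None                 # joins whatever run surrounds it
    if cp <= 0x007F:
        return "Latin"              # Basic Latin block, includes U+0020 SPACE
    if 0x0900 <= cp <= 0x097F:
        return "Devanagari"
    return "Other"


def script_runs(text):
    """Split text into (script, substring) runs using naive_script."""
    runs = []
    for ch in text:
        script = naive_script(ch)
        if script is None:
            # Neutral character: extend the current run, or start a pending one.
            if runs:
                runs[-1][1] += ch
            else:
                runs.append([None, ch])
        elif runs and runs[-1][0] in (None, script):
            # Same script, or a pending neutral-only run that adopts this script.
            runs[-1][0] = script
            runs[-1][1] += ch
        else:
            runs.append([script, ch])
    return [(script, chunk) for script, chunk in runs]


# KA, base, vowel sign I, KA: once with SPACE as the base, once with dotted circle.
with_space = "\u0915\u0020\u093F\u0915"
with_circle = "\u0915\u25CC\u093F\u0915"
```

    Under these assumptions, script_runs(with_space) breaks the text into a
    Devanagari run, a Latin run and another Devanagari run, while
    script_runs(with_circle) keeps a single Devanagari run.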

    - Keyur




    This archive was generated by hypermail 2.1.5 : Wed Jan 29 2003 - 10:09:26 EST