Re: Numbered consonants in Tamil script abugida series

From: Richard Wordingham (richard.wordingham@ntlworld.com)
Date: Tue Jun 28 2005 - 16:12:03 CDT


     Naga Ganesan wrote:

    >Richard Wordingham asked: << How are they combined with the vowel? Is it C
    >+ V + subscript/superscript digit in Unicode? >>

    > I always think of Tamil script books in terms of Venn diagrams. Tamil books
    > with no Tamil Grantha letters dwell in the innermost circle (the letters
    > defined in the Tolkaappiyam and Nannuul grammars). There, pure Tamil
    > letters only:
    > க், ங், ச், ஞ், ட், ண், த், ந், ப், ம், ய், ர், ல், வ், ழ், ள், ற், ன்

    > Next is the circle in most common use, something like the Unicode Tamil
    > code chart plus additions for anuswaram and vocalic R. Of course, the
    > outermost circle in the Venn chart is the one with numbered consonants
    > (2, 3, 4 as super- or subscripts), vocalic RR, vocalic L, vocalic LL. This
    > ensures one-to-one round-trip transliteration between other Indic scripts
    > and Tamil script. Stripping 2, 3, 4 will yield the next inner circle. Then
    > there are well-defined rules to convert all Grantha consonants into the 18
    > "Pure Tamil" consonants to reach the innermost circle of the Venn diagram,
    > if the user needs/desires it.
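
    The "stripping 2, 3, 4" step described above can be sketched as follows.
    This is only an illustration: it assumes the numbers are typed as the
    ordinary Unicode superscript and subscript digits, which the quoted text
    does not confirm, and the name strip_varga_digits is made up here.

```python
# Assumption: varga numbers appear as plain superscript/subscript digits.
# U+00B2 U+00B3 U+2074 are superscript 2, 3, 4; U+2082..U+2084 are subscripts.
VARGA_DIGITS = "\u00B2\u00B3\u2074\u2082\u2083\u2084"

# str.translate deletes every character that maps to None.
_DELETE_TABLE = {ord(ch): None for ch in VARGA_DIGITS}


def strip_varga_digits(text: str) -> str:
    """Drop the numbering, leaving the next inner circle of the Venn diagram."""
    return text.translate(_DELETE_TABLE)


# TAMIL LETTER KA followed by a subscript 3, then TAMIL VOWEL SIGN AA.
numbered = "\u0B95\u2083\u0BBE"
plain = strip_varga_digits(numbered)
```

    Mapping the Grantha consonants down to the 18 pure Tamil consonants would
    be a separate rule table, which is not attempted here.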

    > Coming to your question of how the numbered consonants and their
    > corresponding abugida series work, please check the Vaishnava slokams
    > page at:
    > http://www.prapatti.com/slokas/slokasbyname.html

    Thank you for the source of examples. I do have a number of comments to
    make:

    1. Do vocalic R / RR / L / LL round trip? Vocalic R seems to be merged with
    the sequence <r><u>, and so will not *round* trip.

    2. I couldn't find any examples of anusvara in Tamil - in all the examples I
    looked at, it was simply written as ம்.

    3. Those texts have Tamil Grantha anusvara (two dots). I did not notice any
    evidence that the texts labelled as Tamil were not in the Tamil script. (I
    do not regard having extra symbols as evidence - I believe English and
    Polish are written in the same script.) It seems that someone ought to
    propose TAMIL GRANTHA ANUSVARA for the Tamil block (or TAMIL TRUE ANUSVARA
    if the former creates a name conflict), perhaps with the annotation 'the
    real anusvara'. (I'm not sure that that needs to be a combining mark
    either. It might be simpler all round if this were tagged Lo.)

    4. The subscript '2', '3' and '4' defy useful abstract analysis. They
    follow the connected glyph portion containing the consonant, preceding the
    glyph of VOWEL SIGN AA or AU LENGTH MARK. There seems to be no way to
    represent them in combination with those glyphs using Unicode! Can anyone
    see how (short of burying our heads in the sand) we can avoid adding at
    least combining marks TAMIL VARGA MARK TWO, TAMIL VARGA MARK THREE and TAMIL
    VARGA MARK FOUR? <vowel, varga mark> and <varga mark, vowel> will be
    canonically inequivalent. The order <varga mark, vowel> seems more logical,
    but <vowel, varga mark> gives a renderer less re-ordering to do. Ideally a
    renderer and a collating sequence should decline to distinguish the two
    orders.

    Should there also be a TAMIL VARGA MARK ONE? I thought I'd read of
    superscript 1 being used for completeness. It makes sense intervocalically
    as a way of saying neither voiced nor geminate.

    Is there any reason for unifying these varga marks with numeric tone marks?

    5. I still don't know how these subscripts or superscripts should affect
    collation!
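
    The inequivalence raised in point 4 can be illustrated with a short sketch.
    No varga mark is encoded, so SUBSCRIPT TWO (U+2082) stands in for the
    proposed TAMIL VARGA MARK TWO; that substitution, and the helper name
    fold_order, are assumptions made only for illustration. Because both the
    vowel sign and the stand-in have canonical combining class 0, normalization
    leaves the two orders distinct, so a process that wants to treat them alike
    has to fold them itself.

```python
import unicodedata

KA = "\u0B95"        # TAMIL LETTER KA
SIGN_AA = "\u0BBE"   # TAMIL VOWEL SIGN AA
MARK_2 = "\u2082"    # stand-in only: SUBSCRIPT TWO, not a real varga mark

# Tamil dependent vowel signs plus AU LENGTH MARK (U+0BD7).
VOWEL_SIGNS = set(
    "\u0BBE\u0BBF\u0BC0\u0BC1\u0BC2\u0BC6\u0BC7\u0BC8\u0BCA\u0BCB\u0BCC\u0BD7"
)
VARGA_STANDINS = set("\u2082\u2083\u2084")

vowel_first = KA + SIGN_AA + MARK_2
mark_first = KA + MARK_2 + SIGN_AA

# Canonical normalization only reorders marks with non-zero combining class.
same_after_nfd = (
    unicodedata.normalize("NFD", vowel_first)
    == unicodedata.normalize("NFD", mark_first)
)


def fold_order(text: str) -> str:
    """Rewrite <vowel sign(s), mark> as <mark, vowel sign(s)>."""
    out = []
    for ch in unicodedata.normalize("NFD", text):
        if ch in VARGA_STANDINS:
            # Walk back over any vowel signs already emitted for this syllable.
            i = len(out)
            while i > 0 and out[i - 1] in VOWEL_SIGNS:
                i -= 1
            out.insert(i, ch)
        else:
            out.append(ch)
    return "".join(out)
```

    Comparing fold_order(a) with fold_order(b) then ignores which order was
    typed. A real collation tailoring or rendering engine would do this at a
    different layer; the function only shows that the folding is mechanical.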

    > The last column has the numbered-consonant text of many Vaishnava
    > slokams. They will have most of the abugida series and show the way
    > numbered consonants are employed.

    Richard.


