From: Richard Wordingham (richard.wordingham@ntlworld.com)
Date: Tue Jun 28 2005 - 16:12:03 CDT
Naga Ganesan wrote:
> Richard Wordingham asked: << How are they combined with the vowel? Is it
> C + V + subscript/superscript digit in Unicode? >>
> I always think of Tamil-script books in terms of Venn diagrams. Tamil books
> with no Tamil Grantha letters dwell in the innermost circle (the letters
> defined in the Tolkaappiyam and Nannuul grammars). There, only Pure Tamil
> letters are used:
> க், ங், ச், ஞ், ட், ண், த், ந், ப், ம், ய், ர், ல், வ், ழ், ள், ற், ன்
> Next is the circle in most common use, something like the Unicode Tamil code
> chart plus additions for anuswaram and vocalic r. Of course, the outermost
> circle in the Venn chart is the one with numbered consonants (2, 3, 4 as
> super- or subscripts), vocalic RR, vocalic L and vocalic LL. This ensures
> one-to-one round-trip transliteration between other Indic scripts and Tamil
> script. Stripping 2, 3, 4 will yield the next inner circle. Then there are
> well-defined rules to convert all Grantha consonants into the 18
> "Pure Tamil" consonants to reach the innermost circle of the Venn diagram,
> if the user needs/desires it.
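The stripping step described above can be sketched in Python. This is only an illustration under assumptions: it supposes the numbered consonants are typed with the existing superscript or subscript digit characters, and the helper name strip_varga_digits is made up for the example.

```python
# Superscript and subscript 2, 3, 4 (an assumed plain-text encoding).
VARGA_DIGITS = "\u00B2\u00B3\u2074\u2082\u2083\u2084"

# Translation table that deletes every character in VARGA_DIGITS.
_STRIP_TABLE = str.maketrans("", "", VARGA_DIGITS)


def strip_varga_digits(text: str) -> str:
    """Drop the 2/3/4 markers, leaving ordinary Tamil text."""
    return text.translate(_STRIP_TABLE)


KA = "\u0B95"  # TAMIL LETTER KA
AA = "\u0BBE"  # TAMIL VOWEL SIGN AA

# Hypothetical spellings of "ga" + AA and "kha" + AA.
ga_aa = KA + "\u2083" + AA
kha_aa = KA + "\u2082" + AA

# Both collapse to the same plain KA + AA, so the step is one-way.
print(strip_varga_digits(ga_aa) == strip_varga_digits(kha_aa))
```

Because different numbered forms collapse to the same string, this direction is lossy; only the outermost circle carries enough information to go back.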
> Coming to your question of how the numbered consonants and their
> corresponding abugida series work, please check the Vaishnava slokams
> page at:
> http://www.prapatti.com/slokas/slokasbyname.html
Thank you for the source of examples.  I do have a number of comments to 
make:
1. Do vocalic R / RR / L / LL round trip?  Vocalic R seems to be merged with 
the sequence <r><u>, and so will not *round* trip.
2. I couldn't find any examples of anusvara in Tamil - in all the examples I 
looked at, it was simply written as ம்.
3. Those texts have Tamil Grantha anusvara (two dots).  I did not notice any 
evidence that the texts labelled as Tamil were not in the Tamil script.  (I 
do not regard having extra symbols as evidence - I believe English and 
Polish are written in the same script.)  It seems that someone ought to 
propose TAMIL GRANTHA ANUSVARA for the Tamil block (or TAMIL TRUE ANUSVARA 
if the former creates a name conflict), perhaps with the annotation 'the 
real anusvara'.  (I'm not sure that that needs to be a combining mark 
either.  It might be simpler all round if this were tagged Lo.)
4. The subscript '2', '3' and '4' defy useful abstract analysis.  They 
follow the connected glyph portion containing the consonant, preceding the 
glyph of VOWEL SIGN AA or AU LENGTH MARK.  There seems to be no way to 
represent them in combination with those glyphs using Unicode!  Can anyone 
see how (short of burying our heads in the sand) we can avoid adding at 
least combining marks TAMIL VARGA MARK TWO, TAMIL VARGA MARK THREE and TAMIL 
VARGA MARK FOUR?  <vowel, varga mark> and <varga mark, vowel> will be 
canonically inequivalent.  The order <varga mark, vowel> seems more logical, 
but <vowel, varga mark> gives a renderer less re-ordering to do.  Ideally a 
renderer and a collating sequence should decline to distinguish the two 
orders.
Should there also be a TAMIL VARGA MARK ONE?  I thought I'd read of 
superscript 1 being used for completeness.  It makes sense intervocalically 
as a way of saying neither voice nor geminate.
Is there any reason for unifying these varga marks with numeric tone marks?
5. I still don't know how these subscripts or superscripts should affect 
collation!
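To make the canonical-inequivalence point in item 4 concrete, here is a minimal Python sketch. No TAMIL VARGA MARK characters exist, so SUBSCRIPT TWO, THREE and FOUR (U+2082 to U+2084) serve purely as stand-ins; that choice, the helper name fold_order, and its policy of putting the mark before the vowel sign are all assumptions made for illustration. The stand-in and the Tamil vowel sign both have canonical combining class 0, so normalisation never swaps them, and any collation or matching key would have to fold the two orders explicitly.

```python
import re
import unicodedata

KA = "\u0B95"    # TAMIL LETTER KA
AA = "\u0BBE"    # TAMIL VOWEL SIGN AA
MARK = "\u2082"  # SUBSCRIPT TWO, stand-in for a hypothetical varga mark

# Dependent vowel signs of the Tamil block, plus AU LENGTH MARK.
VOWEL_SIGNS = (
    "\u0BBE\u0BBF\u0BC0\u0BC1\u0BC2\u0BC6\u0BC7\u0BC8"
    "\u0BCA\u0BCB\u0BCC\u0BD7"
)
STAND_IN_MARKS = "\u2082\u2083\u2084"

# One or more vowel signs followed by a stand-in mark.
_VOWEL_THEN_MARK = re.compile(
    "([" + VOWEL_SIGNS + "]+)([" + STAND_IN_MARKS + "])"
)


def nfc(text: str) -> str:
    return unicodedata.normalize("NFC", text)


def fold_order(text: str) -> str:
    """Rewrite <vowel, mark> as <mark, vowel> after normalising."""
    return _VOWEL_THEN_MARK.sub(r"\2\1", nfc(text))


mark_first = KA + MARK + AA
vowel_first = KA + AA + MARK

# Both characters are starters (combining class 0) ...
print(unicodedata.combining(AA), unicodedata.combining(MARK))
# ... so normalisation keeps the two orders distinct.
print(nfc(mark_first) == nfc(vowel_first))
# An explicit fold is needed to treat them alike.
print(fold_order(mark_first) == fold_order(vowel_first))
```

Folding towards <mark, vowel> matches the order called more logical above; folding the other way would serve equally well as a comparison key, as long as one order is chosen consistently.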
> The last column has the numbered consonants text of many Vaishnava 
> slogans. They will have
> most of abugidas and the way numbered consonants are employed.
Richard. 