From: N. Ganesan (naa.ganesan@gmail.com)
Date: Sun May 01 2005 - 08:54:27 CDT
Unicode code points of Tamil Grantham conjunct SRI
--------------------------------------------------------------------------------
Some on the list may recall last month's discussions on
Visarga and Aaytham:
the intricate relationship between them as recorded in scholarly publications,
and even the word aaytham itself, which derives from a visarga term, aa'srita
in Sanskrit (aaytham is quite different from aayutham 'weapon').
Mentions and mails
with unattested words like VisargaL seem to have abated.
-------
Likewise, I felt it essential to set out the basic Unicode code
points of the Sanskrit
term SRI, as used in all of India, and its Tamil Grantham code points.
The Unicode-accepted proposal on sha (U+0bb6) correctly
identifies SRI as being <0BB6, 0BCD, 0BB0, 0BC0>. It prominently
mentions that the SRI ligature is made up with U+0bb6:
http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2617.pdf
Section 2.3 explicitly mentions the use of U+0bb8 in SRI ligature as
*incorrect*.
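To make the sequence concrete, here is a small Python sketch that lists the four code points with their official character names. The helper name describe is only for illustration, and the name lookup assumes a Python whose Unicode data already includes U+0BB6 (Unicode 4.1 or later).

```python
import unicodedata

# The SRI sequence as given in the proposal: sha, virama, ra, vowel sign ii.
SRI = "\u0BB6\u0BCD\u0BB0\u0BC0"

def describe(text):
    """Return (code point, character name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

for code_point, name in describe(SRI):
    print(code_point, name)
```

The same helper can be pointed at any string taken from a document to see whether its SRI starts with U+0BB6 or with U+0BB8.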
The review document, WG02 (Unicode) document
number n2618, is at
http://wwwold.dkuug.dk/JTC1/SC2/WG2/docs/n2618
and it talks about SRI and its component sha (0bb6).
The WG02 document clearly specifies why
sha (0bb6) is needed for Tamil:
"ISCII included letters for {Ss}, {s}, {h}
but left out the letter for {sh} in Tamil. This
resulted in a major deficiency in the code
- for instance, there is no way of representing
the backing string of a very important 'akshara' in
the language viz., {SRI}".
I often hear that SRI is sometimes written
differently. Yes, 100% agreed. Tamil nativizes borrowed
Sanskrit words, and their Grantham letters and
conjuncts, in different ways. In fact, this is one yardstick
used by linguistics specialists to show that
a particular word is a borrowing in a language.
Take the conjunct kSha (thank God, Unicode
does not give it a separate code point, unlike
hacked encodings). kSha is tamilized in various
ways: -kk-, -cc-, -Tc- and so on, with the additional
operative rule that word-initially kSha- will
become k- or c-. Likewise, the Sri ligature is tamilized
in many ways: e.g., tiru or cirii (long-standing usage;
see the Azhvar paasurams) or something else.
But these nativization attempts differ from
person to person, time to time, district to district.
In English script, the SRI conjunct is written in multiple
ways: sri, srii, sri_with_a_macron, s(acute)ri(macron),
sree, shree, shrii, shri, ... As we know well, Tamil script
can also make different attempts at nativizing the
loan word SRI from Sanskrit: cirii, cii (as in ciitaran, or
ciivalappEri, a town in Tinnevelly dist.; there was a movie
ciivalappEri paaNDi; ciivalan < srivallabhan),
siri, sirii, ... In all of these, r can be replaced with R by some, and
s (0bb6) can also be replaced with 0bb8, 0bb7 and so on.
So many combinations and permutations, a bewildering array, are
possible. These tamilizing attempts can be seen in nonconjunct
and conjunct ksha also: -Tc-, -kk-, -cc-, with the additional
operative rule that word-initial consonants in Tamil
words will be elided.
I wrote a letter to Sri. Kalyan (Project Madurai)
explaining the need to use the
correct code points for the Tamil
Grantha ligature SRI, namely
<U+0BB6, U+0BCD, U+0BB0, U+0BC0>:
http://www.services.cnrs.fr/wws/arc/ctamil/2005-04/msg00034.html
These code points and their equivalents
are used not just in Tamil but throughout
India to produce the conjunct Shree
(whatever the Indic script may be).
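As a rough illustration of "their equivalents", the sketch below derives the matching four code points in other Indic blocks. It relies on the ISCII-derived parallel layout of those blocks, so that sha, virama, ra and vowel sign ii sit at the same offsets from each block's start. The table BLOCK_BASE and the helper sri_for are names made up for this example, and whether a given font then shows a ligature is a separate matter that the code does not address.

```python
import unicodedata

# Start of each Unicode block for the major Indic scripts.
BLOCK_BASE = {
    "Devanagari": 0x0900,
    "Bengali": 0x0980,
    "Gurmukhi": 0x0A00,
    "Gujarati": 0x0A80,
    "Oriya": 0x0B00,
    "Tamil": 0x0B80,
    "Telugu": 0x0C00,
    "Kannada": 0x0C80,
    "Malayalam": 0x0D00,
}

# Offsets of sha, virama, ra and vowel sign ii within a block,
# read off the Tamil sequence <0BB6, 0BCD, 0BB0, 0BC0>.
OFFSETS = (0x36, 0x4D, 0x30, 0x40)

def sri_for(script):
    """Build the sha + virama + ra + ii sequence for the named script."""
    base = BLOCK_BASE[script]
    return "".join(chr(base + offset) for offset in OFFSETS)

for script in BLOCK_BASE:
    sequence = sri_for(script)
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in sequence)
    # The names confirm that each position really holds the expected letter.
    names = ", ".join(unicodedata.name(ch) for ch in sequence)
    print(f"{script}: {code_points} ({names})")
```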
Hence, the *definition* of the Sanskrit Grantham ligature:
SRI = <0BB6, 0BCD, 0BB0, 0BC0>
This is used all across India.
Hence, my recommendation is to use this long-standing
usage in future documents.
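For existing text that encodes the ligature with sa (U+0BB8), a minimal Python sketch of a clean-up pass could look like the following. The function name normalize_sri is hypothetical, and the code assumes that every <0BB8, 0BCD, 0BB0, 0BC0> sequence in the input is meant as the SRI ligature; sa in any other context is left alone.

```python
import re

# Recommended sequence: sha, virama, ra, vowel sign ii.
SRI = "\u0BB6\u0BCD\u0BB0\u0BC0"

# The same ligature encoded with sa (U+0BB8) in place of sha (U+0BB6).
SRI_WITH_SA = re.compile("\u0BB8\u0BCD\u0BB0\u0BC0")

def normalize_sri(text):
    """Rewrite sa-based SRI sequences to the sha-based sequence."""
    return SRI_WITH_SA.sub(SRI, text)

# Example: a string holding the sa-based form comes back in the sha-based form.
old_form = "\u0BB8\u0BCD\u0BB0\u0BC0"
print(normalize_sri(old_form) == SRI)
```

Nativized spellings such as cirii or tiru are ordinary Tamil words and are not touched by this pass.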
Hope this helps,
Naga Ganesan, Ph.D.