From: Antoine Leca (Antoine10646@leca-marti.org)
Date: Sun Dec 05 2004 - 11:15:23 CST
I fail to see the connection between your question and Unicode.
On Saturday, 4 December 2004, 13:18Z, Rene Hache wrote:
> To whom it may concern,
;-)
> I writing because I would to know if someone can help with certain
> Sanskrit/Pali characters in roman scripts.
Certainly there is a LOT of material about this around the net. Google is
certainly the best answer one can give you.
As a second-level hint, it is my belief that you will find more material
using Sanskrit as the keyword than using Pali. This should not mislead
you: as always with Google and co., more material also means more wrong
leads to check.
> Most characters are simple, like vowels with macrons, or some letters
> that have either a dot below or above.
If you want to see things this way, you should try a coded character set
that fits this description. Fortunately, such a thing exists, and a good
choice could be IS 13194:1991, widely known as ISCII; in this coded character
set, dha is a single code point (namely C5). ISCII is a good choice because
you can easily print it using ad hoc software (CDAC is a good keyword here),
and also because you can map from or to Unicode fairly easily. Of course
collation, and transliteration to Nagari or any other script used in India,
are trivial: they were objectives of the design.
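
To illustrate the "map from or to Unicode" remark, here is a minimal Python
sketch. It assumes that the ISCII-91 consonants KA to NA occupy bytes 0xB3 to
0xC6 in the same order as U+0915 to U+0928 in the Unicode Devanagari block, so
that this one range can be mapped with a fixed offset. The function names are
hypothetical, and the sketch deliberately ignores everything else in ISCII
(vowels, signs, script-switching attributes), where the mapping is not a
simple offset.

```python
import unicodedata

# Assumed layout (see the text above): ISCII-91 KA..NA = 0xB3..0xC6,
# Unicode DEVANAGARI LETTER KA..NA = U+0915..U+0928, in the same order.
ISCII_KA, ISCII_NA = 0xB3, 0xC6
UNICODE_KA = 0x0915


def iscii_consonant_to_unicode(byte):
    """Map one ISCII consonant byte in KA..NA to a Devanagari character."""
    if not ISCII_KA <= byte <= ISCII_NA:
        raise ValueError("outside the KA..NA range handled by this sketch")
    return chr(UNICODE_KA + (byte - ISCII_KA))


def unicode_consonant_to_iscii(char):
    """Inverse mapping, for the same limited range only."""
    offset = ord(char) - UNICODE_KA
    if not 0 <= offset <= ISCII_NA - ISCII_KA:
        raise ValueError("outside the KA..NA range handled by this sketch")
    return ISCII_KA + offset


dha = iscii_consonant_to_unicode(0xC5)
print(unicodedata.name(dha))
```

A real converter would need a full table rather than an offset, but the
one-byte-per-letter nature of ISCII is visible even in this small range.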
On the other hand, if you want to handle the textual material in Unicode (if
not, I really cannot see why you are asking this here), you will have to use
a collation process that is not straightforward, yet perfectly possible. The
fact that dha is a single "letter" is not a real problem (this is a simple
contraction, which any reasonable algorithm should offer); more interesting
things appear when you realise that while dha is one letter, dhi is two.
Even more interesting is that in the traditional order, ã (nasalisation noted
with candrabindu) precedes a (without nasalisation). And the real complexity
begins when you study the rules for collating the anusvara (written as a dot
above in Nagari script, and which can stand either for itself or for the
nasal matching the following consonant).
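
As a rough illustration of the contraction idea only, here is a Python sketch
of a sort key for romanised text in which digraphs such as "dh" count as one
collation element. The letter list is an assumed, commonly used romanised
Sanskrit order, and the names LETTERS, RANK and collation_key are
hypothetical. It does not attempt the candrabindu or anusvara rules mentioned
above, and it always reads "ai" and "au" as diphthongs, which is a
simplification.

```python
import unicodedata

# Assumed letter order for romanised Sanskrit; digraphs are single letters.
LETTERS = [
    "a", "ā", "i", "ī", "u", "ū", "ṛ", "ṝ", "ḷ", "e", "ai", "o", "au",
    "ṃ", "ḥ",
    "k", "kh", "g", "gh", "ṅ",
    "c", "ch", "j", "jh", "ñ",
    "ṭ", "ṭh", "ḍ", "ḍh", "ṇ",
    "t", "th", "d", "dh", "n",
    "p", "ph", "b", "bh", "m",
    "y", "r", "l", "v", "ś", "ṣ", "s", "h",
]
# Normalise to NFC so precomposed and decomposed input compare equal.
RANK = {unicodedata.normalize("NFC", l): i for i, l in enumerate(LETTERS)}
MAX_LEN = max(len(letter) for letter in RANK)


def collation_key(word):
    """Split a word into letters by greedy longest match and rank them."""
    text = unicodedata.normalize("NFC", word.lower())
    key = []
    i = 0
    while i < len(text):
        for size in range(MAX_LEN, 0, -1):
            chunk = text[i:i + size]
            if chunk in RANK:
                key.append(RANK[chunk])
                i += size
                break
        else:
            # Unknown character: place it after every known letter.
            key.append(len(RANK) + ord(text[i]))
            i += 1
    return tuple(key)


words = ["dva", "dhana", "dana"]
print(sorted(words))                     # plain code point order
print(sorted(words, key=collation_key))  # "dh" sorts after every "d" + letter
```

With plain code point order "dhana" falls between "dana" and "dva"; with the
contraction, all words beginning with "d" come before those beginning with
"dh".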
Antoine