From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Thu Feb 13 2003 - 10:47:52 EST
Andy White wrote:
> I think that Jim Agenbroad seems to have neatly come up with the
> solution, and if no one disagrees, this needs to be documented in TUCS
> or at least the Indic FAQ.
The Unicode Standard disagrees. TUS3.0, Chapter 9, page 214, Figure 9-3
("Conjunct Formations"), example (4) says that it should be encoded as
<U+0930 U+094D U+090B>:
"RAd + RIn -> RIn + RAsup"
That's absolutely intentional, as explained in the following paragraph:
"A number of types of conjunct formations appear in these examples:
[...] and (4) a rare conjunct formed with an independent vowel letter, in
this case the vowel letter RI (also known as vocalic r). Note that in
example (4) in Figure 9-3, the dead consonant RAd is depicted with the
nonspacing combining mark RAsup (repha)."
> He said that Devanagari Letter Vocalic R with Superscript Letter Ra
> (Vowel R with reph) should be encoded as "Ra + Vowelsign Vocalic R"
> (u+0930, u+0943)
The sequence <U+0930 U+0943> does indeed have the same meaning (i.e.
pronunciation) as the sequence above, but it has a different visual
representation. See it in TUS3.0, Chapter 9, page 222, Table 9-2 ("Sample
Ligatures (Continued)"), right-hand column, 4th row from bottom.
In this ligature, both U+0930 and U+0943 have their normal glyphs, but the
matra is joined in an unusual location (in the middle of the right side of
the letter, rather than below it).
This visual representation actually exists (I have often seen it in Sanskrit
grammars), and is much more common than <independentRI + repha>.
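To make the difference between the two encodings concrete, here is a small Python sketch that lists the code points of each sequence by name. The constant and function names are only illustrative; the character names come from Python's standard unicodedata module. It only shows that the two sequences are distinct strings; how each one is rendered depends on the font and the shaping engine.

```python
import unicodedata

# <U+0930 U+094D U+090B>: dead RA + independent vowel letter (repha form)
REPHA_ON_VOCALIC_R = "\u0930\u094D\u090B"
# <U+0930 U+0943>: RA + dependent vowel sign (matra form)
RA_WITH_MATRA = "\u0930\u0943"

def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

for label, seq in [("repha form", REPHA_ON_VOCALIC_R),
                   ("matra form", RA_WITH_MATRA)]:
    print(label)
    for line in describe(seq):
        print("   ", line)
```

The two strings are not canonically equivalent, so normalization will not turn one into the other.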
> The answer to my original question "How then would you encode a visual
> U+0930, U+094D, U+090B" will then be: "U+0930, U+094D, U+090B
> of course!"
That would be <U+0930 U+094D U+200C U+090B>, of course!! When you want to
force a visible virama, you insert a ZWNJ; why clutter this simple rule
with meaningless exceptions?
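As a rough illustration of that rule, the following Python sketch inserts U+200C after every virama that is followed by another character. The helper name force_visible_virama is hypothetical, and the sketch handles only the Devanagari virama U+094D; it is not a general-purpose Indic text routine.

```python
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA
ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER

def force_visible_virama(text):
    """Insert ZWNJ after each virama that has a following character.

    A virama already followed by ZWNJ, or one at the very end of the
    string, is left unchanged.
    """
    out = []
    for i, ch in enumerate(text):
        out.append(ch)
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == VIRAMA and nxt and nxt != ZWNJ:
            out.append(ZWNJ)
    return "".join(out)

conjunct = "\u0930\u094D\u090B"           # <U+0930 U+094D U+090B>
visible = force_visible_virama(conjunct)  # <U+0930 U+094D U+200C U+090B>
print(" ".join(f"U+{ord(ch):04X}" for ch in visible))
```

Applying the function twice gives the same result as applying it once, since a virama that already has a ZWNJ after it is skipped.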
_ Marco