Re: FAQ proposal (was RE: Combining letters in Devanagiri)

From: Mark Davis (mark@macchiato.com)
Date: Fri Feb 22 2002 - 11:21:03 EST


Marco, that is a very nice FAQ; the only addition I would suggest is
to also point to

http://www.unicode.org/unicode/standard/where/

Mark
—————

Γνῶθι σαυτόν — Θαλῆς
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]

http://www.macchiato.com

----- Original Message -----
From: "Marco Cimarosti" <marco.cimarosti@essetre.it>
To: <unicode@unicode.org>
Cc: <varada2707@yahoo.com>; <info@unicode.org>
Sent: Friday, February 22, 2002 07:21
Subject: FAQ proposal (was RE: Combining letters in Devanagiri)

> Varada wrote:
> > I am developing a Unicode editor for Devanagari and need a
> > clarification on combining letters in Devanagari.
> >
> > For example, to form a word like "PATNI", it should have first
> > half of "PA" + "TA" + "NA" + "I".
> >
> > Similarly, to form the word "HAMSA", it should have full "HA" +
> > half "MA" + full "SA".
> >
> > I downloaded the Unicode 3.2 beta and could not find codes for half
> > letters. How are these supported in Unicode?
>
> As this question has been raised and answered many times, and not everybody
> has a copy of TUS or can read PDF files, I propose to paraphrase Varada's
> question into a specific FAQ, to be added on
> <http://www.unicode.org/unicode/faq/indic.html>, possibly as the first
> question.
>
> «
> Q: I cannot find the "half forms" of Devanagari letters (or of any other
> Indic script) in the Unicode charts. These characters are needed to form
> words such as "patni".
>
> A: Unicode does not encode half or subjoined letters for the scripts of
> India. As in the ISCII standard, Unicode forms all "consonant clusters"
> (such as the "tn" in "patni") by inserting the character "virama" (or
> "halant") between the two relevant consonant letters.
>
> For instance, the Devanagari syllable "tna" ("त्न") is encoded with the
> following code points:
>
> U+0924 (त DEVANAGARI LETTER TA)
> U+094D (् DEVANAGARI SIGN VIRAMA = halant)
> U+0928 (न DEVANAGARI LETTER NA)
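The sequence listed above can be checked with a short Python sketch. It is only an illustration: the helper names `cluster` and `describe` are made up for this example, and the script uses nothing beyond the standard `unicodedata` module. It shows how the string is stored, not how it will be drawn.

```python
import unicodedata

TA = "\u0924"      # DEVANAGARI LETTER TA
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (halant)
NA = "\u0928"      # DEVANAGARI LETTER NA


def cluster(*consonants):
    """Join consonants with a virama between each adjacent pair (hypothetical helper)."""
    return VIRAMA.join(consonants)


def describe(text):
    """Return one 'U+XXXX NAME' string per code point in text (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


tna = cluster(TA, NA)
for line in describe(tna):
    print(line)
```

The stored text is three code points whatever glyphs the font eventually supplies.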
>
> These three characters are normally displayed using the single glyph
> <tna ligature> ("त्न"). But they may also be displayed using a <half ta>
> glyph followed by a <full na> glyph ("त्‍न"), or even with a <full ta>
> glyph combined with a <virama> glyph and followed by a <full na>
> glyph ("त्‌न").
>
> Which form is actually displayed is decided by an underlying software
> module called the "display engine", which bases this decision on the
> glyphs available in the font.
>
> If the sequence U+0924, U+094D is not followed by another consonant letter
> (such as "na") it is always displayed as a <full ta> glyph combined with the
> <virama> glyph ("त्").
>
> Unicode provides a way to force the display engine to show a half letter
> form. To do this, an invisible character called ZERO WIDTH JOINER should be
> inserted after the virama:
>
> U+0924 (त DEVANAGARI LETTER TA)
> U+094D (् DEVANAGARI SIGN VIRAMA = halant)
> U+200D (zwj ZERO WIDTH JOINER)
> U+0928 (न DEVANAGARI LETTER NA)
>
> This sequence is always displayed as a <half ta> glyph followed by a <full
> na> glyph ("त्‍न"). Even if the consonant "na" is not present, the sequence
> U+0924, U+094D, U+200D is displayed as a <half ta> glyph ("त्‍").
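As a rough sketch of the ZWJ sequence just described, the Python below builds both variants, with and without the following consonant. The function name `request_half_form` is an assumption for illustration and not part of any library. Whether a half form actually appears still depends on the display engine and font.

```python
import unicodedata

VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (halant)
ZWJ = "\u200D"     # ZERO WIDTH JOINER


def request_half_form(first, second=""):
    """Return first + virama + ZWJ, optionally followed by a second consonant."""
    return first + VIRAMA + ZWJ + second


half_ta_na = request_half_form("\u0924", "\u0928")  # half TA + full NA
half_ta = request_half_form("\u0924")               # half TA on its own

for ch in half_ta_na:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```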
>
> Unicode also provides a way to force the display engine to show the <virama>
> glyph. To do this, an invisible character called ZERO WIDTH NON-JOINER
> should be inserted after the virama:
>
> U+0924 (त DEVANAGARI LETTER TA)
> U+094D (् DEVANAGARI SIGN VIRAMA = halant)
> U+200C (zwnj ZERO WIDTH NON-JOINER)
> U+0928 (न DEVANAGARI LETTER NA)
>
> This sequence is always displayed as a <full ta> glyph combined with a
> <virama> glyph and followed by a <full na> glyph ("त्‌na").
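To put the three encodings from this FAQ side by side, here is a small Python sketch. The name `explicit_virama` and the labels are illustrative assumptions only. It prints the code points of each variant, which is useful because the joiners are invisible on screen.

```python
TA, NA = "\u0924", "\u0928"  # DEVANAGARI LETTER TA, DEVANAGARI LETTER NA
VIRAMA = "\u094D"            # DEVANAGARI SIGN VIRAMA (halant)
ZWJ = "\u200D"               # ZERO WIDTH JOINER
ZWNJ = "\u200C"              # ZERO WIDTH NON-JOINER


def explicit_virama(first, second=""):
    """Return first + virama + ZWNJ, optionally followed by a second consonant."""
    return first + VIRAMA + ZWNJ + second


plain = TA + VIRAMA + NA           # rendering left to the display engine
half = TA + VIRAMA + ZWJ + NA      # half form requested
explicit = explicit_virama(TA, NA) # visible virama requested

for label, text in [("plain", plain), ("half form", half), ("explicit virama", explicit)]:
    print(label, " ".join(f"U+{ord(ch):04X}" for ch in text))
```

The three strings are distinct, so an editor that compares or searches text may need to decide whether the joiners count as significant.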
>
> For more detailed information, see Chapter 9 of the Unicode Standard, "South
> and Southeast Asian Scripts"
> <http://www.unicode.org/unicode/uni2book/ch09.pdf>.
> »
>
> I don't know if all the glyphs in this e-mail will show correctly to
> everybody. However, I can provide GIF images for all the examples.
>
> _ Marco
>
>


