Re: Eyelash Ra/Variant Mark?

From: Jeroen Hellingman (etmjehe@genesis.etm.ericsson.se)
Date: Thu May 28 1998 - 02:37:11 EDT


 
> > What is actually wrong with the UTC and ISO/IEC JTC1/SC2/WG2 actually
> > proposing to add a separate character for "eyelash ra"? Could
> > somebody explain why that is not a satisfactory solution?
>
> An addition of a separate character for eyelash ra is not a solution
> according to us-
>
> 1. It is neither a distinct consonant nor a vowel by itself in the Marathi
> "alphabet". But Marathi has some words where spellings are taught to be
> used with the eyelash-ra. Since Marathi uses normal ra also, and there
> are words which sound the same but are spelt differently in the two forms
> (eyelash-ra and reph), having different meanings, it was found acceptable
> to treat it as RRA. This served the purpose of spelling, display,
> sorting, phonetic correctness and usage. Hence eyelash-ra was
> implemented in ISCII as RRA-halant as per the chart.

This is the same as using a separate character. From an implementation
point of view, I wouldn't consider reph as a half-letter. Its behaviour
is very much different from half-letters. For a spell-checking application,
using ZWJ may not be desirable, as such algorithms may be designed
to ignore such characters, and this will have to be added as an
exception. A separate eye-lash ra would be better in that case, even
though its full form will look the same as the ordinary ra. I wouldn't
map it on the Dravidian RRA, as that is a different letter. It can
be discussed whether a separate character has enough benefits over
ra + ZWJ.
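
To make the spell-checking concern concrete, here is a minimal sketch in
Python. It assumes eyelash ra is written as RA + VIRAMA + ZWJ and that a
spell checker normalizes its input by dropping all format characters
(general category Cf, which includes ZWJ). The function name
strip_format_chars is hypothetical and does not come from any real
spell checker.

```python
import unicodedata

RA = "\u0930"      # DEVANAGARI LETTER RA
RRA = "\u0931"     # DEVANAGARI LETTER RRA
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA (halant)
ZWJ = "\u200D"     # ZERO WIDTH JOINER, general category Cf


def strip_format_chars(text):
    """Drop format characters (Cf), as a naive normalizer might (assumption)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


eyelash_with_zwj = RA + VIRAMA + ZWJ  # assumed ZWJ-based spelling of eyelash ra
plain_ra_halant = RA + VIRAMA         # ordinary ra followed by halant
eyelash_with_rra = RRA + VIRAMA       # ISCII-style spelling, RRA + halant

# After normalization the ZWJ-based spelling is indistinguishable from the
# ordinary ra + halant, while a spelling with its own letter survives.
zwj_collapses = strip_format_chars(eyelash_with_zwj) == plain_ra_halant
rra_survives = strip_format_chars(eyelash_with_rra) != plain_ra_halant
```

With such a normalizer, the two Marathi spellings can only be kept apart
by special-casing ZWJ after a halant, which is the exception mentioned
above.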

> 2. It will cause difficulty in default sort order, if it is placed at the
> end of current assignments.

Recently, a default Unicode collation algorithm was proposed
in a technical report. If that is going to be implemented (hopefully
as part of a standard Unicode API) code-points will not be very
relevant to sorting applications, so this is no objection to me.
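
As an illustration of why code-point position need not matter, here is a
small Python sketch of table-driven sorting. It is not the proposed
collation algorithm itself, only the general idea of sorting by weights
from a table. The private-use code point U+E000 is a hypothetical
stand-in for a separately encoded eyelash ra, and the weights are
invented for the example.

```python
YA = "\u092F"          # DEVANAGARI LETTER YA
RA = "\u0930"          # DEVANAGARI LETTER RA
LA = "\u0932"          # DEVANAGARI LETTER LA
EYELASH_RA = "\uE000"  # hypothetical stand-in, placed far from the block

# Invented weight table: the new character sorts directly after RA,
# whatever its code point is.
WEIGHTS = {YA: 10, RA: 20, EYELASH_RA: 21, LA: 30}


def sort_key(word):
    """Map each character to its table weight; unknown ones sort last."""
    return [WEIGHTS.get(ch, 1000 + ord(ch)) for ch in word]


words = [LA, EYELASH_RA, YA, RA]
by_code_point = sorted(words)             # the stand-in lands at the end
by_weight = sorted(words, key=sort_key)   # the stand-in lands next to RA
```

Sorting by raw code points puts the stand-in character last, while
sorting through the table places it right after RA.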

> 3. It will merely treat it as a glyph variant for the Marathi script in the
> chart whereas ISCII/Unicode are meant to address the alphabet/character
> encoding mechanism. It will contradict the basic phonetic approach used
> for the design of character coding. This way a lot of alternate display
> forms or glyphs or conjuncts may need to be added in future and the code
> chart may only look like patchwork.

Agreed, but in this case, there is a non-contextual reason (spelling
of words) to use one or the other. However, I still think that using the
Unicode standard as it is, is sufficient for eye-lash ra.

> 4. If you are thinking of adding a character for it, then what is wrong in
> using the RRA U0931 which is already fitting neatly from all considerations.
> (Refer to my previous mails today demonstrating its usage and
> implementation in ISCII). It is not essential to treat all scripts of Unicode
> identically for rendition. Although DV(09XX) in Unicode provides symbols
> for transcribing Tamil and other scripts, it is not a primary
> consideration when Devanagari has a separate code space in Unicode.
> Priority must be given to Marathi and Hindi instead of the objective of
> transcription. Perfect transcription from other languages can still be
> achieved by other external software if need be.

In Unicode, you'll have to use some kind of table solution to
transliterate one script into another, and even then the transliteration
will not be perfect, although acceptable for the purposes you
describe, like rail-road reservation charts. I've been making such
tables, but correct transliteration requires knowledge of the
languages involved.
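
A minimal sketch of such a table in Python, mapping a handful of
Devanagari letters to Tamil. The selection of letters and the fallback
for KHA are my own choices for illustration, not a complete or
authoritative table. Tamil has no aspirated kha, so the table maps it to
KA, which is the kind of imperfection meant above.

```python
# Small, deliberately incomplete Devanagari -> Tamil table (illustrative).
DEVA_TO_TAMIL = str.maketrans({
    "\u0915": "\u0B95",  # DEVANAGARI KA  -> TAMIL KA
    "\u0916": "\u0B95",  # DEVANAGARI KHA -> TAMIL KA (lossy: no Tamil KHA)
    "\u092E": "\u0BAE",  # DEVANAGARI MA  -> TAMIL MA
    "\u0930": "\u0BB0",  # DEVANAGARI RA  -> TAMIL RA
    "\u0931": "\u0BB1",  # DEVANAGARI RRA -> TAMIL RRA
})


def transliterate(text):
    """Apply the table; characters without an entry pass through unchanged."""
    return text.translate(DEVA_TO_TAMIL)


ka_kha = transliterate("\u0915\u0916")  # both letters become TAMIL KA
ra_rra = transliterate("\u0930\u0931")  # RA and RRA stay distinct
```

Because KA and KHA collapse to the same Tamil letter, the mapping cannot
be reversed without knowledge of the word, so a table alone is not
enough for correct transliteration.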

Jeroen


