Re: Why wasn't it possible to encode a coeng-like joiner for Tibetan? from Christopher Fynn on 2013-04-14 (Unicode Mail List Archive)

From: Christopher Fynn <chris.fynn_at_gmail.com>
Date: Sun, 14 Apr 2013 13:10:27 +0600

On 14/04/2013, Shriramana Sharma <samjnaa_at_gmail.com> wrote:
> On Sun, Apr 14, 2013 at 2:32 AM, Christopher Fynn <chris.fynn_at_gmail.com>
> wrote:

>> The purpose of having most of these characters there was to facilitate
>> conversion between Tibetan and Devanagari scripts.

> Well conversion from Tibetan to Devanagari can easily be done even
> without these characters -- they only facilitate one-to-one mapping.

I agree - but some people thought they should be there for that purpose.

(Mind you I've never encountered anyone with the need to actually do this.)

Most of these characters were in earlier proposals for Tibetan which
mimicked the encoding of Devanagari and the other ISCII-derived
encodings. They kind of just got left in.

There were also proposals to have an invisible root consonant marker -
to flag the root consonant in a Tibetan tsheg-bar or syllable.

Other proposals wanted to have all the Tibetan consonants encoded +
combining super-added RA, LA and SA (ra-mgo, la-mgo, sa-mgo) + subscript
YA, RA and LA. This would mean having to type (or re-order) these head
letters after the base consonant. It might seem to have made Tibetan
collation easier - but to get that to work as intended, it would have
also been necessary to encode a PREFIX-GA, PREFIX-DA, PREFIX-BA,
PREFIX-MA and PREFIX-ACHUNG - all of which would have to logically
occur after the root cluster but be re-ordered before it for rendering.

Anyway, all these models completely fall apart as soon as you move away
from standard orthography for standard Tibetan words. They all assumed
Tibetan neatly followed the orthographic rules found in Tibetan
grammar books - but this is not the case. First, there are different
rules as to which letters can be combined with each other when writing
Sanskrit and other Indian languages - but there are still rules for
this. However, these break down when transliterating words from other
languages (Chinese, English, etc.) into Tibetan. Next, there are some
unusual combinations required when writing some other Tibeto-Burman
languages in Tibetan script. Finally, some Tibetan texts are full of
abbreviations written in a way that breaks all the standard
rules of Tibetan orthography, with characters combined in all sorts
of weird and wonderful ways. (See:
http://www.dzongkha.gov.bt/publications/PDF-publications/Duyig.pdf for
some examples.)

The encoding model finally adopted for Tibetan simply follows the way
Tibetans are taught to spell out combinations - and the way, and order
in which, they actually write. After all, we were encoding a script
the way it is *actually* used - not encoding the rules of Tibetan
grammar or rules in books of orthography which tell you how the script
is supposed to be used.
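
As a rough illustration of that model (a minimal sketch only: the
example syllable and the helper names describe() and stacks() are my own
choices for illustration, not part of any proposal), here is the
syllable bsgrubs stored in the order it is spelled out and written,
using the base and subjoined letters from the Tibetan block:

```python
import unicodedata

# bsgrubs, stored in spelling/writing order (an illustrative choice):
# BA, SA, subjoined GA, subjoined RA, vowel sign U, BA, SA
syllable = "\u0F56\u0F66\u0F92\u0FB2\u0F74\u0F56\u0F66"


def describe(text):
    """Return (code point, Unicode character name) for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


def stacks(text):
    """Group each base character with the combining marks that follow it.

    Subjoined letters and vowel signs have a Mark general category
    (category string starting with "M"), so they attach to the
    preceding base character.
    """
    groups = []
    for ch in text:
        if groups and unicodedata.category(ch).startswith("M"):
            groups[-1] += ch
        else:
            groups.append(ch)
    return groups


for code_point, name in describe(syllable):
    print(code_point, name)

# Number of characters in each group, in order
print([len(group) for group in stacks(syllable)])
```

Nothing in the stored sequence marks which letter is the prefix, the
root or the suffix, and nothing is re-ordered for display; the stack is
just a base letter followed by whatever is written beneath it.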

- Chris
Received on Sun Apr 14 2013 - 02:14:49 CDT
