Re: Digraphs

From: Christopher John Fynn (cfynn@dircon.co.uk)
Date: Thu Feb 17 2000 - 21:09:49 EST


John Cowan <jcowan@reutershealth.com> wrote:

> Christopher John Fynn wrote:

> > How is it recommended to code Latin script digraphs that are used to
> > represent a single letter?

> > For example in Roman transliteration of Indic languages the digraph
> > "kh" or "Kh" occurs with a combining low line below (centred between
> > the k and the h).

> > see:
> > http://ourworld.compuserve.com/homepages/stone_catend/trdis-4.htm

> Looks like <k> <combining low line> <h> <combining low line> to me.

The image isn't that good. In the fonts for Indic transliteration from
the Centre for Development of Advanced Computing (CDAC) in Pune
there are KH, Kh, kh, GH, Gh and gh digraph glyphs - all with a line below.
This line below is centred between the two letters and drawn
well inside the width of the paired letter glyphs.

In most fonts that have it, the glyph for combining low line
seems to extend the full width of an average glyph bounding box,
so using <k> <combining low line> <h> <combining low line>
would result in a line that is too wide. If the low line were drawn
narrower, there would be a gap in the line between the two letters.
Neither is really satisfactory.
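
For concreteness, here is a minimal Python sketch of the sequence
suggested above, assuming "combining low line" means U+0332. It only
lists the code points and checks that normalization leaves the string
alone; it says nothing about how a given font will draw the line.

```python
import unicodedata

# <k> <combining low line> <h> <combining low line>
# Assumption: "combining low line" is U+0332 COMBINING LOW LINE.
seq = "k\u0332h\u0332"

# List each code point with its Unicode character name.
names = [unicodedata.name(ch) for ch in seq]
for ch, name in zip(seq, names):
    print(f"U+{ord(ch):04X} {name}")

# Canonical combining class of the low line (a below-attached mark).
low_line_class = unicodedata.combining("\u0332")

# NFC has no precomposed form to substitute here, so the four
# code points are expected to survive normalization unchanged.
nfc = unicodedata.normalize("NFC", seq)
print(len(nfc), nfc == seq)
```

The point is that the text carries two separate marks, one per letter,
so any joining of the line has to happen in the font or the renderer.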

I know this could be handled in an OpenType font by having
precombined ligature glyphs for these strings of characters.
You'd probably also want to use language tags to trigger the
glyph substitution, since you might not want that behaviour
with other languages.
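
A toy model of that idea in Python, purely as a sketch: the glyph
names (kh_lowline etc.), the language tags "sa" and "pi", and the
shape() function are all hypothetical, and a real OpenType font would
express this in its substitution tables rather than in code like this.

```python
# Hypothetical ligature table: character sequence -> single glyph name.
LIGATURES = {
    "k\u0332h\u0332": "kh_lowline",
    "K\u0332h\u0332": "Kh_lowline",
    "g\u0332h\u0332": "gh_lowline",
}

# Hypothetical set of language tags for which the ligatures apply.
LIGATURE_LANGS = {"sa", "pi"}


def shape(text, lang):
    """Map text to a list of glyph names, forming ligatures only
    when the language tag asks for them."""
    table = LIGATURES if lang in LIGATURE_LANGS else {}
    glyphs, i = [], 0
    while i < len(text):
        for seq, glyph in table.items():
            if text.startswith(seq, i):
                glyphs.append(glyph)  # one glyph for the whole digraph
                i += len(seq)
                break
        else:
            # No ligature matched: emit a per-character glyph name.
            glyphs.append(f"uni{ord(text[i]):04X}")
            i += 1
    return glyphs


sanskrit = shape("k\u0332h\u0332a", "sa")
english = shape("k\u0332h\u0332a", "en")
print(sanskrit)
print(english)
```

With the "sa" tag the digraph collapses to one glyph carrying a single
centred line; with any other tag the same characters are left as
separate glyphs.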

Since Sanskrit and Pali texts are quite frequently printed in
Latin script, I'd like to see some definitive text on how
these combinations should be represented with Unicode
when they occur in Indic language transliteration - otherwise
it's likely that people will use different combinations of
characters for the same thing.

BTW what is the use of U+2040 CHARACTER TIE?

- Chris



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:59 EDT