Re: Combining Class of Thai Nonspacing_Marks

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Wed, 5 Apr 2017 05:37:08 +0100

On Wed, 5 Apr 2017 10:45:43 +0700
"Gerriet M. Denkmann" <gerrietm_at_icloud.com> wrote:

> > On 4 Apr 2017, at 23:51, Richard Wordingham
> > <richard.wordingham_at_ntlworld.com> wrote:

> > The order of MAITAIKHU and tone mark is significant - it should
> > affect rendering.

> Most fonts disagree (exceptions: Tahoma and Microsoft Sans Serif). Are
> there minority languages where the order really has a semantic
> meaning?

I think not. Most fonts are incompetent at displaying typing errors.

> Could one create a list of all possible combinations of non-spacing
> marks for Thai, minority languages and languages written using Thai
> characters (e.g. Pali, Sanskrit, Khmer, Burmese, etc.)? Including
> cases where the order of these marks has a semantic meaning.

> The next step would then be to agree on rules of normalisation.

Most of the 'normalisation' is straightforward.

1) Repeatedly swap mark above and following mark below.
2) Apply Unicode normalisation.

Then
3) Use a font that uses mark-to-mark positioning on all combinations of
vowels above and all combinations of vowels below.
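
Steps 1 and 2 can be sketched in Python roughly as follows. This is only
an illustration under stated assumptions: the ABOVE and BELOW sets are my
own grouping of the Thai-block nonspacing marks, the function name is
made up, and NFC is chosen arbitrarily as the form of "Unicode
normalisation" (the choice of form is not settled above).

```python
import unicodedata

# Assumed grouping of Thai nonspacing marks by rendering position.
# MAI HAN-AKAT, SARA I..SARA UEE, MAITAIKHU, tone marks,
# THANTHAKHAT, NIKHAHIT, YAMAKKAN
ABOVE = set("\u0E31\u0E34\u0E35\u0E36\u0E37\u0E47"
            "\u0E48\u0E49\u0E4A\u0E4B\u0E4C\u0E4D\u0E4E")
# SARA U, SARA UU, PHINTHU
BELOW = set("\u0E38\u0E39\u0E3A")


def reorder_thai_marks(text):
    """Hypothetical helper: move marks below ahead of marks above."""
    chars = list(text)
    swapped = True
    # Step 1: repeatedly swap a mark above with a following mark below
    # until no such adjacent pair remains.
    while swapped:
        swapped = False
        for i in range(len(chars) - 1):
            if chars[i] in ABOVE and chars[i + 1] in BELOW:
                chars[i], chars[i + 1] = chars[i + 1], chars[i]
                swapped = True
    # Step 2: apply Unicode normalisation (NFC is an assumption here).
    return unicodedata.normalize("NFC", "".join(chars))


if __name__ == "__main__":
    # KO KAI + MAITAIKHU + SARA U: the mark below is moved forward.
    sample = "\u0E01\u0E47\u0E38"
    print([f"U+{ord(c):04X}" for c in reorder_thai_marks(sample)])
```

The explicit swap is needed because MAITAIKHU and the vowels above have
combining class 0, so normalisation alone never moves a mark below past
them; only the tone marks (class 107) are reordered against SARA U and
SARA UU (class 103) by normalisation itself.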

NIKHAHIT followed by SARA AA needs special handling. I am not sure
how well the general case will work - particularly with fonts that do
their own reordering.
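
One reason this pair is awkward can be shown with Python's unicodedata
module. U+0E33 SARA AM has a compatibility decomposition to <NIKHAHIT,
SARA AA>, so compatibility normalisation of <consonant, tone, SARA AM>
leaves the tone mark in front of NIKHAHIT. The sketch below only
demonstrates that behaviour; the constant names are mine, and it does not
attempt the special handling itself.

```python
import unicodedata

KO_KAI = "\u0E01"
MAI_EK = "\u0E48"
SARA_AA = "\u0E32"
SARA_AM = "\u0E33"
NIKHAHIT = "\u0E4D"

# SARA AM decomposes (compatibility only) to NIKHAHIT + SARA AA.
decomp = unicodedata.decomposition(SARA_AM)

word = KO_KAI + MAI_EK + SARA_AM

# Canonical normalisation leaves SARA AM untouched.
nfc = unicodedata.normalize("NFC", word)

# Compatibility normalisation splits it. NIKHAHIT has combining class 0,
# so the tone mark is not reordered after it.
nfkc = unicodedata.normalize("NFKC", word)

if __name__ == "__main__":
    print(decomp)
    print([f"U+{ord(c):04X}" for c in nfc])
    print([f"U+{ord(c):04X}" for c in nfkc])
```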

You also need to decide whether to fold <SARA I, NIKHAHIT> and <SARA
UE> together. I've started to see fonts make an artificial distinction
between them.
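
If you do decide to fold, a minimal sketch might look like the following.
The direction of the fold (towards SARA UE) and the function name are
assumptions on my part, and it only handles the case where NIKHAHIT
immediately follows SARA I.

```python
SARA_I = "\u0E34"
SARA_UE = "\u0E36"
NIKHAHIT = "\u0E4D"


def fold_sara_i_nikhahit(text):
    """Hypothetical fold: treat <SARA I, NIKHAHIT> as SARA UE."""
    # Only an adjacent pair is folded; an intervening tone mark or
    # other mark is deliberately left alone in this sketch.
    return text.replace(SARA_I + NIKHAHIT, SARA_UE)


if __name__ == "__main__":
    folded = fold_sara_i_nikhahit("\u0E01" + SARA_I + NIKHAHIT)
    print([f"U+{ord(c):04X}" for c in folded])
```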

You may wish to note that it can be very hard to tell the difference
between U+002D HYPHEN-MINUS and U+2013 EN DASH in file names.

Richard.
Received on Tue Apr 04 2017 - 23:37:29 CDT
