Re: Arabic letters separated by markup

From: Richard Wordingham (richard.wordingham@ntlworld.com)
Date: Mon Jun 13 2005 - 19:08:03 CDT


    In reply to: http://www.unicode.org/mail-arch/unicode-ml/y2005-m06/0133.html

    > There seems to be no specification in regards to this problem with
    > changing inline text properties in the middle of a ligature, but I can
    > tell you how Internet Explorer seems to handle this color change problem.
    > What it simply does is, if one of the characters in the ligature (such as
    > the lam-alef ligature in Arabic) is colored differently then the whole
    > ligature is colored to that different color. Let's say if the "lam" in a
    > lam-alef ligature is red while the default text color is black, the whole
    > lam-alef ligature ends up being colored red. I think this is the next best
    > solution to being able to color part of the ligature red.

    > Even then I think this is still inadequate. It should be made possible
    > with font technology to color parts of a ligature in a different color.
    > This can be perhaps done by layered painting. The ligature is first drawn
    > in black by the font engine as usual and then an additional process paints
    > the part of the ligature in red. Although I don't know if OpenType is
    > capable of doing this kind of stuff.

    The information is there, in the cursor-positioning data for ligatures in
    the font (in OpenType, the ligature caret list of the GDEF table). (This
    does seem to depend on a ligature always being composed of the same number
    of elemental glyphs, which must cause complications with superfluous ZWJ.)
    Of course, that does assume that the logical order of constituents matches
    the visual order, which isn't true for lam-mim.
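
    To make the layered-painting idea concrete, here is a minimal sketch in
    Python of how caret positions could divide a ligature into per-component
    bands to be repainted. It is only an illustration: the function names,
    the advance of 1000 units and the caret at 600 are made up, and it
    assumes the logical order of the components matches their visual order
    (so it would get lam-mim wrong).

```python
def component_bands(advance, carets, rtl=False):
    """Split a ligature glyph into horizontal bands, one per component.

    advance -- advance width of the ligature glyph (font units)
    carets  -- x positions of the carets between components (hypothetical data)
    rtl     -- True if the first logical component is drawn rightmost

    Returns (x_start, x_end) pairs in logical order.
    """
    edges = [0] + sorted(carets) + [advance]
    bands = list(zip(edges[:-1], edges[1:]))  # left-to-right visual order
    return bands[::-1] if rtl else bands


def colour_runs(advance, carets, colours, rtl=False):
    """Pair each component band with the colour requested for that component."""
    bands = component_bands(advance, carets, rtl)
    if len(bands) != len(colours):
        # The caret count does not match the number of characters, e.g.
        # because of a superfluous ZWJ; no sensible split is possible here.
        raise ValueError("component count does not match caret data")
    return list(zip(bands, colours))


# Hypothetical lam-alef ligature: lam red, alef black, right-to-left.
runs = colour_runs(1000, [600], ["red", "black"], rtl=True)
```

    A renderer would draw the whole ligature once and then redraw it clipped
    to each band in that band's colour. The ValueError branch is where the
    ZWJ complication mentioned above would show up.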

    The issue isn't restricted to explicit 'ligatures', or even to the common
    OpenType case of letter plus accent being replaced by a pre-composed form.
    The example http://www.qsm.co.il/Hebrew/HebrewTest/ColorHtml.htm shows the
    problem for accents on Latin letters - no colour change for IE 6.0 (Windows
    XP SP2, at least) and missing accents for Mozilla 1.7.8 (at least, in
    Firefox 1.0.4). I get the same behaviours with Thai superscript and
    subscript marks, but in Mozilla the marks appear, and in the specified
    colour, if a Thai consonant follows without any intervening mark-up. (I'm
    using the Tahoma font for Thai.) IE 6.0 insists on displaying the entire
    vertical stack with the same colour. Word 2002 (under Windows XP) behaves
    the same with these marks, even though the RTF records the superscript
    vowel as being a different colour.
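
    For anyone wanting to find the affected places in their own text, the
    following Python sketch flags coloured runs that begin with a combining
    mark whose base sits in a differently coloured run. The (text, colour)
    pair representation and the function names are my own assumptions for
    illustration, not anything a browser exposes. It tests the general
    category rather than the combining class, because Thai vowel signs such
    as sara i have combining class 0.

```python
import unicodedata


def is_mark(ch):
    # General categories Mn, Mc and Me cover combining marks, including
    # Thai vowel signs whose canonical combining class is 0.
    return unicodedata.category(ch).startswith("M")


def split_stacks(spans):
    """Return indices of spans that start with a mark separated by a colour
    change from the preceding text.

    spans -- list of (text, colour) pairs in logical order (hypothetical model
             of marked-up text)
    """
    hits = []
    prev_colour = None
    for i, (text, colour) in enumerate(spans):
        if not text:
            continue  # empty runs cannot separate anything
        if prev_colour is not None and colour != prev_colour and is_mark(text[0]):
            hits.append(i)
        prev_colour = colour
    return hits


latin = [("e", "red"), ("\u0301", "black"), ("t", "black")]  # e + combining acute
thai = [("\u0e01", "red"), ("\u0e34", "black")]             # ko kai + sara i
```

    Both sample lists put the mark in a different colour from its base,
    which is the situation in which the browsers above misbehave.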

    In my Thai example (intended as a table of graphemes, not as an experiment),
    the marks were assigned the default colour (black), and the consonants were
    given the 'other' colour red. IE 6.0 gave the marks the unintended colours,
    not the consonants. (I'm not sure what Microsoft mean by 'base characters'
    for Thai - their documentation suggests that vowel marks are excluded, but I
    suspect that the preceding and following vowel marks are also base
    characters, albeit ones incapable of bearing marks.)

    The Latin accents and Thai marks being missing may be a Mozilla _bug_. I
    see no justification for a following consonant determining whether a Thai
    mark is displayed.

    The IE 6.0 and Word 2002 behaviours may be independent. IE 6.0 treats sara
    am as being nikkhahit plus sara aa, but Word 2002 treats it as a superscript
    mark. Both treat sara aa as an independent character.
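
    The IE 6.0 treatment happens to coincide with Unicode's compatibility
    decomposition of sara am (U+0E33), which can be checked with Python's
    unicodedata module as sketched below. This only shows what the character
    database says; I am not claiming that IE actually normalises the text.

```python
import unicodedata

SARA_AM = "\u0e33"

# Compatibility decomposition of sara am into its two constituents.
decomposed = unicodedata.normalize("NFKD", SARA_AM)

# Name and general category of each constituent, for inspection.
parts = [(unicodedata.name(c), unicodedata.category(c)) for c in decomposed]
```

    The first constituent is a non-spacing mark and the second is a letter,
    which fits a renderer colouring the upper part with the preceding stack
    and treating sara aa as an independent character.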

    Of course, one work-around for the colour problem is to use a 'hack' font -
    the hack fonts I use for the Lanna script and for Burmese are unaffected by
    this problem. Perhaps this also makes the matter a Unicode issue. (The
    Burmese in a Unicode font is also unaffected, but this is probably because
    Uniscribe does not yet support Burmese. However, without Uniscribe support,
    it doesn't render properly anyway!)

    Richard.


