Re: Arabic letters separated by markup

From: Peter Kirk (peterkirk@qaya.org)
Date: Sat Jun 11 2005 - 07:53:53 CDT

    On 11/06/2005 02:17, John Hudson wrote:

    > Kent Karlsson wrote:
    >
    >> That does not really follow. I think "inline" tags *between*
    >> Arabic/Syriac/Mongolian letters (possibly with combining marks) can be
    >> seen as acting like ZERO-WIDTH JOINER for the purpose of
    >> Arabic/Syriac/Mongolian shaping. ...
    >

    This cannot be correct. Consider the sequence of Arabic DAL, followed by
    font markup, followed by Arabic HEH, as part of an Arabic script word.
    As DAL never joins to the left, if there is no markup here this should
    be rendered as isolated or final form DAL followed by isolated or
    initial form HEH. And this same joining behaviour should be preserved if
    parts of the word are to be rendered in different fonts, colours etc.
    But this sequence with ZWJ should be rendered as isolated or final form
    DAL followed by final or medial form HEH, which is certainly not what is
    required. The requirement should be that, for shaping purposes, the
    markup is treated as completely transparent, in the same way that
    combining marks and "most format control characters" are treated as
    transparent. In other words, markup tags should be
    treated as in class T (not class C like ZWJ) in Table 8-3 in the Unicode
    standard p.199 (http://www.unicode.org/versions/Unicode4.0.0/ch08.pdf).
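
    To make the difference concrete, here is a minimal sketch in Python.
    It is not a real shaping engine: the joining rules are heavily
    simplified, the function name shape and the "<m>" placeholder for a
    markup tag are my own inventions for illustration, and only three
    letters plus ZWJ are classified.

```python
# Simplified joining classes: D = dual-joining, R = right-joining,
# C = join-causing, T = transparent, U = non-joining.
JOINING_TYPE = {
    "\u0628": "D",  # ARABIC LETTER BEH
    "\u062F": "R",  # ARABIC LETTER DAL
    "\u0647": "D",  # ARABIC LETTER HEH
    "\u200D": "C",  # ZERO WIDTH JOINER
}
MARKUP = "<m>"  # hypothetical stand-in for any inline markup tag


def shape(items, markup_type):
    """Return the contextual form of each letter, in logical order.

    markup_type is the joining class assigned to MARKUP: "T" treats it as
    transparent, "C" treats it like ZWJ.
    """
    types = [markup_type if item == MARKUP else JOINING_TYPE.get(item, "U")
             for item in items]

    def neighbour(i, step):
        # Nearest non-transparent neighbour in the given direction.
        j = i + step
        while 0 <= j < len(types) and types[j] == "T":
            j += step
        return types[j] if 0 <= j < len(types) else "U"

    forms = []
    for i, t in enumerate(types):
        if t not in ("R", "D"):
            continue  # only letters receive a form
        joins_prev = neighbour(i, -1) in ("D", "C")
        joins_next = t == "D" and neighbour(i, +1) in ("D", "R", "C")
        if joins_prev and joins_next:
            forms.append("medial")
        elif joins_prev:
            forms.append("final")
        elif joins_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms


word = ["\u0628", "\u062F", MARKUP, "\u0647"]  # BEH, DAL, markup, HEH
print(shape(word, "T"))  # markup transparent
print(shape(word, "C"))  # markup acting like ZWJ
```

    With the markup in class T, HEH comes out isolated, exactly as it
    would with no markup at all; with the markup in class C, HEH comes
    out in final form even though DAL cannot join to it.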

    I can see John's point that this might cause implementation difficulties
    where there is a change of font, but nevertheless this must be the
    correct behaviour as it preserves the generally correct appearance of
    the characters. It should be up to users to notice and correct for
    mismatches, e.g. when glyphs in different fonts and sizes do not join
    correctly; it is not for Unicode to decide that, because there may be a
    mismatch, completely different glyphs should be substituted.

    >> ... Certain changes that the markup may result in, such as a size
    >> change, will make the join more or less "misfit" graphically. But
    >> whoever wrote the markup asked for a size change, not a joining
    >> change. Ligature formation should (always) be blocked over markup tags.
    >

    I agree. There seems to be a need to define markup as breaking
    ligatures, much as ZWJ and <ZWJ, ZWNJ, ZWJ> do according to Figure 15-2
    on p.391 (http://www.unicode.org/versions/Unicode4.0.0/ch15.pdf). But
    this should be done in a way which is transparent to normal joining,
    which is unlike the behaviour in any of the columns of this table: the
    display for the last row should be as on the left column in the table,
    but for the preceding row as in the right column. The alternative must
    be to form the entire ligature as if in either the preceding or the
    following font.
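
    A rough sketch of the behaviour I am asking for, again in Python and
    again only an illustration: the function glyph_units and the "<m>"
    placeholder are hypothetical, and the example assumes a word-initial
    LAM followed by ALEF, so that the unligated pair is an initial LAM
    joined to a final ALEF.

```python
LAM = "\u0644"   # ARABIC LETTER LAM
ALEF = "\u0627"  # ARABIC LETTER ALEF
MARKUP = "<m>"   # hypothetical stand-in for any inline markup tag


def glyph_units(items):
    """Group a word-initial sequence into glyph units.

    LAM + ALEF fuse into a ligature only when directly adjacent. Markup
    between them blocks the ligature but leaves the cursive join intact.
    """
    units = []
    i = 0
    while i < len(items):
        if items[i] == MARKUP:
            i += 1  # markup itself produces no glyph
        elif items[i] == LAM and items[i + 1:i + 2] == [ALEF]:
            units.append("lam-alef ligature")
            i += 2
        elif items[i] == LAM and items[i + 1:i + 3] == [MARKUP, ALEF]:
            # Ligature blocked, joining unchanged.
            units += ["lam, initial form", "alef, final form"]
            i += 3
        else:
            units.append(items[i])
            i += 1
    return units


print(glyph_units([LAM, ALEF]))
print(glyph_units([LAM, MARKUP, ALEF]))
```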

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

