Re: ct, fj and blackletter ligatures

From: John Hudson (tiro@tiro.com)
Date: Sun Nov 03 2002 - 17:21:15 EST

    At 15:09 11/3/2002, Doug Ewell wrote:

    >This is what I am proposing be changed: fonts and/or rendering engines
    >(wherever the intelligence lies, depending on the vendor technology)
    >should be updated to recognize "letter + ZWJ + letter" (and similar
    >combinations of 3 or more letters) as a request to ligate the characters
    >if possible.
    >
    >I am *not* suggesting that fonts and rendering engines and intelligent
    >text processing tools like InDesign be stripped of all power to control
    >ligation. They are probably in an excellent position to do so. (I
    >wish, oh how I wish, that Microsoft Word had some facility for
    >generating ligatures.) And I am *not* suggesting that user overrides of
    >the default ligation behavior be limited to inserting ZWJ or ZWNJ. If
    >programs like InDesign give the user a convenient option to turn
    >ligation on and off, globally or locally, more power to them. What I am
    >suggesting is that the Unicode ZWJ and ZWNJ *also* be honored as a way
    >to control ligation. That is how I read the Unicode Standard.

    I basically agree with you, Doug, and my proposal for handling ZWJ ligation
    in OpenType would provide exactly what you describe, if implemented in
    fonts and supported by rendering engines. There are, however, a number of
    issues that need to be resolved. In order for a font lookup sequence
    involving ZWJ to be processed during layout, a *glyph* for the ZWJ
    character has to be painted in the glyph string, since font lookups work at
    the glyph level. Because ZWJ already had a function as a control character,
    e.g. in Indic script processing, prior to being pressganged into service
    for ligation, existing implementations do not paint a glyph for this
    character unless the user invokes an option to display control characters,
    e.g. in MS Office. In order to permit the latter option, these characters,
    if they are supported in a font at all, are represented by a special glyph:
    a vertical bar on a zero advance width, with a little x at the top. This obviously
    presents various problems, and should be a warning to the UTC to avoid
    repurposing characters that have already been implemented for other
    purposes: such implementations might not be compatible with the intended
    new purpose.
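
    The glyph-level constraint described above can be illustrated with a
    minimal sketch in Python. It is a toy model, not OpenType itself: the
    glyph names, the CMAP and LIGATURES tables, and the paint_controls flag
    are all hypothetical, chosen only to show why a lookup keyed on a ZWJ
    glyph cannot match once the engine has dropped that glyph from the string.

```python
ZWJ = "\u200D"

# Hypothetical toy font: a character-to-glyph map plus one ligature lookup
# that is keyed on a glyph for ZWJ (assumed names, not from a real font).
CMAP = {"a": "a", "c": "c", "t": "t", ZWJ: "zwj"}
LIGATURES = {("c", "zwj", "t"): "c_t"}


def to_glyphs(text, paint_controls):
    """Map characters to glyph names.

    With paint_controls=False the ZWJ is treated as a control character
    and contributes no glyph, as in the existing implementations described.
    """
    glyphs = []
    for ch in text:
        if ch == ZWJ and not paint_controls:
            continue  # no glyph enters the glyph string
        glyphs.append(CMAP[ch])
    return glyphs


def apply_ligatures(glyphs):
    """Replace any glyph sequence listed in LIGATURES with its ligature."""
    out, i = [], 0
    while i < len(glyphs):
        for seq, lig in LIGATURES.items():
            if tuple(glyphs[i:i + len(seq)]) == seq:
                out.append(lig)
                i += len(seq)
                break
        else:
            # No lookup matched at this position; keep the glyph as is.
            out.append(glyphs[i])
            i += 1
    return out


def shape(text, paint_controls):
    """Run the two toy stages: character mapping, then ligature lookup."""
    return apply_ligatures(to_glyphs(text, paint_controls))
```

    In this model, shape("ac" + ZWJ + "t", True) yields ['a', 'c_t'], while
    the same text with paint_controls=False yields ['a', 'c', 't'], because
    the lookup never sees a glyph for ZWJ.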

    So we have a quandary: do we stop treating ZWJ as a control character and
    always paint a glyph so that it can be used in lookup sequences? If we do
    this, we run the risk of a visible glyph appearing in text anywhere that a
    font does not provide a ligature glyph or lookup sequence. Do we avoid this
    by making the ZWJ glyph a blank, zero-width glyph? If we do this, we can no
    longer use current methods to provide users with the option of displaying
    control characters (I can think of various ways to solve this particular
    problem, including glyph substitution, e.g. a 'Control Display Forms'
    layout feature that would map the blank glyphs to visible forms). We also
    lose the ability to kern the glyphs on either side of the ZWJ if a ligature
    is not available (this could be solved with a lot of contextual kerning
    data, but that would be a serious pain). I'm not saying that any of these
    problems are insoluble, or that software developers should not rewrite all
    their existing rendering engines and rethink their approach to control
    characters in order to implement ZWJ ligation. I just think people should
    be aware that supporting ZWJ ligation is considerably more difficult than
    it would have been if, for example, Michael Everson's initial proposal for
    a separate Zero-Width Ligator had been accepted. Implementing something new
    is a lot easier than completely changing an existing implementation for a
    character whose purpose has suddenly been redefined. The more widely
    implemented Unicode becomes, the more the UTC will need to consider the
    impact of their decisions on existing implementations.
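
    The trade-offs of a blank, zero-width ZWJ glyph can be made concrete with
    another minimal Python sketch. Everything in it is an assumption made for
    illustration: the glyph names, the advance widths, the single kerning
    pair, and the table standing in for a 'Control Display Forms' feature
    (which is only an idea floated above, not an existing OpenType feature).

```python
# Hypothetical font data; none of these values come from a real font.
ADVANCE = {"f": 300, "j": 280, "zwj": 0, "zwj.visible": 0}
KERN = {("f", "j"): -20}  # simple pair kerning, adjacent glyphs only

# Stand-in for the suggested 'Control Display Forms' feature:
# maps the blank control glyph to a visible form on request.
CONTROL_DISPLAY_FORMS = {"zwj": "zwj.visible"}


def apply_control_display(glyphs, enabled):
    """Substitute visible forms for blank control glyphs when enabled."""
    if not enabled:
        return list(glyphs)
    return [CONTROL_DISPLAY_FORMS.get(g, g) for g in glyphs]


def kern_adjustment(glyphs):
    """Sum pair-kerning values over adjacent glyphs in the string."""
    return sum(KERN.get(pair, 0) for pair in zip(glyphs, glyphs[1:]))


def line_width(glyphs):
    """Total advance of the glyph string, including pair kerning."""
    return sum(ADVANCE[g] for g in glyphs) + kern_adjustment(glyphs)
```

    Here line_width(["f", "j"]) is 560, but line_width(["f", "zwj", "j"]) is
    580: the blank glyph adds no width, yet it separates f and j so the pair
    kern no longer applies. Restoring it would need contextual kerning that
    looks across the ZWJ glyph.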

    John Hudson

    Tiro Typeworks www.tiro.com
    Vancouver, BC tiro@tiro.com

    It is necessary that by all means and cunning,
    the cursed owners of books should be persuaded
    to make them available to us, either by argument
    or by force. - Michael Apostolis, 1467
