RE: Character identities

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Fri Oct 25 2002 - 12:24:26 EDT


    Marc Wilhelm Küster wrote:
    > At 14:04 25.10.2002 +0200, Kent Karlsson wrote:
    > >Font makers, please do not meddle with the author's intent
    > >(as reflected in the text of the document!). Just as it
    > >is inappropriate for font makers to use an ø glyph for ö
    > >(they are "the same", just slightly different derivations
    > >from "o^e"), it is just as inappropriate for font makers to
    > >use a "o^e" glyph for ö (by default in a Unicode font). Though
    > >in some sense the "same" they are still different enough for
    > >authors to care, and it is up to the document author/editor
    > >to decide, not the font maker.
    >
    > My wholehearted support!
    >
    > [...]
    >
    > For this reason it is quite impermissible to render the
    > combining letter small e as a diaeresis

    So far so good. There would be no reason for doing such a thing.

    If the author of a scholarly work used U+0364 (COMBINING LATIN SMALL LETTER
    E), this character should be displayed as either a letter "e" superscript to
    the base letter, or as an empty square (for fonts not caring about that
    character).

    > or, for that matter, the diaeresis as a combining
    > letter small e (however, you see the latter version
    > sometimes, very infrequently, in advertisement).

    This is the case I thought we were discussing, and it is a very different
    case.

    Given Kent's opinion and Marc's wholehearted support, it follows that
    those infrequent advertisements should be encoded using U+0364...

    But U+0364 (COMBINING LATIN SMALL LETTER E) belongs to a small collection of
    "Medieval superscript letter diacritics", which is supposed to "appear
    primarily in medieval Germanic manuscripts", or to reproduce "some usage as
    late as the 19th century in some languages".

    Using such a character to encode 21st century advertisements is doomed to
    cause problems:

    1) The glyph for U+0364 is more likely found in the font collection of the
    Faculty of Germanic Studies than on the PCs of people wishing to read the
    advertisement for "Ye Olde Küster Pub". So, most people will be unable to
    view the advertisement correctly.

    2) The designer of the advertisement will be unable to use his spell-checker
    and hyphenator on the advertisement's text.

    3) Users will be unable to find the Küster Pub by searching for "Küster" in a
    search engine.
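
    To make point 3 concrete, here is a minimal Python sketch. It assumes a
    search engine that compares strings after NFC normalization and case
    folding; the search_key helper is hypothetical, and real engines may well
    do something different. Under that assumption, "u" + U+0308 matches the
    query, while "u" + U+0364 does not:

```python
import unicodedata

def search_key(text: str) -> str:
    """Hypothetical matching key: NFC normalization plus case folding."""
    return unicodedata.normalize("NFC", text).casefold()

query = "K\u00fcster"            # precomposed U+00FC, as most users type it
with_diaeresis = "Ku\u0308ster"  # u + U+0308 COMBINING DIAERESIS
with_small_e = "Ku\u0364ster"    # u + U+0364 COMBINING LATIN SMALL LETTER E

# U+0308 after "u" is canonically equivalent to U+00FC, so the keys agree.
print(search_key(query) == search_key(with_diaeresis))  # True

# U+0364 has no canonical relation to U+00FC, so the keys differ.
print(search_key(query) == search_key(with_small_e))    # False
```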

    What will actually happen is that everybody will see an empty square, so
    they'll think that the web designer is an idiot, apart from the professors
    at the Faculty of Germanic Studies, who'll think that the designer is an
    idiot because she doesn't know the difference between U+0308 and U+0364 in
    ancient German.

    The real error (IMHO) is the idea that font designers should stick to the
    *sample* glyphs printed in the Unicode book, because this would force
    graphic designers to change the *encoding* of their text in order to get
    the desired result.

    Another big error (IMHO, once again) is the idea that two different Unicode
    characters should always look different. The difference must be preserved
    only when it is useful -- e.g., U+0308 should not look like U+0364 in a
    font designed for publishing books on the history of German!

    What should really happen, IMHO, is that modern German should be encoded as
    modern German. A U+0308 (COMBINING DIAERESIS) should remain a U+0308,
    regardless of whether the corresponding glyph *looks* like U+0364
    (COMBINING LATIN SMALL LETTER E) in one font, and it looks like U+0304
    (COMBINING MACRON) in another font, and it looks like two five-pointed
    stars side-by-side in a
    third font, and it looks like Mickey Mouse's ears in <Disney.ttf>...
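
    As a small illustration of the encoding side of this, the sketch below
    lists the code points actually stored in a string. The describe helper is
    my own hypothetical name; it only uses Python's standard unicodedata
    module. Whatever glyph a font draws, the stored character is unchanged:

```python
import unicodedata

def describe(text: str) -> list:
    """Return one 'U+XXXX NAME' entry per code point stored in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

# Modern German "ü" written as a base letter plus combining mark.
for entry in describe("u\u0308"):
    print(entry)
# U+0075 LATIN SMALL LETTER U
# U+0308 COMBINING DIAERESIS
```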

    _ Marco


