RE: Generic Base Letter

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Jun 27 2010 - 03:54:25 CDT

    I don't know what Microsoft does, but at least following U+25CC with a
    combining diacritic DOES work in current versions of Internet
    Explorer.
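
    As a minimal sketch of what "combining 25CC with a combining diacritic"
    means at the code point level, assuming Python and its standard
    unicodedata module (the helper name on_dotted_circle is hypothetical,
    and this says nothing about how a given renderer will display the
    result):

```python
import unicodedata

DOTTED_CIRCLE = "\u25CC"  # U+25CC DOTTED CIRCLE


def on_dotted_circle(mark: str) -> str:
    """Return U+25CC followed by a single combining mark."""
    # General categories Mn, Mc and Me are the combining marks.
    if len(mark) != 1 or not unicodedata.category(mark).startswith("M"):
        raise ValueError(f"not a combining mark: {mark!r}")
    return DOTTED_CIRCLE + mark


sample = on_dotted_circle("\u0301")  # U+0301 COMBINING ACUTE ACCENT
for ch in sample:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
```

    Whether the mark is then positioned on the font's own U+25CC glyph
    depends entirely on the font and the shaping engine.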

    But since this is known to cause problems, for example when
    rendering charts on the web, a simple and commonly adopted solution
    is to use a more natural, arbitrarily chosen base character together
    with some other presentation style (such as colored backgrounds).

    See examples such as these (diacritics are shown on a natural base
    character, with a consistent blue background in all tables):

    - http://fr.wikipedia.org/wiki/Table_des_caract%C3%A8res_Unicode/U0300
    (it uses the Latin letter 'o' for diacritics used with the Latin script)

    - http://fr.wikipedia.org/wiki/Table_des_caract%C3%A8res_Unicode/U0590
    (it uses the Hebrew letter SHIN for all Hebrew diacritics)

    - http://fr.wikipedia.org/wiki/Table_des_caractères_Unicode/U0600
    (an Arabic letter is used for all Arabic diacritics)

    And so on...
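
    The per-script choice of base letter could be sketched roughly as
    follows. This is only an illustration, not how those pages are
    actually built: the function names are hypothetical, the script is
    guessed from the character name because Python's unicodedata module
    has no script property, and the Arabic base letter (BEH) is my own
    assumption, since I did not say above which Arabic letter is used.

```python
import unicodedata

# Base letters per script. 'o' and SHIN follow the examples above;
# BEH is an assumed stand-in for the Arabic base letter.
BASES = {
    "HEBREW": "\u05E9",  # HEBREW LETTER SHIN
    "ARABIC": "\u0628",  # ARABIC LETTER BEH (assumption)
}
DEFAULT_BASE = "o"       # Latin small letter o


def base_for(mark: str) -> str:
    """Pick a base letter from the first word of the mark's name."""
    name = unicodedata.name(mark, "")
    first_word = name.split(" ", 1)[0]
    return BASES.get(first_word, DEFAULT_BASE)


def chart_cell(mark: str) -> str:
    """Return the text to show in a chart cell for a combining mark."""
    return base_for(mark) + mark


for mark in ("\u0301", "\u05B4", "\u064E"):
    print(f"U+{ord(mark):04X}", chart_cell(mark))
```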

    Additionally, the controls are shown with a red background, and format
    controls are within a box with a dashed border. Unallocated codepoints
    are shown with a grey background. There's no risk of confusion with a
    true dotted circle symbol.
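
    A rough sketch of that kind of styling decision, assuming it is
    driven by the General_Category of each code point (the style names
    are hypothetical, and the result depends on the Unicode version
    shipped with the Python interpreter):

```python
import unicodedata

# Hypothetical style names; the comments give the presentation
# described above for each kind of code point.
STYLES = {
    "Cc": "control",     # red background
    "Cf": "format",      # box with a dashed border
    "Cn": "unassigned",  # grey background
}


def cell_style(code_point: int) -> str:
    """Return a style name for one chart cell."""
    category = unicodedata.category(chr(code_point))
    # Everything else keeps the table's ordinary background.
    return STYLES.get(category, "plain")


for cp in (0x0009, 0x200D, 0x0378, 0x25CC):
    print(f"U+{cp:04X}", cell_style(cp))
```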

    But the Unicode and ISO/IEC 10646 charts (in PDFs or printed books)
    need to be monochrome, so instead of distinctive colored
    backgrounds, it's normal for them to use a symbol that is not
    exactly the same as any encoded character.

    Philippe.

    "Vincent Setterholm" <vincent@logos.com> wrote:
    >
    > I've tried using 25CC. The problem I'm running into is that the font designer can make marks combine with 25CC just fine but then Microsoft simply ignores the look-up tables that shape these combinations and inserts their own dotted circle (or circles - one per combining mark) anyway.
    >
    > I don't know what effect using a 'symbol' for a letter has on indexing or searching or line/word breaking because I haven't even gotten so far as to get the display to look right, but I'm guessing there'd also be an advantage to such a character having letter semantics.
    >
    > This need to display marks, well-formed on a generic base, is a really common phenomenon. Countless grammars and other philology and linguistics books/articles/etc. have to represent these types of patterns. I think there needs to be an official solution for placing marks on a generic base that behaves like a letter - something documented so that future font designers can support this and so that the technology providers like Microsoft, ICU, etc. have clear directions on how to support this.
    >
    > If using 25CC really is the answer, then let's publish that solution as part of the Unicode Standard so that all font designers can follow this convention and so that we can have some hope of companies like Microsoft supporting the standard.
    >
    > ________________________________________
    > From: Otto Stolz [Otto.Stolz@uni-konstanz.de]
    > Sent: Saturday, June 26, 2010 8:03 AM
    > To: Vincent Setterholm
    > Cc: 'unicode@unicode.org'
    > Subject: Re: Generic Base Letter
    >
    > Hi Vincent Setterholm,
    >
    > you have been asking:
    > > What I'd like to see is a code point for a generic base character
    >
    > You could try U+25CC DOTTED CIRCLE, though the reference glyph
    > for this character is larger than the dotted circles used to
    > attach the various combining marks, in their respective reference
    > glyphs.
    >
    > Best wishes,
    > Otto Stolz
    >
    >
    >


