RE: Coloured diacritics (Was: Transcoding Tamil in the presence of markup)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Dec 07 2003 - 05:40:27 EST

    John Hudson writes:
    > At 03:53 PM 12/6/2003, Philippe Verdy wrote:
    >
    > >Still this is an interesting problem: some texts for example want to
    > >exhibit some diacritics added to a base letter with a distinct color,
    > >notably in linguistic texts related to grammar or orthography.
    > >
    > >So for example you could want to exhibit the difference between the two
    > >French words "désert" and "dessert" by coloring the accent of the first
    > >word or the second s of the second; or even more accurately between
    > >"bailler" (concéder un bail, des baux) and "bâiller" (ouvrir en grand)
    > >where the presence or absence of the circumflex on letter 'a' is
    > >necessary to reflect the difference of both meaning and pronunciation.
    >
    > The way to do this is to decompose bases and marks at the glyph level if
    > they are not already decomposed at the character level, and then
    > to apply a colour to the mark.

    You're saying exactly what I said, and I included an example; read further
    in my message...

    The only problem is to find a way not only to decompose characters (this is
    easy), but to avoid creating defective grapheme clusters, whilst also
    maintaining the graphical composition layout (i.e. glyph positioning).
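
    As a minimal sketch of why the decomposition step is the easy part,
    the following Python uses the standard unicodedata module to split
    "désert" into bases and marks. The helper names split_base_and_marks
    and is_defective are hypothetical, introduced only for illustration.
    It shows that a run holding only the accent (as a naive "colour the
    mark" markup would produce) starts with a combining mark and so has
    no base of its own:

```python
import unicodedata

def split_base_and_marks(text):
    """Decompose text (NFD) and group each base with its combining marks."""
    clusters = []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) and clusters:
            # Non-zero combining class: attach the mark to the previous base.
            base, marks = clusters[-1]
            clusters[-1] = (base, marks + ch)
        else:
            clusters.append((ch, ""))
    return clusters

def is_defective(run):
    """True if a text run begins with a combining mark, i.e. has no base."""
    return bool(run) and unicodedata.combining(run[0]) != 0

# "d\u00e9sert" is "désert" with a precomposed e-acute.
pairs = split_base_and_marks("d\u00e9sert")
base, marks = pairs[1]            # the "e" and its acute accent

print(is_defective(base + marks)) # base and mark kept together in one run
print(is_defective(marks))        # the accent isolated in its own styled run
```

    Nothing here says anything about glyph positioning; that remains the
    job of the font and the rendering engine.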

    Suppose that you wanted to color the middle vowel of a Hangul syllable
    cluster: you would be in big trouble, as decomposing syllables and
    coloring jamos independently would create separate strings that would
    not be easy to position relative to each other, since the layout of the
    middle vowel jamo depends on its surrounding consonant jamos...
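
    The Hangul case can be sketched the same way, again with Python's
    standard unicodedata module. The syllable U+D55C is just an example
    chosen for illustration. Decomposed as a whole, the three jamos
    recompose cleanly; once each jamo sits in its own styled run and is
    processed separately, nothing puts the syllable back together:

```python
import unicodedata

syllable = "\ud55c"   # one precomposed Hangul syllable (example choice)

# Canonical decomposition yields leading consonant, vowel, trailing consonant.
jamos = unicodedata.normalize("NFD", syllable)

# The middle vowel is the jamo whose character name contains "JUNGSEONG".
vowel = [j for j in jamos if "JUNGSEONG" in unicodedata.name(j)]

# Normalizing the whole decomposed string restores the single syllable...
recomposed_whole = unicodedata.normalize("NFC", jamos)

# ...but normalizing each jamo as a separate run leaves three isolated jamos.
runs = list(jamos)
recomposed_per_run = [unicodedata.normalize("NFC", r) for r in runs]

print([f"U+{ord(j):04X}" for j in jamos])
print(recomposed_whole == syllable)
print(len(recomposed_per_run))
```

    This only covers the character level; how the vowel is shaped between
    its neighbouring consonants is decided by the renderer, which is where
    the separate runs cause trouble.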

    Without specific support in the (non-Unicode) style engine, solutions
    based on decomposition of characters may become tricky...

    Just one example: suppose that you want to color the circumflex above
    a lowercase i or above an uppercase A. The base letters have distinct
    widths (meaning that the diacritic has a different horizontal position)
    and distinct heights (meaning that the diacritic has a different vertical
    position), and the diacritic has a distinct contextual effect on its base
    letter (the base lowercase i should have its dot removed)...

    So decomposition is not the way to go: characters still need to be
    represented in an unbroken grapheme cluster, and the markup should
    be able to specify style on part of the grapheme cluster... Again,
    this is not a problem of Unicode itself.






    This archive was generated by hypermail 2.1.5 : Sun Dec 07 2003 - 06:41:52 EST