Re: Default rendering of Combining diacritical marks

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon Mar 06 2006 - 10:37:31 CST

    From: "Andreas Prilop" <nhtcapri@rrzn-user.uni-hannover.de>
    > Instead of a combining dot
    > http://ppewww.ph.gla.ac.uk/~flavell/unicode/unidata03.html#x0323
    > you should use precomposed letters such as
    > http://ppewww.ph.gla.ac.uk/~flavell/unicode/unidata1E.html#x1E0C

    This is not possible for many letters with dot below, notably those used in African languages such as Amazigh (Berber). And Unicode will NOT encode new precomposed combinations.
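A minimal sketch of how one might check this, assuming Python's standard `unicodedata` module: a base letter followed by U+0323 COMBINING DOT BELOW collapses to a single code point under NFC only when a precomposed character exists. The helper name `has_precomposed_dot_below` is made up for this illustration.

```python
import unicodedata

COMBINING_DOT_BELOW = "\u0323"

def has_precomposed_dot_below(letter: str) -> bool:
    """Return True if letter + U+0323 composes to one code point under NFC."""
    decomposed = letter + COMBINING_DOT_BELOW
    composed = unicodedata.normalize("NFC", decomposed)
    # If no precomposed character exists, NFC leaves the two code points as-is.
    return len(composed) == 1

if __name__ == "__main__":
    for letter in "dq":
        print(letter, has_precomposed_dot_below(letter))
```

Here "d" has a precomposed form (U+1E0D) while "q" does not, so text using a letter like the latter has to rely on the combining mark.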

    So this is bad advice: you should really see all existing precomposed combinations as characters encoded for compatibility with past standards that included them in either composed or decomposed form, with the normalization forms providing mutual compatibility through Unicode. The Unicode model for diacritics is to use decomposed letters first, for all diacritics that are detached from the letter.
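To illustrate the role of the normalization forms, here is a small sketch (again assuming Python's `unicodedata`; the function name `canonically_equivalent` is chosen for this example only) that treats two strings as equivalent when their NFD forms match.

```python
import unicodedata

def canonically_equivalent(a: str, b: str) -> bool:
    """Compare two strings after canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

PRECOMPOSED = "\u1e0c"   # LATIN CAPITAL LETTER D WITH DOT BELOW
DECOMPOSED = "D\u0323"   # D followed by COMBINING DOT BELOW

if __name__ == "__main__":
    # The raw code point sequences differ, but normalization reconciles them.
    print(PRECOMPOSED == DECOMPOSED)
    print(canonically_equivalent(PRECOMPOSED, DECOMPOSED))
```

The two spellings compare unequal as raw strings but equal after normalization, which is the sense in which the precomposed character is a compatibility convenience rather than a distinct letter.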

    Unicode does, however, sometimes encode new precomposed letters with overlay diacritics. There are two reasons. First, the existing non-spacing overlay diacritics are poor (not recommended for literary use, only for special notations over symbols) and difficult to position on many letters, where they take other forms and sizes. Second, these non-spacing overlay diacritics exist in several sizes or orientations (such as overlay slashes or strokes), and there is no good rule to decide which one is correct over a given letter, so unification is impossible.

    What this means is that (for example) a precomposed L with slash is NOT unified with (and NOT canonically equivalent to) an L followed by a non-spacing small or big slash (although a collation table can still sort them together).
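This can be checked with the same kind of sketch. The choice of U+0141 for "L with slash" and of U+0337 and U+0338 for the "small or big slash" is an assumption about which characters are meant; the helper name `same_after_nfd` is hypothetical.

```python
import unicodedata

L_WITH_STROKE = "\u0141"        # LATIN CAPITAL LETTER L WITH STROKE
L_SHORT_SOLIDUS = "L\u0337"     # L + COMBINING SHORT SOLIDUS OVERLAY
L_LONG_SOLIDUS = "L\u0338"      # L + COMBINING LONG SOLIDUS OVERLAY

def same_after_nfd(a: str, b: str) -> bool:
    """Return True if the two strings have identical NFD forms."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

if __name__ == "__main__":
    # U+0141 has no canonical decomposition mapping in the character database.
    print(repr(unicodedata.decomposition(L_WITH_STROKE)))
    for candidate in (L_SHORT_SOLIDUS, L_LONG_SOLIDUS):
        print(same_after_nfd(L_WITH_STROKE, candidate))
```

Because U+0141 has no decomposition, neither normalization form maps it to or from a base letter plus overlay, so the sequences stay distinct unless a collation tailoring groups them.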



    This archive was generated by hypermail 2.1.5 : Mon Mar 06 2006 - 13:05:20 CST