Re: Medievalist ligature character in the PUA

From: verdy_p (verdy_p@wanadoo.fr)
Date: Mon Dec 14 2009 - 20:17:52 CST

    "John H. Jenkins"
    > The Latin ligatures that are already there are for round-trip compatibility *only*.

    NOT *only*. Some ligatures were encoded because they are considered unbreakable letters in some languages, or
    unbreakable symbols, in which case they are treated as distinct characters.

    See æ (from an old ligature of "ae"), Æ (from an old ligature of "AE"), œ (from an old ligature of "oe"), Œ (from an
    old ligature of "OE"), & (from an old ligature of "et"), ß (from an old ligature of "ſs" or "ſz").

    You cannot convert these to letter pairs, not even with a ZWJ between the letters.
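
    As a rough illustration (a minimal sketch using Python's standard unicodedata module; the helper name
    compat_decomposition is made up for this example), compatibility normalization leaves these characters alone,
    while a round-trip compatibility ligature such as U+FB01 folds back to its letter pair:

```python
import unicodedata


def compat_decomposition(ch):
    """Return the NFKD form of ch (hypothetical helper, for illustration only)."""
    return unicodedata.normalize("NFKD", ch)


# Letters and symbols encoded as distinct characters: ae, AE, oe, OE, ampersand, sharp s.
DISTINCT = "\u00E6\u00C6\u0153\u0152&\u00DF"

for ch in DISTINCT:
    # No decomposition mapping exists, so the character comes back unchanged.
    print(f"U+{ord(ch):04X}", compat_decomposition(ch) == ch)

# U+FB01 LATIN SMALL LIGATURE FI is a compatibility character and does decompose.
print(compat_decomposition("\uFB01"))
```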

    Conversely, I'm not sure that "ij" and "IJ" are completely unbreakable: even modern Dutch today considers them to
    be breakable and representable as letter pairs (with the ZWJ ligature hint), given that it has become widespread
    to write Dutch words without them.
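
    For comparison, a small sketch (Python's standard unicodedata module; to_letter_pair is an illustrative name, not
    an established API): U+0133 and U+0132 do carry a compatibility mapping to the letter pair, whereas the pair
    written with a ZWJ hint stays a three-code-point sequence that normalization never turns back into the single
    character.

```python
import unicodedata

IJ_SMALL = "\u0133"    # LATIN SMALL LIGATURE IJ
IJ_CAPITAL = "\u0132"  # LATIN CAPITAL LIGATURE IJ
ZWJ = "\u200D"         # ZERO WIDTH JOINER


def to_letter_pair(text):
    """Fold compatibility characters with NFKC (illustrative helper, name is my own)."""
    return unicodedata.normalize("NFKC", text)


# The encoded ligatures fold to plain letter pairs.
print(to_letter_pair(IJ_SMALL), to_letter_pair(IJ_CAPITAL))

# The ZWJ-hinted pair is left as is: the joiner is kept and no ligature character appears.
hinted = "i" + ZWJ + "j"
print(len(hinted), to_letter_pair(hinted) == hinted)
```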

    In fact, you may also consider the German letters "ä", "ö", "ü" (with "Umlaut") as modern ligatures (of "ae",
    "oe" or "ue"): the German Umlaut does not really share its identity with the dieresis, which has a very different
    origin and meaning.
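
    Note that the encoding itself does not keep the two apart. A quick sketch (again Python's standard unicodedata
    module, with a hypothetical helper canonical_parts) shows that the canonical decomposition of these letters uses
    the generic COMBINING DIAERESIS, and never yields "ae", "oe" or "ue":

```python
import unicodedata


def canonical_parts(ch):
    """Names of the code points in the NFD form of ch (hypothetical helper)."""
    return [unicodedata.name(c) for c in unicodedata.normalize("NFD", ch)]


# a, o, u with the two dots above, whether read as Umlaut or as dieresis.
for ch in "\u00E4\u00F6\u00FC":
    print(f"U+{ord(ch):04X}", canonical_parts(ch))
```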

    But if you consider medieval texts, you will also have to consider the case of accents: in many cases, letters with
    accents were ligated forms originating from abbreviation conventions. The accents progressively evolved from
    abbreviation marks used where some letters could easily be omitted (not essential for reading) because they had
    become almost mute, had been slightly reduced in length, or had merged in the phonology with preceding letters
    whose own phonology had evolved (such as a modification of length, value or stress).

    Many of these accents were still kept for etymological reasons, until they evolved into distinct marks for the
    newer phonology. Their link to etymology then became less evident or simply wrong: it was no longer possible to
    decompose letters with accents into letter pairs, so the accents became distinctive in the alphabets in which they
    are now used.

    The same is true for almost all other Latin diacritics (including those attached below the letters, like the
    cedilla, or written in overlay, like the solidus). In medieval texts, many of them will be found in various places
    as abbreviation marks similar to the tilde (and in varying positions depending on authors or publishers: above the
    letter, across it, attached to the left, without precise rules), so it will be difficult to decide whether they
    are true ligatures or denote a new letter. For this reason, the medieval abbreviation marks and ligatures should
    be encoded specifically (and using ZWJ will not be a solution, as it is too weak an indicator, just a hint).


