Re: Latin encoding model

From: António MARTINS-Tuválkin (antonio@tuvalkin.web.pt)
Date: Mon Oct 27 2008 - 05:28:19 CST

    On 2008.10.27, 06:10, Karl Pentzlin <karl-pentzlin@acssoft.de> wrote:

    > also for letters with any kind of "fixed" appendages which are not
    > attached simply at the bottom of a letter (like ogonek or cedilla).

    Even in cases of oddly positioned "appendages", like U+0104 LATIN CAPITAL
    LETTER A WITH OGONEK, there is canonical equivalence (U+0041 U+0328); it
    was only recently that connected and overstruck diacriticals were encoded
    as separate characters without canonical equivalence.
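
    To make the U+0104 case concrete, here is a small sketch in Python using
    the standard unicodedata module. The helper name canonical_parts is made
    up for this illustration, and the results depend on the Unicode data
    shipped with the interpreter.

```python
import unicodedata

def canonical_parts(ch):
    """Return the code points of ch's canonical decomposition mapping, or []."""
    mapping = unicodedata.decomposition(ch)
    # Compatibility mappings carry a tag such as "<compat>"; those are not canonical.
    if not mapping or mapping.startswith("<"):
        return []
    return ["U+" + cp for cp in mapping.split()]

a_ogonek = "\u0104"  # LATIN CAPITAL LETTER A WITH OGONEK

parts = canonical_parts(a_ogonek)
decomposed = unicodedata.normalize("NFD", a_ogonek)        # precomposed -> base + mark
recomposed = unicodedata.normalize("NFC", "\u0041\u0328")  # base + mark -> precomposed

print(parts)
print(decomposed == "\u0041\u0328", recomposed == a_ogonek)
```

    Both normalization directions agree here, which is what canonical
    equivalence means in practice for this letter.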

    This is clear in the more recently added Cyrillic characters, to the
    point that U+00E7 LATIN SMALL LETTER C WITH CEDILLA is canonically
    equivalent to U+0063 U+0327 and yet U+04AB CYRILLIC SMALL LETTER ES WITH
    DESCENDER has no such decomposition, although in the real world these two
    were the very same lead type, taken from French sorts (later Linotype
    films).
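
    The Latin/Cyrillic asymmetry can be checked the same way. This is only a
    sketch: nfc_equal is a hypothetical helper, and pairing U+0441 with
    U+0327 is my own guess at what a decomposed spelling would have looked
    like, not anything the standard defines.

```python
import unicodedata

def nfc_equal(a, b):
    """True if the two strings are canonically equivalent (compared after NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

c_cedilla = "\u00e7"          # LATIN SMALL LETTER C WITH CEDILLA
latin_pair = "\u0063\u0327"   # c + COMBINING CEDILLA

es_descender = "\u04ab"        # CYRILLIC SMALL LETTER ES WITH DESCENDER
cyrillic_pair = "\u0441\u0327" # es + COMBINING CEDILLA (assumed look-alike spelling)

latin_same = nfc_equal(c_cedilla, latin_pair)
cyrillic_same = nfc_equal(es_descender, cyrillic_pair)

print("Latin pair equivalent:   ", latin_same)
print("Cyrillic pair equivalent:", cyrillic_same)
print("U+04AB mapping:", repr(unicodedata.decomposition(es_descender)))
```

    The Latin pair compares equal after normalization, while the Cyrillic
    letter stays distinct from any base + mark spelling, since it has no
    decomposition mapping at all.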

    > This is something like the Arabic encoding model, where a model based
    > on ghost characters + combining marks could have been selected but in
    > fact was not.

    This is absolutely not the case. Nobody is saying that "Q" should be made
    canonically equivalent to U+004F U+0330 or some such.

    What is being argued is that, for some reason, new Latin letter +
    diacritical pairs were not accepted as new characters for a long time
    (and rightly so), but recently an exception was made for connecting and
    overstruck diacriticals.
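
    A quick survey shows how the split falls in the character data. The
    sample letters are my own pick of well-known Latin letters with attached
    or overlaid marks, not the ones from Karl's proposal, and
    has_canonical_decomposition is a hypothetical helper name.

```python
import unicodedata

SAMPLES = [
    "\u0105",  # a with ogonek (mark attached below)
    "\u00e7",  # c with cedilla (mark attached below)
    "\u00f8",  # o with stroke (overlaid mark)
    "\u0142",  # l with stroke (overlaid mark)
    "\u0111",  # d with stroke (overlaid mark)
    "\u0253",  # b with hook (connected mark)
]

def has_canonical_decomposition(ch):
    """True if ch has a decomposition mapping without a compatibility tag."""
    mapping = unicodedata.decomposition(ch)
    return bool(mapping) and not mapping.startswith("<")

report = {unicodedata.name(ch): has_canonical_decomposition(ch) for ch in SAMPLES}

for name, decomposes in report.items():
    print(f"{name:40} {'decomposes' if decomposes else 'atomic'}")
```

    In this sample the ogonek and cedilla letters decompose, while the
    stroke and hook letters are atomic.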

    This seems to be a concession to some technological problem, not (as it
    should be) an actual improvement in encoding philosophy.

    In view of this, Karl's proposal should be accepted, of course, but the
    lack of canonical equivalence for all these characters, and the asymmetry
    with older cases (like the mentioned U+0104), still bugs me.

    -- 
    António MARTINS-Tuválkin                                            ____.
    <antonio@tuvalkin.web.pt>                                          |  ()|
                                             Não me invejo de quem tem |####|
    PT-1500-111 LISBOA                       carros, parelhas e montes      |
    +351 934 821 700, +351 217 150 939       só me invejo de quem bebe      |
    ICQ:193279138  http://tuvalkin.web.pt/   a água em todas as fontes      |
    

