RE: In defense of Plane 14 language tags (long)

From: Marco Cimarosti
Date: Tue Nov 05 2002 - 08:27:33 EST

    John Cowan wrote:
    > Marco Cimarosti scripsit:
    > > { As a side note, the idea that a language may use "foreign" words seems
    > > terribly naive to me. It is true that, in Italian, we use loanwords such as
    > > "hardware", "punk", or "footing", but it would be silly to consider or tag
    > > them as "English words". They are genuinely Italian words, [...]
    > In English, however, the distinction between borrowings and truly
    > foreign words does make sense. Such a word as
    > Weltanschauung, for example, [...]

    It seems that *I* was terribly naive... This distinction does indeed make
    sense for Italian too: when we occasionally use English phrases like "last
    but not least" or "the American dream", we do approximate the English
    pronunciation as well as we can.

    > Even in Italian, what about Latin terms embedded in classic poetry?
    > Are you going to say that those too are Italian, just with a slightly
    > peculiar morphology?

    This is a bad example: in poetry and in prose, Latin terms and names are
    normally heavily adapted orthographically and morphologically. E.g., the
    Italian for "amphorae" is "anfore"; the Italian for "Julius Caesar" is
    "Giulio Cesare", etc. Even terms not adapted orthographically, like "Regina
    Coeli", are heavily adapted phonetically (/re'dʒina 'tʃeli/).

    > Plain-text tags don't nest, however: you need to give a tag explicitly
    > naming the outer language when you return to it.

    I was talking about nesting plain-text language tags into *rich*-text
    language tags.
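
    To make the "no nesting" point concrete, here is a minimal sketch in Python.
    It assumes only the published code points: U+E0001 LANGUAGE TAG, followed
    by tag characters U+E0020..U+E007E that mirror printable ASCII. The helper
    names plane14_tag and tag_runs are hypothetical, chosen for illustration.
    Note that the outer language ("it") has to be tagged again after the
    embedded English phrase, because nothing in the plain-text scheme pops
    back to it automatically.

```python
LANGUAGE_TAG = "\U000E0001"  # U+E0001 LANGUAGE TAG
TAG_OFFSET = 0xE0000         # tag characters U+E0020..U+E007E mirror ASCII


def plane14_tag(lang: str) -> str:
    """Return the Plane 14 tag sequence for an ASCII language code."""
    if not all(0x20 <= ord(c) <= 0x7E for c in lang):
        raise ValueError("language code must be printable ASCII")
    return LANGUAGE_TAG + "".join(chr(TAG_OFFSET + ord(c)) for c in lang)


def tag_runs(runs):
    """Concatenate (lang, text) runs, tagging every run explicitly.

    There is no "end of inner language" marker that restores the outer
    language, so each run carries its own tag, including the run that
    returns to the outer language.
    """
    return "".join(plane14_tag(lang) + text for lang, text in runs)


tagged = tag_runs([
    ("it", "Questo è "),
    ("en", "the American dream"),
    ("it", ", dicono."),  # outer language must be named again
])
```

    In a rich-text container, by contrast, the inner run would simply close and
    the enclosing language would apply again without being restated.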

    > > If they are rendered as invisible glyphs, they make the text more difficult
    > > to edit and to move the cursor within, because the user will have no way of
    > > understanding why the cursor stops twice in apparently random positions.
    > > This also exposes the information contained in language tags to be
    > > unwillingly corrupted by subsequent editing.
    > This argument proves too much: it applies with equal force to the
    > invisible bidi controls and the other Unicode controls. In practice
    > these things are not available for plaintext-style editing except in a
    > "reveal controls" mode, which could equally well reveal the tags using
    > some stylized glyphs.

    This doesn't seem to be a valid argument. Perhaps those other things deserve
    to be deprecated as well. Or perhaps they are so important that they are
    worth the trouble.
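
    For reference, the "reveal controls" mode mentioned in the quoted text
    could look roughly like the following Python sketch. It is only an
    illustration under stated assumptions: the function name reveal_controls
    and the bracketed display forms are invented here, and only a handful of
    bidi controls (U+200E, U+200F, U+202A..U+202E) are covered.

```python
# Display names for a few invisible bidi controls (not an exhaustive list).
BIDI_CONTROLS = {
    "\u200E": "LRM", "\u200F": "RLM",
    "\u202A": "LRE", "\u202B": "RLE", "\u202C": "PDF",
    "\u202D": "LRO", "\u202E": "RLO",
}


def reveal_controls(text: str) -> str:
    """Replace invisible controls and Plane 14 tags with visible markers."""
    out = []
    in_tag = False  # True while inside a LANGUAGE TAG sequence
    for ch in text:
        cp = ord(ch)
        if cp == 0xE0001:                          # LANGUAGE TAG starts a tag
            if in_tag:
                out.append("]")
            out.append("[lang:")
            in_tag = True
        elif in_tag and 0xE0020 <= cp <= 0xE007E:  # tag character -> ASCII
            out.append(chr(cp - 0xE0000))
        else:
            if in_tag:                             # tag sequence has ended
                out.append("]")
                in_tag = False
            if cp == 0xE007F:                      # CANCEL TAG
                out.append("[cancel]")
            elif ch in BIDI_CONTROLS:
                out.append("[" + BIDI_CONTROLS[ch] + "]")
            else:
                out.append(ch)
    if in_tag:
        out.append("]")
    return "".join(out)
```

    For example, a string consisting of the tag for "en", the word "hello",
    a LEFT-TO-RIGHT MARK and "!" would be shown as "[lang:en]hello[LRM]!".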

    Talking specifically about the bidi controls, there are a few intrinsic
    differences from Plane 14 language tags:

    - The meaning of bidi controls is clearly defined in the Unicode standard,
    and this meaning is mandatory unless a higher-level protocol is in effect.
    The exact interpretation of Plane 14 language tags is entirely left to the
    application, so they are just a standard mechanism to define higher-level
    protocols: they have no meaning without a higher-level protocol.

    - The bidi controls exist to address (rare, but existing) problems which
    would affect the basic readability of the text: specifically, they address
    cases in which the apparent reading order would be different from the actual
    logical order. Plane 14 language tags mainly address an esthetic/political

    _ Marco
