Re: Unicode's Purpose/Goals [was: Re: Tamil 0B83: Tamil Aytham and Devanagari VisargaL]

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Fri Apr 01 2005 - 03:17:04 CST


    Unicode's goal is to be the foundation of modern software i18n and related
    protocols.

    The more this goal is being realized, the more certain aspects of Unicode
    are being de facto (and lately also officially) constrained in terms of
    stability.

    Character names were one of the first items to be officially subject to a
    stability policy (together with code locations), even though the formal
    description of that policy took some time to make it to the website.

    The reason is twofold: other standards refer to characters by name (and
    implementations refer to characters by code), and both names and codes
    are *arbitrary*. They were well chosen to begin with, but tastes could
    change, and in light of the many desires to 'improve' them there is no
    reason other than stability to preserve what we have, since there's no
    unique 'best'.

    In other words, we could have decided (in the early days) to make names
    completely non-binding. If we had done that, we would be inundated with
    requests to 'improve' the naming of characters. If we had allowed such
    improvements to go forward, we would now have a mess, in that no one
    would be certain what the name of a character is. In the early days, the
    name of the AE ligature was changed from LIGATURE to LETTER.

    It was a seemingly innocuous change, but it invalidated a lot of places
    in the text of the standard. Therefore, the early UTC said "enough" - even
    before stability became required for other reasons.

    Currently, we use aliases and other annotations to point out names that
    really are insufficient (such as the one for U+2118), and give alternate
    names for many other characters.
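
    As a rough illustration of the U+2118 case, here is a minimal sketch using
    Python's standard unicodedata module. It assumes Python 3.3 or later, whose
    bundled character database includes the formal name alias
    WEIERSTRASS ELLIPTIC FUNCTION for this character; that formal alias was
    published after this message was written, and the variable names are only
    for illustration.

```python
import unicodedata

ch = "\u2118"

# The formal character name is frozen by the stability policy,
# even though it describes the character poorly.
formal_name = unicodedata.name(ch)

# Assumption: the interpreter's Unicode database carries the formal
# alias for U+2118 (Python 3.3+). Lookup by alias then resolves to the
# same code point as lookup by the formal name.
via_alias = unicodedata.lookup("WEIERSTRASS ELLIPTIC FUNCTION")
via_name = unicodedata.lookup("SCRIPT CAPITAL P")

print(formal_name, hex(ord(via_alias)), via_alias == via_name)
```

    The point of the sketch is that the better name is added alongside the
    original one; the original name keeps working for anything that refers
    to it.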

    The (eventual) way forward might be a separate effort to define a set of
    'user interface' strings for character names - those may be regional, for
    example 'slash' instead of 'solidus' for the US version. However, this is
    a lot of work and other issues are more pressing right now. But at some
    time in the future, this could be an issue worth addressing.
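
    To make the idea concrete, here is a minimal sketch of such a layer in
    Python. Nothing like this is defined by the standard: the UI_NAMES table,
    the locale tags, and the display_name function are all hypothetical and
    exist only to show regional strings sitting on top of the stable formal
    names.

```python
import unicodedata

# Hypothetical table of regional 'user interface' strings, keyed by
# locale tag and then by code point. Not part of any standard.
UI_NAMES = {
    "en-US": {0x002F: "slash"},
}

def display_name(ch, locale):
    """Return a regional UI string if one exists, else the formal name."""
    formal = unicodedata.name(ch)  # stable formal name, e.g. SOLIDUS
    return UI_NAMES.get(locale, {}).get(ord(ch), formal)

us_name = display_name("/", "en-US")       # regional override applies
fallback_name = display_name("/", "en-GB")  # no entry, formal name is used
print(us_name, fallback_name)
```

    The formal name is never modified here; the regional string is purely a
    presentation choice, so anything that refers to SOLIDUS stays valid.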

    In the meantime, I'd like to add on a personal note that I find the kind
    of complaints that take snippets from the FAQ and the 5-line Unicode
    overview and try to fabricate contradictions silly and ridiculous.

    If there is a part of our website that needs improvement, we always
    appreciate feedback (via the feedback form), especially if it's
    accompanied by a careful suggestion of possible replacement text. That's
    constructive criticism.

    A./


