Re: Emoji: emoticons vs. literacy

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Sun Jan 04 2009 - 19:03:22 CST

    On 1/3/2009 10:45 PM, Doug Ewell wrote:
    > Asmus Freytag <asmusf at ix dot netcom dot com> wrote:
    >
    >>> Seems to me that "compatibility characters" means whatever you want
    >>> it to mean at a given moment.
    >>
    >> I simply follow the definition. See, for example the glossary:
    >>
    >> "/Compatibility Character. /
    >> A character that would not have been encoded except for compatibility
    >> and round-trip convertibility with other standards"
    >
    > This definition also appears in Section 2.3 (p. 23) of TUS 5.0, but
    > the *very next sentence* says:
    >
    > "They are variants of characters that already have encodings as normal
    > (that is, non-compatibility) characters in the Unicode Standard; as
    > such, they are more properly referred to as compatibility variants."
    >
    > Now what?
    It's an attempt to separate the two facets of compatibility. One is
    that interoperability needs were the primary reason for encoding the
    character. The other is that the character has a compatibility
    decomposition. The latter are the ones that could be called
    "compatibility variants", because they can be considered variants of
    an existing (ordinary) character.

    (In discussions like this, I personally prefer the term "ordinary" in
    place of the more cumbersome circumlocution "normal (that is,
    non-compatibility)".)

    It should be immediately obvious that not all characters needed for
    interoperability (compatibility) can be guaranteed to have an ordinary
    character counterpart. Therefore, some characters that look like
    ordinary characters (because they don't have a compatibility
    decomposition) are in fact encoded for compatibility.
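
    The second facet (having a compatibility decomposition) can be tested
    mechanically; the first cannot, since it depends on encoding history.
    A minimal sketch in Python, assuming the standard-library unicodedata
    module; the helper name has_compat_decomposition is hypothetical and
    chosen only for illustration:

```python
import unicodedata


def has_compat_decomposition(ch: str) -> bool:
    """Return True if ch has a *compatibility* decomposition mapping.

    unicodedata.decomposition() returns an empty string when there is no
    mapping, a bare list of code points for a canonical mapping, and a
    string starting with a tag such as "<compat>" or "<font>" for a
    compatibility mapping.
    """
    return unicodedata.decomposition(ch).startswith("<")


if __name__ == "__main__":
    samples = ["A", "\u00e9", "\ufb01", "\U0001d400"]
    for ch in samples:
        print(f"U+{ord(ch):04X}", has_compat_decomposition(ch))
```

    A False result says only that the character has no compatibility
    decomposition. It does not show that the character is "ordinary" in
    the first sense, which is exactly the point made above.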

    To make matters slightly more complicated, a huge number of characters
    that have compatibility decompositions represent both semantic and
    glyphic variation from a corresponding ordinary character. In
    notational systems where the glyphic variation applies to single
    characters, these act like ordinary characters. In texts that do not
    use such notational systems, where these variations apply to entire
    text runs, they should not be used, but can be replaced by the
    ordinary character plus style markup in rich text. (Examples include
    the phonetic characters, math alphanumerics, etc.)
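
    As a rough illustration of replacing such variants with the ordinary
    character, compatibility normalization (NFKC) applies the
    compatibility decompositions and discards the glyphic distinction. A
    sketch in Python, assuming the standard-library unicodedata module;
    the function name to_ordinary is hypothetical, and any style markup
    would have to be added separately by the rich-text layer:

```python
import unicodedata


def to_ordinary(text: str) -> str:
    """Fold compatibility variants to their ordinary counterparts (NFKC).

    The styling information carried by the variant is lost in the
    process; this sketch does not attempt to re-express it as markup.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    # MATHEMATICAL BOLD CAPITAL A and B fold to plain "AB".
    print(to_ordinary("\U0001d400\U0001d401"))
    # MODIFIER LETTER SMALL H (a phonetic character) folds to "h".
    print(to_ordinary("\u02b0"))
```

    In a notational system where the variation is meaningful per
    character, this folding would destroy information, so it only suits
    the text-run case described above.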

    The set of emoji (and also emoticons) is composed of many ordinary
    characters (straightforward symbols), plus compatibility characters that
    do not have a decomposition.

    Finally, I don't always distinguish between characters that should be
    "candidates for encoding as compatibility characters" and "compatibility
    characters" (which implies that they are actually encoded already).
    That's because it's usually clear from context whether characters are
    being proposed or are already encoded. This is just a matter of avoiding
    cumbersome expressions that are redundant from context - it does *not*
    imply that I consider these "as good as encoded". Just thought I might
    clarify that, while we are at it.
    >
    > Later:
    >
    >> What is it with you people? Everything apparently must be black or
    >> white. Character coding is an exercise in dealing with shades of gray
    >> and edge cases.
    >
    > At least now when I see a black-and-white statement such as "Unicode
    > does not encode idiosyncratic, personal, novel, or private-use
    > characters, nor does it encode logos or graphics," I know how to
    > interpret it.
    Yes, "graphics" is not a very well-defined term ;-) And "novel" would
    have encompassed the Euro sign before 2002, yet it was coded well in
    advance of the actual introduction of that currency.
    >
    > I've been a huge and vocal supporter of the Unicode Standard for the
    > past 16 years, back before most people had heard of it, and this is by
    > far the most disappointed I have ever been in the Standard. This
    > decision will come back to haunt Unicode again and again.
    First, there hasn't been a decision. Certainly not a final one. So it's
    a bit premature to express things this way.

    Second, if you've been around that long, you might have heard about
    similar discussions where people were predicting bad outcomes from
    certain decisions. Surprisingly enough, things didn't always turn out as
    badly as predicted. Some issues, after being hotly contested and taking
    truly enormous bandwidth in the committee and on the lists, have sunk
    out of sight without a trace the minute they were decided (and seem to
    have had no observable impact on the standard). Astonishing, but true.

    Third, I really hope that no single issue can affect your support for
    the standard, if it's sustained you for 16 years so far.

    A.


