Emoji (was: Re: Unicode block for programming related symbols and codepoints?)

Doug Ewell doug at ewellic.org
Tue Feb 10 11:00:17 CST 2015

Shervin Afshar <shervinafshar at gmail dot com> wrote:

>>> The issue is with your very rigid interpretation of the criteria for
>>> encoding new symbols. Is "appearing in an industry character set
>>> extension" an official phrasing that you keep referring to?
>> It was either from the WG2 Principles and Procedures document, or
>> some other bit of Unicode/10646 folklore that I've read over the past
>> 22 years of keeping up with Unicode/10646. I should look up the exact
>> wording.
> Yes, please. I would like to have that policy noted for my future use.

I hadn't said, of course, that no new symbols could ever be encoded
unless they appeared in an industry character set or extension.

I was responding to a point that Frédéric Grosshans made [1] about
these symbols being added for compatibility with Japanese telco usage.
That argument could be used for the original emoji set, but not for new
emoji; those are supposed to follow the regular criteria.

[1] http://unicode.org/pipermail/unicode/2015-February/001246.html

Here is a passage from TUS 7.0, Section 2.3 ("Compatibility Characters"), that may shed light:

"Conceptually, compatibility characters are characters that would not
have been encoded in the Unicode Standard except for compatibility and
round-trip convertibility with other standards. Such standards include
international, national, and vendor character encoding standards. For
the most part, these are widely used standards that pre-dated Unicode,
but because continued interoperability with new standards and data
sources is one of the primary design goals of the Unicode Standard,
additional compatibility characters are added as the situation warrants.

"Compatibility characters can be contrasted with ordinary (or
non-compatibility) characters in the standard—ones that are generally
consistent with the Unicode text model and which would have been
accepted for encoding to represent various scripts and sets of symbols,
regardless of whether those characters also existed in other character
encoding standards."

> It's not about encoding what "they" please. Compatibility was the
> issue with the first set of emoji symbols. The rest of symbols are
> being added for various other reasons; e.g. diversity, parity,
> requests, etc.

Right. So the "compatibility with Japanese telcos" argument cannot be
used here.

> Also, random JPEG and meme don't apply here and you're mistaken to
> assume that GChat and Facebook fit in this category.

If you look at the set of new emoji proposed in L2/15-054 [2], you'll
see that quite a few of them are justified by their current popularity
on the Web. ("Selfie are very popular" was kind of striking. I guess at
least one of my predictions was right.)

[2] http://www.unicode.org/L2/L2015/15054r-emoji-tranche5.pdf

>> Great. Go ahead and encode them, UTC. But don't say it's because your
>> hands are tied and you have no choice.
> Quoting an official UTC communication?

Quoting an off-list remark.

> For a longer while now, some folks tend to use emoji as means to an
> end other than what is in the scope of conversation regarding emoji.
> And that is not acceptable.

Sorry, I don't understand this.

Doug Ewell | Thornton, CO, USA | http://ewellic.org
