From: Michael D'Errico (mike-list@pobox.com)
Date: Thu Jan 08 2009 - 12:47:21 CST
>> The short answer is that *everyone* benefits from having
>> a standard that promotes interoperability of text interchange
>> globally without data corruption.
>
> ... is this about _text_ interchange, and specially plain text?
The limitation of Unicode to plain text is actually just a policy.
The emoji may not be text, but they do communicate an idea. Unicode
should be about enabling communication, not just that communication
which happens to use fonts. (Note I'm not saying Unicode should be
used for all forms of communication, but text-ness should not be an
absolute requirement.)

Unicode is two orthogonal things: a technology for representing a
series of numbers (all the UTFs), and a partial mapping of numbers
to plain-text characters. The fact that all assigned numbers so far
are plain-text characters is the result of the original design goal
for Unicode, which as you know was to create a universal character
set to subsume all others.
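
To make the split concrete, here is a rough Python sketch (the helper
names encodings and assigned_name are mine, purely for illustration).
The UTFs will serialize any code point, while the character mapping
only has an entry for some of them:

```python
import unicodedata
from typing import Dict, Optional

def encodings(code_point: int) -> Dict[str, bytes]:
    """Serialize one code point in each UTF; no meaning is consulted."""
    ch = chr(code_point)
    return {name: ch.encode(name) for name in ("utf-8", "utf-16-be", "utf-32-be")}

def assigned_name(code_point: int) -> Optional[str]:
    """Return the character name from the mapping, or None if there is none."""
    return unicodedata.name(chr(code_point), None)

# U+0041 is an assigned plain-text character; U+E000 is private use.
for cp in (0x0041, 0xE000):
    print(f"U+{cp:04X}", encodings(cp), assigned_name(cp))
```

Both code points encode without complaint; only the first has a name
in the mapping.
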
The private use areas in Unicode present a problem in that there is
absolutely no limitation placed on what those numbers can represent
in an application. If, as in the case of the emoji, these numbers
(code points) start leaking out of an application, the UTC is faced
with either saying "not my problem" or encoding them. The
long-standing policy of encoding only plain-text characters is at
odds with the fact that the PUA does not need to be used strictly
for plain text. It has been entertaining to see the calisthenics
required to justify the emoji as plain text.
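
As an illustration of what "leaking" looks like on the receiving end,
here is a small Python sketch that flags private use code points in a
string. The function names and the sample string are made up for the
example; the three ranges are the private use ranges defined by the
standard. Note that the code can say a code point is private, but has
no way of saying what the sender meant by it:

```python
# The three private use ranges: the BMP area plus planes 15 and 16.
PUA_RANGES = ((0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD))

def is_private_use(ch: str) -> bool:
    """True if the single character ch is a private use code point."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)

def find_pua(text: str) -> list:
    """Return (index, 'U+XXXX') for every private use code point in text."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if is_private_use(ch)]

# Hypothetical message carrying one application-specific code point.
sample = "mail \ue63e from a phone"
print(find_pua(sample))
```
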
This is a general problem that needs a solution. As Unicode gains
acceptance and people start realizing all the neat things they can
do with the PUA, the UTC will find itself turning many away simply
because they used the PUA in a non-text way. "Use XML" is not the
standard response I hope to see. I'd prefer that the UTC provide
guidance on how to use the PUA in such a way that facilitates the
move from PUA to Unicode proper. I've outlined one possible way to
do it, but would love to see any other ideas.

Mike