From: Doug Ewell (doug@ewellic.org)
Date: Sat Jan 03 2009 - 16:52:12 CST
John Hudson <john at tiro dot ca> wrote:
> I don't think PUA characters should be used to encode emoji any more
> than I think standardised Unicode characters should be used to encode
> emoji. It seems to me that we're looking at encoding as textual
> characters things that in important respects do not behave like
> textual characters only because someone else has treated them as
> textual characters (for the purpose of telecom transmission).
I agree 100% with what John says, notwithstanding my earlier post that
the use of PUA characters is not evil in the abstract. In fact,
supporters of emoji have even stated that the reason UTC is obliged to
encode them is that their hands are tied by the Japanese cell phone
vendors already having done so.
> Since, other than transmission, emoji do not behave like other text --
> they are not supported by normal text layout and font interaction, but
> as inline graphics --, it seems to me that what we're looking at is
> not character encoding as we typically understand it but transmission
> code standardisation. What the telecom companies need is a reliable
> way for one device to tell another device that emoji graphic X should
> be displayed; i.e. they need to send some kind of identifier from one
> device to another.
In fact, there are already technologies for representing non-text
objects in a plain text stream. One of them is called SGML, and it
served as the foundation for others called HTML and XML. There are
rumors that some people have learned to use these formats in certain
applications.
In all seriousness, John's suggestion of "some kind of identifier" is
a better solution not only for Unicode, but for the cell phone vendors
as well. They could define a new 0x1B escape code in the ETSI character
set (see http://www.unicode.org/Public/MAPPINGS/ETSI/GSM0338.TXT for
details) to signify that an emoji index (possibly numeric, possibly
symbolic) follows.
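To make the idea concrete, here is a minimal sketch of what such an escape sequence might look like, working on unpacked GSM 03.38 septets (one per byte). Only the 0x1B escape itself comes from the ETSI character set; the introducer value 0x45, the two-septet (14-bit) numeric index, and the function names are assumptions invented for illustration, not anything a vendor or standard defines.

```python
ESC = 0x1B               # GSM 03.38 escape septet
EMOJI_INTRODUCER = 0x45  # hypothetical value, picked only for illustration


def encode_emoji(index: int) -> bytes:
    """Encode an emoji index as ESC, introducer, and two 7-bit septets."""
    if not 0 <= index < (1 << 14):
        raise ValueError("index must fit in 14 bits")
    return bytes([ESC, EMOJI_INTRODUCER, index >> 7, index & 0x7F])


def decode_stream(septets: bytes) -> list:
    """Split unpacked septets into ('text', bytes) and ('emoji', index) items."""
    items = []
    text = bytearray()
    i = 0
    while i < len(septets):
        # An emoji reference needs ESC, the introducer, and two index septets.
        if (septets[i] == ESC and i + 3 < len(septets)
                and septets[i + 1] == EMOJI_INTRODUCER):
            if text:
                items.append(("text", bytes(text)))
                text = bytearray()
            items.append(("emoji", (septets[i + 2] << 7) | septets[i + 3]))
            i += 4
        else:
            # Anything else, including existing ESC extensions, passes through.
            text.append(septets[i])
            i += 1
    if text:
        items.append(("text", bytes(text)))
    return items


# Basic Latin letters have the same code values in GSM 03.38 as in ASCII,
# so byte literals are used here for the text portions.
message = b"Hi " + encode_emoji(300) + b"!"
print(decode_stream(message))
```

A receiving handset would look the decoded index up in whatever emoji registry the vendors maintain; a symbolic index would need a terminator or length septet instead of the fixed two-septet field assumed here.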
With an open-ended mechanism like this, they could expand their emoji
repertoires easily and almost limitlessly. By registering new emoji
virtually on demand, perhaps even with direct customer involvement, they
would be in a perfect position to satisfy the major stated needs of
their customer base.
> They have been using character codes because it seems convenient, but
> that doesn't imply that this is the only or best method, and it
> certainly doesn't imply that everything that gets transmitted as text
> is text or is suitable content for a text encoding standard. I might
> as easily use a character code as a trigger to play a sound file as to
> display an inline graphic; that doesn't make the sound file a
> character.
Perfectly correct, and perfectly stated.
--
Doug Ewell * Thornton, Colorado, USA * RFC 4645 * UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages