Re: PUA convention ID tags

From: Doug Ewell (doug@ewellic.org)
Date: Sat Jan 03 2009 - 16:52:12 CST


    John Hudson <john at tiro dot ca> wrote:

    > I don't think PUA characters should be used to encode emoji any more
    > than I think standardised Unicode characters should be used to encode
    > emoji. It seems to me that we're looking at encoding as textual
    > characters things that in important respects do not behave like
    > textual characters only because someone else has treated them as
    > textual characters (for the purpose of telecom transmission).

    I agree 100% with what John says, notwithstanding my earlier post that
    the use of PUA characters is not evil in the abstract. In fact,
    supporters of emoji have even stated that the reason UTC is obliged to
    encode them is that their hands are tied by the Japanese cell phone
    vendors already having done so.

    > Since, other than transmission, emoji do not behave like other text --
    > they are not supported by normal text layout and font interaction, but
    > as inline graphics --, it seems to me that what we're looking at is
    > not character encoding as we typically understand it but transmission
    > code standardisation. What the telecom companies need is a reliable
    > way for one device to tell another device that emoji graphic X should
    > be displayed; i.e. they need to send some kind of identifier from one
    > device to another.

    In fact, there are already technologies for representing non-text
    objects in a plain text stream. One of them is called SGML, and it
    served as the foundation for others called HTML and XML. There are
    rumors that some people have learned to use these formats in certain
    applications.

    In all seriousness, John's suggestion of "some kind of identifier" is
    a better solution not only for Unicode but for the cell phone vendors
    as well. They could define a new 0x1B escape code in the ETSI character
    set (see http://www.unicode.org/Public/MAPPINGS/ETSI/GSM0338.TXT for
    details) to signify that an emoji index (possibly numeric, possibly
    symbolic) follows.
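
    As a rough illustration of the escape-code idea, here is a minimal
    Python sketch. It is not an existing ETSI mechanism. The introducer
    value 0x45 is hypothetical and has not been checked against the
    assigned extension codes, and the two-septet (14-bit) numeric index
    is an assumed layout chosen only to keep the example short. The
    names encode_emoji and decode are likewise invented for this sketch.

```python
ESC = 0x1B               # GSM 03.38 escape to the extension table
EMOJI_INTRODUCER = 0x45  # hypothetical: "an emoji index follows"
MAX_INDEX = (1 << 14) - 1  # assumed layout: index carried in two 7-bit septets


def encode_emoji(index):
    """Return the septet sequence ESC, introducer, high septet, low septet."""
    if not 0 <= index <= MAX_INDEX:
        raise ValueError("emoji index out of range for two septets")
    return [ESC, EMOJI_INTRODUCER, index >> 7, index & 0x7F]


def decode(septets):
    """Split unpacked septets into ("septet", value) and ("emoji", index) items.

    Any other septet, including other 0x1B escape sequences, is passed
    through unchanged for an ordinary GSM 03.38 decoder to handle.
    """
    items = []
    i = 0
    while i < len(septets):
        is_emoji = (
            septets[i] == ESC
            and i + 1 < len(septets)
            and septets[i + 1] == EMOJI_INTRODUCER
        )
        if is_emoji:
            if i + 3 >= len(septets):
                raise ValueError("truncated emoji sequence")
            index = (septets[i + 2] << 7) | septets[i + 3]
            items.append(("emoji", index))
            i += 4
        else:
            items.append(("septet", septets[i]))
            i += 1
    return items


# "Hi" (0x48, 0x69 in the default alphabet) followed by emoji number 300
message = [0x48, 0x69] + encode_emoji(300)
decoded = decode(message)
```

    A receiving device that does not recognise the introducer would not
    know to treat the following septets as an index, so a real design
    would also need a fallback rule. A symbolic or variable-length index,
    as the text allows, would need a terminator or length field instead
    of the fixed two septets assumed here.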

    With an open-ended mechanism like this, they could expand their emoji
    repertoires easily and almost limitlessly. By registering new emoji
    virtually on demand, perhaps even with direct customer involvement, they
    would be in a perfect position to satisfy the major stated needs of
    their customer base.

    > They have been using character codes because it seems convenient, but
    > that doesn't imply that this is the only or best method, and it
    > certainly doesn't imply that everything that gets transmitted as text
    > is text or is suitable content for a text encoding standard. I might
    > as easily use a character code as a trigger to play a sound file as to
    > display an inline graphic; that doesn't make the sound file a
    > character.

    Perfectly correct, and perfectly stated.

    --
    Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
    http://www.ewellic.org
    http://www1.ietf.org/html.charters/ltru-charter.html
    http://www.alvestrand.no/mailman/listinfo/ietf-languages
    

