Re: Emoji: emoticons vs. literacy

From: Michael D'Errico (mike-list@pobox.com)
Date: Fri Jan 09 2009 - 01:38:56 CST

Next message: Adam Twardoch: "Re: Emoji: emoticons vs. literacy"

Previous message: John Hudson: "Re: Emoji: emoticons vs. literacy"
In reply to: John Hudson: "Re: Emoji: emoticons vs. literacy"
Next in thread: Peter Krefting: "Re: Emoji: emoticons vs. literacy"
Reply: Peter Krefting: "Re: Emoji: emoticons vs. literacy"
Reply: Andrew West: "Re: Emoji: emoticons vs. literacy"
Reply: Joó Ádám: "Re: Emoji: emoticons vs. literacy"
Reply: vunzndi@vfemail.net: "Re: Emoji: emoticons vs. literacy"

> Your suggestion, Michael, is to modify how the Unicode standard works in
> order to encode emoji and similar non-text content in a flexible and
> extensible way. My suggestion is that this content belongs in a
> different standard altogether, one that is focused on non-text content.

I've thought about this. But since you would want to intermix text
and non-text, it makes sense to retain Unicode as a subset and use
the same UTF encoding schemes. The problem, though, is that Unicode
claims all the code points, so a new standard would have to violate
the rules, either by using planes that Unicode will probably never
use(*), or by going beyond plane 16 (which is impossible with UTF-16
and specifically disallowed for UTF-8 and UTF-32 conformance).
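
As a rough illustration of why plane 16 is the ceiling for UTF-16, the
sketch below combines the largest high and low surrogates into a code
point. The surrogate ranges are the ones Unicode defines; the function
and variable names are made up for this example.

```python
# Surrogate ranges defined by Unicode: 1024 high and 1024 low surrogates.
HIGH_START, HIGH_END = 0xD800, 0xDBFF
LOW_START, LOW_END = 0xDC00, 0xDFFF


def decode_surrogate_pair(high, low):
    """Combine one high and one low surrogate into a code point."""
    if not (HIGH_START <= high <= HIGH_END and LOW_START <= low <= LOW_END):
        raise ValueError("not a valid surrogate pair")
    # Each surrogate contributes 10 bits; the result is offset by 0x10000
    # because the BMP (plane 0) is encoded directly, without surrogates.
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


# The largest pair gives the largest code point UTF-16 can express.
max_utf16 = decode_surrogate_pair(HIGH_END, LOW_END)
print(hex(max_utf16))   # highest reachable code point
print(max_utf16 >> 16)  # its plane number
```

Since both surrogate ranges are fixed at 1024 values each, there is no
pair left over that could address anything past the end of plane 16.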

Personally, I would choose the latter approach and just say that you
can't use UTF-16. UTF-8, even limited to 4 bytes, can encode a total
of 32 planes, so there would be lots of initial room. Expanding it
to 6 bytes, as it was originally specified, handles 32k (32,768) planes.
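
The plane counts above can be checked with a little arithmetic. This is
only a sketch: it counts raw payload bits in the original 1-to-6-byte
UTF-8 layout and ignores the conformance limits that current UTF-8
imposes. The helper names are invented for the example.

```python
PLANE_SIZE = 1 << 16  # 65,536 code points per plane


def payload_bits(length):
    """Payload bits in a UTF-8-style sequence of `length` bytes (1..6)."""
    if not 1 <= length <= 6:
        raise ValueError("length must be between 1 and 6")
    if length == 1:
        return 7  # 0xxxxxxx
    # Lead byte: `length` one-bits, a zero bit, then 7 - length payload bits.
    # Each continuation byte (10xxxxxx) carries 6 payload bits.
    return (7 - length) + 6 * (length - 1)


def planes(length):
    """Number of whole planes addressable with sequences up to `length` bytes."""
    return (1 << payload_bits(length)) // PLANE_SIZE


for n in (4, 6):
    print(n, "bytes:", payload_bits(n), "bits,", planes(n), "planes")
```

Four bytes give 21 bits and six bytes give 31 bits, which is where the
32 and 32k figures come from.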

The problem with moving beyond the reach of UTF-16 is that some
programming languages designed their String classes to hold UTF-16
code units, and would therefore not be able to access the non-text
content. This is probably the biggest roadblock to a solution
outside of Unicode, and means that either Unicode would have to give
up some of its code space to a new standard, or embrace the ideas
and make them a part of Unicode.

Well, I won't be holding my breath....

Mike

*Whistler's Conjecture states that no characters will ever be encoded
beyond plane 2.


This archive was generated by hypermail 2.1.5 : Fri Jan 09 2009 - 01:42:26 CST