From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Tue Jan 06 2009 - 13:54:48 CST
On 1/6/2009 3:47 AM, André Szabolcs Szelp wrote:
> Dear list members,
> especially those favouring the encoding of emoji.
>
> Please address the inquiry below (already posted earlier, but unduly ignored):
>
>> Actually, even in the domain of emoji, how do you
>> define character identity? How do you know that a
>> "Chick" is a different character entity of "Hatching
>> Chick", how do you know they are not mere *glyph
>> variants* of the character FLEDGELING??
The thoroughly pragmatic take is that you look at how the telcos map the
sets and then you provide enough code points to cover the unique
members. In other words, even if these characters are all glyph variants
of each other in normal Unicode terms, you would apply
source-set-separation rules to them so that the distinctions made in
the vendor sets survive a round trip.
Since the whole point of the exercise is to represent these vendor sets
with compatibility characters, that's in fact the correct procedure to
follow.
In cases where one is very certain, one could provide a compatibility
decomposition between separately encoded variants, but it's more
flexible to handle that kind of relation outside the standard with
mapping tables.
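
To make the round-trip point concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the vendor names, the vendor codes, and the Private Use Area code points are placeholders, not real telco mappings or proposed assignments. It only shows why unifying two vendor characters breaks round-tripping, and how an equivalence could live in a table outside the standard.

```python
# Hypothetical vendor tables: (vendor, vendor code) -> Unicode code point.
# Vendor names, codes, and the Private Use Area code points are placeholders.
SEPARATED = {
    ("vendor_a", 0x01): 0xE000,  # "chick"
    ("vendor_a", 0x02): 0xE001,  # "hatching chick"
    ("vendor_b", 0x10): 0xE000,  # "chick"
}

# Same data, but with the two vendor_a characters unified as glyph variants.
UNIFIED = {
    ("vendor_a", 0x01): 0xE000,
    ("vendor_a", 0x02): 0xE000,
    ("vendor_b", 0x10): 0xE000,
}

# Equivalence kept outside the standard, in an ordinary mapping table.
EQUIVALENT = {0xE001: 0xE000}


def roundtrips(table, vendor):
    """True if every code of `vendor` survives vendor -> Unicode -> vendor."""
    reverse = {}
    for (v, code), cp in table.items():
        if v != vendor:
            continue
        if cp in reverse:
            # Two vendor codes share one code point: the distinction is lost.
            return False
        reverse[cp] = code
    return True


def fold(cp):
    """Map a code point to its preferred equivalent, if the table lists one."""
    return EQUIVALENT.get(cp, cp)


print(roundtrips(SEPARATED, "vendor_a"))
print(roundtrips(UNIFIED, "vendor_a"))
print(hex(fold(0xE001)))
```

A process that wants to treat the two chicks as the same thing can call fold() on its own, while the encoded text still keeps the vendor's distinction.
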
A very careful determination needs to be made about which characters
would qualify as *ordinary* characters in Unicode (i.e. have
well-established identities, are more like other symbols, don't use
animation/color, etc.). Encoding these in a section apart
from the more problematic characters would make it easier for later
users (not solely tied to telco interoperability) to decide which
characters to support.
A./