From: Mark Davis (mark.edward.davis@gmail.com)
Date: Sat Dec 20 2008 - 22:22:50 CST
Thanks for your message.
There are some messages on specific character issues that people have filed
by email. They are visible on the group that Markus referred to (
http://groups.google.com/group/emoji4unicode).
We are trying to break up individual issues raised by the emails and address
them separately on http://code.google.com/p/emoji4unicode/issues/list . That
is, obvious fixes from the emails should go in, items needing discussion get
tracked as issues.
Mark
On Sat, Dec 20, 2008 at 17:14, Asmus Freytag <asmusf@ix.netcom.com> wrote:
> I'm surprised at how much of this discussion appears to be driven by prior
> conviction and how many of the arguments that are being made seem to become
> emotional. Many contributors seem to base their input purely on a value
> judgment of what they deem appropriate types of text.
>
> I think that the strength of Unicode has always been its almost
> single-minded focus on universality. Sure, there are limitations, but they
> are based on how Unicode fits into the overall architecture of the global
> computing environment, not on the nature of the text, or the nature of the
> group of users.
>
> Architecturally, Unicode is designed to address plain text. Over time, the
> shared understanding of what is plain text has evolved - starting initially
> from the type of plain text seen in plain text environments such as old-style
> e-mail, for example, and later being expanded to encompass codes for the
> underlying text entities in markup languages, even if they aren't fully
> usable outside of such protocols. The sets of symbols for musical and
> mathematical notation contain quite a number of characters that are only
> fully functional when used with a full music composition system or
> mathematical layout (such as MathML).
>
> Nevertheless, the underlying elements are entities that can and should
> properly be encoded as plain text elements, so that they can be treated more
> uniformly inside the overall architecture. Sure, there were pre-existing
> SGML entity sets for them, but it proved beneficial to use Unicode to
> consistently encode the semantics of the entire range of these symbols,
> rather than leaving some of them to entity sets (which are limited to
> SGML-like environments). The benefits to implementers of these markup
> languages of having a single, consistent representation for the entire
> textual "backbone" of a markup document are enormous.
>
> That emoji act functionally like plain text elements in the way that they fit
> into the architecture of numerous existing implementations and that they are
> interchanged - about these facts there can be no reasonable disagreement.
> Pretending otherwise does not speak from the observable facts, but rather
> appears based on prior convictions and value judgments of a sort, which, I
> believe, have no place in the development of the Unicode Standard.
>
> Suggestions like endorsing permanent private use code assignments or
> inventing special, stateful, mini-markup for these characters, are likewise
> driven by the desire to express a value judgment, and not by careful
> analysis of the technical requirements. Some of these suggestions were made
> by people whose sound technical judgments I had come to trust. I will have
> to be more careful in the future: these suggestions, if acted on, would do
> more harm to the Unicode Standard than admitting even an unexamined set of
> symbol characters.
>
> What is needed most, at this juncture, is not further opinionizing about
> the value of these proposed characters, but the detailed work of sorting
> them into the standard. There are enough hard questions to be answered:
>
> 1) Are there entities that can't be encoded for exceptional reasons?
> 2) What are the semantic distinctions and range of semantics to be encoded?
> 3) What to do about semantic distinctions normally represented by text
> styles?
> 4) What to do about naming?
>
> Some general observations about these questions.
>
> To 1: Logos are so far the only exception that should be applied on first
> principles. It is implicitly recognized that logos are plain text in a
> technical sense, but that there are overriding concerns that don't permit
> them to be treated as characters.
>
> To 2: This requires a careful look into the nature of the proposed
> characters. Some are presented as fairly generic embodiments of a particular
> semantic (e.g. factory) when it is well known that in other environments
> *different* symbols would be used. In such cases it's important that the name
> and annotations chosen reflect the fact that the symbol to be encoded is
> *not* the most generic one, and perhaps is only the generic symbol in the
> context relevant to the subset of emoji symbols. Getting this wrong will not
> impact the technical aspects of supporting Emoji in Unicode, but will make
> it difficult to correctly support other sets of symbols at a later time.
>
> To 3: Mathematical symbols assign semantic value to what would be stylistic
> variation in other contexts. In principle, the same could be applied to color
> distinctions for emoji. Some emoji codes would require color for correct
> rendering, but color would otherwise remain limited to markup. Such a
> solution would be entirely parallel to what was done for mathematical
> alphabets -- however -- if there's a possible mapping to a range of textures
> (black, white, lined, hatched) that would be an acceptable way of handling
> the situation, so as to be able to sidestep this issue entirely for now, and
> perhaps for ever.
>
> To 4: Naming is always a subject where everyone has an opinion. Names don't
> matter unless they appear to express constraints on usage or glyphic
> representation of a character that don't exist. Naming symbols based on
> conventional meaning can imply that they cannot be used for more than one
> meaning - however, that can be addressed by annotations and (informative)
> aliases. Naming based on graphical constituent parts can be misleading if
> the symbols aren't really always constructed from the same parts (and such
> naming is exceedingly cumbersome). Striving for perfect names is less
> helpful than avoiding clear-cut blunders.
>
> Having said all this, why can't I find more of a discussion of individual
> characters from the proposal, e.g. in the light of the four questions I
> outlined above?
>
> A./
This archive was generated by hypermail 2.1.5 : Fri Jan 02 2009 - 15:33:07 CST