Public Review Issues

Accumulated Feedback on PRI #405

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Tue Sep 24 18:20:35 CDT 2019
Name: William Overington
Report Type: Public Review Issue
Opt Subject: Public Review 405: Proposed Update UTS #51, Unicode Emoji

Annex C.2 QID Emoji Tag Sequences includes, in the proposed update, the
following.
 
> A subset of QIDs are associated with entities that would be valid for
> emoji. For example, risk management (Q189447) and this (Q3109046) would not
> be valid.
 
I opine that introducing QID emoji is a marvellous forward-looking
innovation.
 
I opine that restricting which QID items can be used for a QID emoji is an
unnecessary restriction that could have the effect of restricting some new
ideas from being implemented in an interoperable manner. I ask that the
Unicode Technical Committee remove that restriction please.
 
William Overington
 
Wednesday 25 September 2019

Date/Time: Fri Sep 27 08:54:07 CDT 2019
Name: Yannis Haralambous
Report Type: Public Review Issue
Opt Subject: QID Emojis

In my humble opinion, QID Emojis may very well become a major turning point
in human communication: *for the first time billions of people will use
semantically annotated entities in everyday informal communication*.

Of course, one may argue that this will be implicit for most people using
emojis, who will not have the slightest idea of the Wikidata items they are
referring to, but the information will nevertheless be there, hidden inside
the emojis and accessible to clever applications that may:

- show, on demand, the author/reader additional information about the
  emoji by a pop-up menu connected with Wikidata or Wikipedia
- do a semantic coherence check and alert the user if he/she uses an
  emoji that is incompatible with the context
- like a spelling checker, propose other emojis which fit the context
  more appropriately, learn from the user's subsequent choices, etc.

In other words, one will be able to process QID emojis like linguistic
entities by directly accessing their semantics, without any need for
disambiguation (unless the emoji has been badly chosen, in which case the
user can be alerted and corrections can be made, as in any standard text
using a given natural language).

Once we open the door of semantic annotation, applications will be endless.
NLP software will be able to access the meaning of QID emojis and, from it,
disambiguate the surrounding textual context; QID emojis may replace
hash tags in social media, since they will be infinitely more precise and
language-independent; etc.

In the history of the Chinese writing system there has been a shift from
pictography to logography combined with phonography (many current CJK
characters having a semantic and a phonetic component). QID emojis may
become "semantic components" in the same way, and as such they can be used
in any language (the textual language providing phonetic and imprecise
semantic information, while emojis provide precise semantic information).

For all these reasons, I strongly support this proposal and hope it will be
accepted.

Date/Time: Fri Sep 27 12:47:44 CDT 2019
Name: William Overington
Report Type: Public Review Issue
Opt Subject: Public Review 405: Proposed Update UTS #51, Unicode Emoji

QID emoji and screen readers

The issue of screen readers is mentioned in the document.

I have thought of a possible solution.

Here is the idea.

Decide what text, in any Unicode characters that you wish in any language
you choose, is to be the text that the screen reader speaks.

Save that text as a UTF-8 byte sequence.

Encode that UTF-8 byte sequence as a text string twice as long, such that,
byte by byte, each UTF-8 byte is represented by two hexadecimal "digits",
each in the range 0..9, A..F, and then use the tag version of each of those
characters.

Add a U+0020 SPACE character at the front as the base character and add a
cancel tag character at the end.

Include that string in the document after the QID emoji character.

Screen reader software written for the purpose could decode the tag
characters into a string and try to speak it out.

Other software would just ignore the tag characters and display the space
character.

William Overington

Friday 27 September 2019
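
Editorial sketch (not part of the submitted feedback): the steps described
above could look roughly like the following in Python. It assumes that the
"tag version" of an ASCII character is the code point at U+E0000 plus its
ASCII value, and that the cancel tag is U+E007F CANCEL TAG. The function
names are hypothetical and chosen only for illustration.

```python
TAG_OFFSET = 0xE0000          # tag characters mirror ASCII at U+E0000 + value
CANCEL_TAG = "\U000E007F"     # U+E007F CANCEL TAG
BASE = " "                    # U+0020 SPACE as the base character
HEX_DIGITS = "0123456789ABCDEF"


def encode_spoken_text(text: str) -> str:
    """Encode text as SPACE + tag hex digits of its UTF-8 bytes + CANCEL TAG."""
    # Two uppercase hexadecimal digits per UTF-8 byte.
    hex_digits = text.encode("utf-8").hex().upper()
    # Replace each hex digit with its tag-character counterpart.
    tags = "".join(chr(TAG_OFFSET + ord(d)) for d in hex_digits)
    return BASE + tags + CANCEL_TAG


def decode_spoken_text(sequence: str) -> str:
    """Recover the original text from a sequence built by encode_spoken_text."""
    if not (sequence.startswith(BASE) and sequence.endswith(CANCEL_TAG)):
        raise ValueError("expected SPACE ... CANCEL TAG framing")
    hex_digits = []
    for ch in sequence[1:-1]:
        # Map the tag character back to ASCII; negative values mean "not a tag".
        value = ord(ch) - TAG_OFFSET
        plain = chr(value) if 0 <= value < 0x80 else ""
        if plain not in HEX_DIGITS or plain == "":
            raise ValueError("unexpected character inside tag sequence")
        hex_digits.append(plain)
    if len(hex_digits) % 2:
        raise ValueError("odd number of hex digits")
    return bytes.fromhex("".join(hex_digits)).decode("utf-8")
```

For example, `encode_spoken_text("Hi")` yields six code points: the space,
the tag digits for 4, 8, 6, 9, and the cancel tag. A screen reader written
for the purpose would call something like `decode_spoken_text` and speak the
result; other software would see only the space.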

Date/Time: Mon Sep 30 11:03:33 CDT 2019
Name: William Overington
Report Type: Public Review Issue
Opt Subject: Public Review 405: Proposed Update UTS #51, Unicode Emoji

In the current Public Review Issue there are various issues for which
comments are requested.

I responded to five questions that were in an earlier version of the draft
Unicode Emoji document. There may be differences in the wording of the
questions - Issue 4 starts differently now - yet the issues appear to be much
the same.

My replies at that time are conserved in the second item in the Encoding
Feedback section of the L2/19-272 document.

http://www.unicode.org/L2/L2019/19272-pubrev.html#Encoding_Feedback

I mention this for the record in case the Unicode Technical Committee might
like to consider them as part of this public review.

William Overington

Monday 30 September 2019

Date/Time: Tue Oct 1 18:54:53 CDT 2019
Name: David Corbett
Report Type: Public Review Issue
Opt Subject: PRI #405 comments and questions

There is a review note to consider adding the text “Two adjacent emoji never
“merge” to form a single emoji, unless the second of the two is an
emoji_modifier. This means that only a limited number of characters can
"extend" an emoji sequence. Parsing can stop unless they are encountered.”
There is another case: two adjacent regional symbols (Emoji=Yes and
Emoji_Modifier=No) form a single emoji.
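
Editorial sketch (not part of the submitted feedback): the regional
indicator case can be illustrated in Python as below. The range
U+1F1E6..U+1F1FF is the block of regional indicator symbols; the function
names are hypothetical, and the pair check is deliberately simplified (it
looks at only two code points and does not handle longer runs).

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A
REGIONAL_INDICATOR_Z = 0x1F1FF  # REGIONAL INDICATOR SYMBOL LETTER Z


def is_regional_indicator(ch: str) -> bool:
    """True if ch is a single regional indicator symbol."""
    return len(ch) == 1 and REGIONAL_INDICATOR_A <= ord(ch) <= REGIONAL_INDICATOR_Z


def flag_sequence(region_code: str) -> str:
    """Map a two-letter region code such as "FR" to its regional indicator pair."""
    code = region_code.upper()
    if not (len(code) == 2 and code.isascii() and code.isalpha()):
        raise ValueError("expected two ASCII letters")
    return "".join(chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code)


def forms_flag_pair(first: str, second: str) -> bool:
    """Two adjacent regional indicators combine, although neither is a modifier."""
    return is_regional_indicator(first) and is_regional_indicator(second)
```

Here `flag_sequence("FR")` returns U+1F1EB followed by U+1F1F7, and
`forms_flag_pair` reports that the two code points combine, which is the
exception to the proposed wording noted above.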

One of the review notes says “The notation 1 used in this document is
representing the Unicode character U+E0030 TAG DIGIT ZERO”. “1” should be
“0”.

“triceratops” should be “Triceratops”, which is its English label in Wikidata.

It might be a good idea to explicitly state that QID emoji for flags and
similar symbols should use the QID for the symbol, not for the symbolized
entity. For example, the flag of NATO emoji should use Q459788 (flag of
NATO), not Q7184 (NATO). But what if there is no separate QID for the
symbol? Should the vendor add an item to Wikidata, or use the existing QID
of the symbolized entity instead?

“The term understandable means most people familar with the entity should be
able to tell that the representation is intended to depict the entity,
without foreknowledge. Symbols such as ♅ U+2645 Uranus are thus excluded.”
On the contrary: anyone familiar with Q3594854 (Uranus symbol) would find
⟨♅⟩ understandable. What is understandable for one QID might not be
understandable for another. The example should clarify that U+2645 should
be excluded for Q324, not necessarily for every possible QID.

“familar” should be “familiar”.

Can QIDs be deleted or moved on Wikidata? If so, might an old QID be repurposed 
with a new meaning?

Feedback above this line was reviewed during UTC #161 in October, 2019.

Date/Time: Wed Nov 27 10:56:50 CST 2019
Name: Matthew W Morgan
Report Type: Other Question, Problem, or Feedback
Opt Subject: Emoji alt text

Hello,

When using emojis with screen readers, at least on the Android operating system,
the new emojis are not labeled. I have submitted this issue to Google,
but I believe they are not labeled on iPhone either. It's amusing: the new blind
person emoji is not labeled. Is this something that needs to be voted on before
it is implemented? If so, during the next round of emoji releases, please include
the alternative text for blind users when considering implementation.

Thank you,
Matthew Morgan

Date/Time: Fri Nov 29 10:37:52 CST 2019
Name: Jamie Stroud
Report Type: Error Report
Opt Subject: report typo

https://unicode.org/reports/tr51/

"Implementation may support any of the following for display, editing, sor input:"

or input

good intentions, from:
jamie

Date/Time: Fri Jan 3 07:33:53 CST 2020
Name: Charlotte Buff
Report Type: Public Review Issue
Opt Subject: PRI #405: Inconsistencies in Emoji Data Files

In emoji-sequences.txt, character ranges for the Basic_Emoji property are
not labeled correctly. For example, the entire range 2648..2653 is given the
description “Aries” even though it contains 12 different emoji.

In emoji-zwj-sequences.txt, the different variants of the new “mx. claus”
emoji are not sorted correctly. The base version is in the category “Other”,
while the Fitzpatrick‐type variants are placed among the hand‐holding
sequences in the category “Family”.