This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.
Date/Time: Sun Oct 27 11:17:23 CDT 2019
Name: David Corbett
Report Type: Public Review Issue
Opt Subject: PRI #408
In “QID emoji tag sequences for flags or other symbols that represent an entity should use the QID for the flag or symbol itself if available, not the flag for the entity,” it should say “the QID of the entity” not “the flag of the entity”.
Date/Time: Tue Nov 5 10:23:02 CST 2019
Name: Charlotte Buff
Report Type: Public Review Issue
Opt Subject: PRI #408: QID Emoji Sequences
I’ve already said this on the previous PRI, but it bears repeating: QID sequences are fundamentally unworkable because they destroy the concept of character identity. I firmly believe the UTC is considerably underestimating the implications of providing a mechanism that can encode exactly the same information in several, mutually incompatible ways. Unicode was created to get rid of shenanigans like this once and for all, but the proposed QID mechanism is almost inviting this practice. Once the QID mechanism is approved, every single object or concept with an associated QID will already have a canonical Unicode representation by default. That’s the whole point of formally defining such a mechanism. Unicode *wants* vendors and private persons to use these tag sequences to represent specific emoji; the standard is explicitly endorsing it. Otherwise people could just continue using private‐use characters as usual. New emoji characters could never be added to Unicode again because they would duplicate the already existing encoding, thus invalidating all prior usage. This would be akin to the UTC formally endorsing a certain PUA assignment for a script and encouraging people to develop fonts and input methods for it, but then just encoding it properly anyway two years later. Unicode should not be in the business of creating new legacy data problems. That is why comprehensive stability policies exist. Having to tell people to stop using a perfectly valid sequence because there is a separate character for the same purpose should be avoided at all costs. There is some precedent for this in the Unicode standard due to historical accidents, for example the fact that U+0322 COMBINING PALATALIZED HOOK BELOW should not be used to compose new letters with palatalized hook. However, these cases are few and far between and only apply under very limited circumstances. QID sequences meanwhile could – by their very nature – represent almost anything in the world. 
Simply stating that uniqueness of representation cannot be ensured outside of RGI is not enough, because even though QID sequences aren’t RGI, they are still *official*. This is different from ZWJ sequences which have no intrinsic meaning until someone decides that they do. There is no reason why “🏴+☠️” should be the one true representation of a pirate flag compared to any other possibility; it’s completely arbitrary. But QID sequences by definition always have one and only one specific meaning even if no font is ever going to support them. If QID emoji must exist, they can only ever work if the standard very clearly states that they should only be used for things that have no chance of being encoded otherwise. This includes entities that are explicitly forbidden (deities, landmarks, celebrities, brands etc.), things that are impossible to encode (e.g. flags of regions without ISO codes), as well as specific variants of more general concepts (exact dog breeds, different types of sandwich, and so on). So Q4545971 (gelatin dessert) would not be a recommended QID because there is no reason why a gelatin dessert emoji couldn’t be encoded as an atomic character if someone submits a proposal, but Q39058 (Shetland Sheepdog) would be recommended because there already exist plenty of emoji to represent dogs and more aren’t needed in the core set. ┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈ As to whether QID sequences should be part of UTS #51 at all: The track record for generalised emoji mechanisms hasn’t been great so far. • Tag sequences for regional flags have been possible since 2017. In that time, the UTC has not RGI’d any new flags beyond the initial three, and only one vendor has ever decided to support a non‐RGI sequence: WhatsApp with 🏴 (Flag for Texas). • Hair components were added in 2018. No vendor has ever supported any sequences incorporating these characters that weren’t RGI. • The ability to modify emoji direction was added in 2018. 
Nobody has ever used this for anything. • Colour modifiers were added earlier this year. So far there have only been three proposals making use of these: One that was accepted (🐈️⬛️), one that was rejected (🍷⬜️), and one that was later modified to no longer use the mechanism (🐻❄️). Of note is that none of these proposals used any of the new colour characters, but only those that already existed before. • ZWJ sequences in general are very underdeveloped outside of RGI. Ignoring gender variants which always follow a set pattern, vendors haven’t used the ZWJ to invent new emoji for years; the most recent example is probably 🏴☠️. The UTC spends a lot of time and effort developing emoji mechanisms that could in theory be used to significantly expand the emoji repertoire beyond the RGI list, but ultimately always end up becoming just another tool to encode a handful of RGI sequences, never to be spoken of again afterwards. They are inaccessible to the common people and unattractive to major vendors. I see no reason why the eventual fate of QID emoji should be any different. Emoji are used exclusively for their visual appearance. Nobody cares what emoji mean in the abstract, only what they look like, and unsupported emoji sequences never look good. Considering the particularly terrible fallback display of QID sequences, I would in fact be very surprised if anyone at all ever ends up supporting even a single one of them. They have no value outside of small, isolated systems because they become utterly incomprehensible to anyone who doesn’t have the right font installed, but such closed networks usually already have much better tools in place to include pictographic images in running text. Custom emotes on Twitch or Discord cannot be exchanged outside of these sites and still be expected to show the correct glyph, but they don’t turn into featureless 🆔s like these tag sequences would; they turn into (more or less) descriptive names. 
To discover the intended meaning of a QID sequence in the wild, you would need to ① know that the 🆔 you encountered is actually meant to be another emoji (which is unlikely because tag characters are invisible), ② copy the sequence (which is impossible or very cumbersome in many mobile apps), ③ paste it into a tool for analysing Unicode characters (which most people do not know about), and ④ look up the resulting QID on Wikidata (which hardly anyone is aware even exists). In terms of interchangeability, QID emoji rank just barely above private‐use characters. I would argue that they are even worse in some areas, because Mozilla Firefox at least displays each tofu with a little codepoint label, whereas all unknown QID sequences would look exactly like 🆔. The recommendation to signify invalid or unrecognised tag sequences with a special “error” glyph has only been implemented by a single vendor, and only partially. Even if we assume that knowledge of QID emoji would just spread via word of mouth to such an extent that most people would know about their existence, that still doesn’t mean that they could actually be used. Installing fonts is not possible on most mobile phones without jailbreaking, and even a font that is installed on a system isn’t guaranteed to be chosen for text display in all contexts. 🆔 is most likely going to be shown in the system’s default emoji font first and foremost, and then all the other fonts in the stack that might include specific QID sequences don’t matter anymore. It’s even worse with services like Twitter that replace emoji characters with embedded images because then there is absolutely zero chance of arbitrary QID sequences displaying as intended. In practice, QID emoji are always going to be confined to a potentially small number of messaging apps that actively took the effort to develop special logic for dealing with them. This is in stark contrast to not just the rest of emoji, but Unicode in general. 
These QID emoji are effectively just a less portable, less versatile version of stickers. Furthermore, creating colour fonts is not something the average person can easily do. The tools necessary to do so are not freely available most of the time. Nevermind the fact that there exist four different formats for emoji fonts, none of which is compatible with any other. The New Zealand Kennel Club can’t just create a font with a glyph for Q39058 and distribute it among dog fans; they need to create four different fonts, potentially with completely different glyph designs unless they settle on an image that can be represented by all four formats. Of course, people could always create monochrome fonts instead because they work everywhere, but ⓐ black‐and‐white glyphs for emoji are not very popular among the general public, and ⓑ many of the things people would want to use QID emoji for (flags, food variants, animal breeds etc.) would be almost unrecognisable without proper colour.
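Steps ③ and ④ of the lookup procedure described in this comment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequence layout (U+1F194 as base, then tag Q, tag digits, and CANCEL TAG) is inferred from the comments on this page rather than quoted from the specification, and the names `find_qids` and `wikidata_url` are hypothetical helpers invented for this sketch.

```python
import re
from typing import List

# Assumed layout, inferred from the comments on this page:
# base U+1F194, TAG LATIN CAPITAL LETTER Q, one or more TAG DIGITs, CANCEL TAG.
BASE = "\U0001F194"
TAG_Q = "\U000E0051"
CANCEL_TAG = "\U000E007F"
TAG_OFFSET = 0xE0000  # tag characters mirror ASCII at U+E0000 + code point

QID_SEQUENCE = re.compile(
    BASE + TAG_Q + "([\U000E0030-\U000E0039]+)" + CANCEL_TAG
)


def find_qids(text: str) -> List[str]:
    """Return every QID (for example 'Q39058') encoded as a tag sequence in text."""
    qids = []
    for match in QID_SEQUENCE.finditer(text):
        digits = "".join(chr(ord(tag) - TAG_OFFSET) for tag in match.group(1))
        qids.append("Q" + digits)
    return qids


def wikidata_url(qid: str) -> str:
    """Build the Wikidata page address for a QID (hypothetical helper)."""
    return "https://www.wikidata.org/wiki/" + qid


if __name__ == "__main__":
    tags = "".join(chr(TAG_OFFSET + ord(d)) for d in "39058")
    message = "Look: " + BASE + TAG_Q + tags + CANCEL_TAG
    for qid in find_qids(message):
        print(qid, wikidata_url(qid))
```

The sketch does nothing for steps ① and ②: a reader still has to suspect that a 🆔 hides a tag sequence and be able to copy it out of the application.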
Date/Time: Thu Nov 7 09:55:28 CST 2019
Name: William Overington
Report Type: Public Review Issue
Opt Subject: Public Review 408: QID Emoji
I have been thinking about this review at times over a few days. I was about to reply a short while ago and I had a look to see if anyone else had replied since I looked before. I noticed, and have read with interest, the reply from Ms Charlotte Buff. Until reading Ms Buff's comments I was going to answer question (a) as asked simply with "Yes, please add it." though I do note from the minutes of the recent UTC meeting that it is not now a matter of adding it, but having a separate document. However, now, realizing that Ms Buff makes some important comments I need to think further about it all. In principle I consider that the idea of having QID emoji is good and should be implemented. Yet maybe there needs to be thought of how this could be done whilst taking into account Ms Buff's comments. For example, encouraging fonts to have displayable glyphs for tag characters, even if just for the tag version of the letter Q, the tag versions of the ten digits and the cancel tag. I made a font that had those when I was doing some tests and it worked extremely well in the Affinity Publisher program: I chose characters to build up the QID sequence and the glyphs were displayed; when the cancel tag was entered the OpenType GSUB table in the font indicated a glyph for the complete sequence and the desired glyph for the QID emoji was displayed. Maybe Unicode Inc. could offer as a free font a font with just those twelve visible glyphs and specific giving of free permission without any "strings, or using a whatever so-called licence" to copy and paste the glyphs and their related information into any font that someone is making, offered as a free service for the public good. Maybe several such fonts so as to suit various font formats and various options within those formats (such as when some fonts use font units up to 1000 and some use them up to 2048, that sort of thing). That would help with analysis of what is going on in some cases. 
As for question (b) about changes in the specification, well I am perhaps going a bit off-topic but nevertheless, two matters that I think are well worth considering in relation to matters relating to the specification. (1) I know that there are views that QID emoji are not characters. I know that they are not atomic characters, but I am concerned, as I am with other sequences that are purportedly not characters, that Unicode is going to get increasingly out of synchronization at a practical applicability level with ISO/IEC 10646, notwithstanding any theoretical basis that it is not out of synchronization at a formal level with ISO/IEC 10646. To me it seems that it would be desirable to try to get some sort of agreement with the committee that manages ISO/IEC 10646 as to how both systems relate to QID emoji. I opine that an end user, at some future time, if QID emoji are implemented and possibly widely used, who sees displayed a tag sequence of a QID emoji for which he or she does not currently have a glyph of the intended QID emoji, the important practical consideration will be being able to understand what is going on. If ISO/IEC 10646 has not even a note about what it is about in general terms, then that would not be a helpful situation for that end user. I ask that Unicode Inc. raise the matter with the ISO/IEC 10646 committee please and ask their advice. (2) What if instead of a tag Q, what if another tag character were used. This would then indicate something else other than a QID emoji. For example, suppose that the tag character were an exclamation mark and the whole tag sequence code were for a localizable sentence. That would open up a lot of possibilities for communication through the language barrier. For example, the codes in the following linked document. 
http://www.users.globalnet.co.uk/~ngo/A_List_of_Code_Numbers_and_English_Localizations_for_use_in_Research_on_Communication_through_the_Language_Barrier_using_encoded_Localizable_Sentences.pdf
Now I fully realize that that would most probably not be agreed to by UTC, simply because, at present, they are at a research project level and also because they are by an individual, though even if they were by a company, large or small, the answer might well be the same. Yet what if those codes, or some other codes, were an ISO standard. For the avoidance of doubt I am not in that context meaning the ISO/IEC 10646 standard. Would UTC agree to it then? Or would a different base character be better for such a system, different from QID emoji. Could you consider that please? William Overington Thursday 7 November 2019
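The twelve tag characters mentioned in this comment (tag Q, the ten tag digits, and the cancel tag) and the way a QID sequence is assembled from them can be sketched as follows. The layout is an assumption drawn from the comments on this page (U+1F194 as the base character, CANCEL TAG as terminator), and `qid_to_sequence` is a hypothetical name used only for illustration.

```python
# Assumed layout, inferred from the comments on this page rather than
# quoted from the specification: base U+1F194, tag Q, tag digits, CANCEL TAG.
TAG_OFFSET = 0xE0000        # tag characters mirror ASCII at U+E0000 + code point
BASE = "\U0001F194"         # SQUARED ID, the visible fallback
CANCEL_TAG = "\U000E007F"   # terminates the tag sequence

# The twelve characters a font would need visible glyphs for.
TWELVE_TAGS = [chr(TAG_OFFSET + ord(c)) for c in "Q0123456789"] + [CANCEL_TAG]


def qid_to_sequence(qid: str) -> str:
    """Encode a QID such as 'Q39058' as base + tag characters + cancel tag."""
    digits = qid[1:]
    if not (qid.startswith("Q") and digits.isascii() and digits.isdigit()):
        raise ValueError("expected 'Q' followed by ASCII digits: %r" % qid)
    tags = "".join(chr(TAG_OFFSET + ord(ch)) for ch in qid)
    return BASE + tags + CANCEL_TAG


if __name__ == "__main__":
    sequence = qid_to_sequence("Q39058")
    print(" ".join("U+%04X" % ord(ch) for ch in sequence))
    print(len(TWELVE_TAGS), "tag characters need glyphs")
```

A font along the lines described would map each of the twelve code points to a visible glyph and use a substitution rule to replace the complete sequence with the emoji glyph once the cancel tag arrives.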
Date/Time: Fri Nov 8 10:28:16 CST 2019
Name: William Overington
Report Type: Public Review Issue
Opt Subject: Public Review 408: QID Emoji
I wonder if the following might help address some of the issues raised by Ms Charlotte Buff. This idea is put forward as a starting point: UTC and contributors to this Public Review are welcome to alter the idea around and improve it as desired. At the moment there is RGI (Recommended for General Interchange). What if there are instead five categories, say, as follows.
Recommended for General Interchange
Popular
Worthwhile
Interesting
Noted
The Recommended for General Interchange would be as now. If someone implements a QID emoji and uses it just a little then he or she may, if he or she chooses to do so, email Unicode Inc. and inform Unicode Inc. of that use, perhaps with some basic information such as an image and a note as to whether a new QID had been generated for the purpose of producing an emoji or whether a previously existing QID entry had been used, with an option of also including a note as to the motivation for implementing that particular emoji. Unicode Inc. would just check that generally and, all being well, would add it to the Noted list. That way there would be a list of what is about in use, even if just a little. Unicode Inc. could increase the category depending upon evidence of use and popularity. That way, there would be publicly accessible lists and maybe that would help. William Overington Friday 8 November 2019
Date/Time: Wed Nov 13 14:29:11 CST 2019
Name: William Overington
Report Type: Public Review Issue
Opt Subject: Public Review 408: QID Emoji
Thinking about the proposal, it has occurred to me that if as well as tag Q for QID emoji UTC were to have in addition the facility to use tag q instead of tag Q, then tag Q could indicate a QID emoji of fixed width format and tag q could indicate a glyph based on the same QID item but not of fixed width. That way unencoded scripts could be got up and running by having a QID item for each character of the script. This would mean that the encoding would not depend upon use of the Private Use Area. An OpenType font could use glyph substitution to produce a display. Not as good as an encoding in regular Unicode, yet capable of being introduced promptly. By introducing this facility, the QID emoji proposal could have uses far beyond just emoji. William Overington Wednesday 13 November 2019
Date/Time: Mon Nov 18 18:49:58 CST 2019
Name: James Kass
Report Type: Public Review Issue
Opt Subject: PRI #408: QID Emoji Sequences
QID Emoji represents an interesting approach to plain-text. The approach is reminiscent of suggestions made in the past to the Unicode public list which were dismissed at the time. For example, the QID material database could be just as simply referenced in plain-text by the following: COMET + CIRCUMFLEX + Q + <the ID number in ASCII> + CIRCUMFLEX + COMET As the creator of the comet circumflex method notes here: http://www.users.globalnet.co.uk/~ngo/c_c00000.htm ... the comet-circumflex string is unlikely to occur elsewhere in plain-text. One advantage of the comet circumflex combination as plain-text mark-up over lengthy strings of TAG characters is that fewer bytes would be needed for each QID emoji. Which means that emoji users would be able to fit more emoji into a single tweet on Twitter. Another advantage is that BMP-only legacy software would already have a perfectly legible built-in fallback display. Legacy software could even be used for input in a pinch. (When the Comet Circumflex System was envisioned in 2002, tweet length constraints weren't much of an issue. The author uses COMET + COMBINING CIRCUMFLEX + other combining characters to mark start/end of the string, along with encircled digits instead of ASCII digits.)
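The byte-count argument in this comment can be checked with a short Python sketch. The assumptions are loud ones: the tag form is taken to be U+1F194 followed by tag Q, tag digits, and CANCEL TAG (inferred from the comments on this page), and the comet form is taken to be U+2604 COMET with the ASCII circumflex U+005E, following the formula given above rather than the 2002 variant with combining characters. The function names are hypothetical.

```python
TAG_OFFSET = 0xE0000  # tag characters mirror ASCII at U+E0000 + code point


def tag_form(qid: str) -> str:
    """Assumed QID tag sequence: U+1F194 + tag characters + CANCEL TAG."""
    tags = "".join(chr(TAG_OFFSET + ord(ch)) for ch in qid)
    return "\U0001F194" + tags + "\U000E007F"


def comet_form(qid: str) -> str:
    """COMET + CIRCUMFLEX + Q + ASCII digits + CIRCUMFLEX + COMET."""
    return "\u2604^" + qid + "^\u2604"


if __name__ == "__main__":
    qid = "Q39058"
    for name, text in (("tag form", tag_form(qid)), ("comet form", comet_form(qid))):
        print(name, len(text.encode("utf-8")), "bytes in UTF-8")
```

For Q39058 the tag form comes to 32 bytes in UTF-8 (eight code points of four bytes each) and the comet form to 14 bytes, since every character in the comet form lies in the BMP.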
Date/Time: Fri Nov 22 15:19:23 CST 2019
Name: Nicholas Felker
Report Type: Feedback on an Encoding Proposal
Opt Subject: Feedback on proposal #408 QID Emoji
I want to provide some feedback on the proposal, as I think there are pros and cons to the approach. On one hand, I do like the effort to scale emoji to enable a broader set of pictorial characters. It would certainly enable novel and unpopular emoji to be used and shared. I think I share some of the concerns of others who have given feedback. With this proposal, suddenly there is the potential to have thousands or millions of emoji based on these identifiers. This would create a significant burden for font developers, especially as at launch none of these will be supported. It would need to be incorporated into system-level keyboards and fonts. The OS vendors may be unlikely to support many of these in their font, which in turn would result in a lack of keyboard support and a lack of usage. One of the cited examples, of a small dog kennel creating their own font, seems potentially feasible but unlikely. In terms of existing 'digital swag' a brand may produce, a font isn't one of them. Let's say it becomes a common thing. How would people install these fonts? On desktops it's not necessarily easy, and on mobile devices it's nearly impossible. It's even worse as one looks to a variety of simpler embedded devices like smartwatches and fitness bands which lack the openness to do anything. A text message containing a specific breed of dog would make no sense on my watch. I think a similar concern could be around the creation of 'bad faith' fonts, which may misuse the system to provide inappropriate or misleading emoji (for example, using a Pepsi emoji for the Coke QID). Right now I just use the font provided by the OS, but in this proposed system one would need to download them, perhaps without a way to verify quality or keep them up to date. Some of the suggested workarounds are also not ideal. Screenreaders may read out a description of the character. 
For screenreaders, to get an accurate description they would need to query the Wikidata API to get a label, at least the first time. This prevents them from working offline entirely or in places with poor connectivity. A tag_base may serve as the start of any QID character to act as the default character, which could result in instances where the meaning changes dramatically (bird -> NATO flag as noted). Relying on the OS or the font to provide appropriate substitutes also seems like an issue, as it requires an Internet connection to do a lookup, as maintaining local copies of everything and keeping them up to date seems highly infeasible.
> If an emoji QID sequence becomes popular, Unicode may define a different RGI representation using a character or sequence to save memory.
I think this sounds good, although it is later noted "We don't anticipate having a normalization process for QID emoji." Aside from saving memory, it suddenly becomes complicated for every vendor to do this mapping, as a QID for a dog should map to the dog emoji. In general, my feedback is that I like the idea, but I have concerns about the implementation and how it may actually work in practice. There are a lot of potential ways for things not to work, creating confusion and incomplete text. I understand that Unicode may not be able to have much control over the implementation by vendors, but providing reference material would be good for verifying the feasibility of this system at scale in real usage.
Date/Time: Mon Nov 25 10:32:13 CST 2019
Name: David Lewis
Report Type: Public Review Issue
Opt Subject: PRI #408: QID Emoji Sequences
I have to agree with others who have posted on this subject. With QID it appears that the Unicode Consortium is for some reason attempting to defeat the entire purpose of the Unicode Consortium. The entire point of Unicode is for one body to decide for all of computing what character a particular sequence of binary digits represents across all implementations around the entire world. It does slow the process of adding new symbols considerably, but in exchange a host of issues are bypassed. If everyone implements Unicode according to the standard, there will never be any more conversion errors again. The character you expect to display is the character you WILL display, if your font supports it. Private use is one thing. That is a part of the standard that is intentionally left non-standard. Those who choose to use private use section understand that they're entering uncharted waters. It has limited use, but those uses are fairly well limited to those things the designers INTEND to be limited. Use of Klingon script on a fan-website is not going to cause problems of a serious nature with other sites who might misinterpret the Klingon script they receive; they're not going to try to receive any, and if they did they wouldn't likely try to interpret it unless they knew what it was and had a Klingon font. QID seems to be taking it to another level, inviting a host of developers to create their own suites of characters in a disorganized, haphazard fashion that's bound to cause the same kinds of overlaps, gaps, and conversion mistakes that required the Unicode consortium to have to be created in the first place. People are probably going to be attempting to communicate using one QID vendor with another person using a different vendor, and all the QIDs will probably wind up not what the person intended to communicate to the other at all. I don't trust a Wiki as a governing body. 
I don't think we should wait a couple of years for QID to get so messed up that a body of individuals have to create a QID Consortium to bring the world to one singular global QID standard. We already have a body that brings the world to one singular global standard for Emojis. The price of standardizing QID to the point that it's rendered usable is higher still than just standardizing Emojis like you already do. If anything, you could simply alter the process to enable a larger number of Emojis to be added every Unicode release. That's much easier, in my opinion, and solves the problem far more gracefully.
Date/Time: Tue Nov 26 07:50:00 CST 2019
Name: David Lewis
Report Type: Public Review Issue
Opt Subject: PRI #408: QID Emoji Sequences
It seems to me that rather than supporting an entirely new mechanism for Unicode to support unsupported emojis, it would be far easier, more sustainable, more effective, and less burdensome on the Unicode Consortium, the public, and vendors for Unicode to just have fewer unsupported emojis. An example would be particular breeds of dog. Why do we need to completely change everything about how emojis work just to get that? We already have something like 18 characters for superhero; 6 skin tones each for male, female, and gender neutral. Why not let all vendors choose whatever breed of medium size dog they like for a default, then use ZWJ and color mechanism (if the vendor even wants to support it) for a black breed of dog, a brown one, a white one, a golden one (golden retriever obvious choice), maybe if a vendor chooses dog + orange can represent a dingo. Another ZWJ modifier for big or small could be included. Dog + small + white could be a Maltese; dog + big + white could be a sheep dog. Dog + black + white could be a dalmatian. If your font doesn't support it, oh well. If your font doesn't support it yet, oh well. Only the most popular breeds need be included first. If the first initial round of extremely popular breed ZWJ sequences get usage, you can consider including more. If we're thinking about creating an entirely new mechanism that may go completely unused, why are we so adamant about avoiding a single character or even a single ZWJ sequence that may go unused? Perhaps the Unicode Consortium could be a little more lenient in terms of potential usage estimates, and a little more lenient in terms of items that are already representable. Yes, trash can + fire does convey the same general idea as "dumpster fire", but not to the level of scale intended. "Garbage fire" is a term for a problem that hasn't reached full "dumpster fire" status. Already representable yes, but not ideally yet in many cases of rejected emojis I've seen in the past. 
Yes, baby face followed by drop might convey crying baby, but is it really that hard to make a glyph of a baby that IS crying to convey the idea more precisely and succinctly? If not worthy of its own code point, could it have a ZWJ sequence? Perhaps, as a new standard operating procedure, if a proposal for a glyph does not meet requirements to be approved for its own code point, it should enter as a contender for being approved as a new ZWJ sequence, even if the author of the proposal didn't initially think of it. It does create more work for the Consortium, but as much additional work as keeping QID from blowing up?
Date/Time: Mon Mar 2 11:35:30 CST 2020
Name: William Overington
Report Type: Feedback on an Encoding Proposal
Opt Subject: Public Review 408: QID Emoji
Having looked at this issue from time to time I write to suggest a compromise possibility that I hope that the Unicode Technical Committee will consider please. The general idea of QID emoji is, in my opinion, good. However there are disadvantages too. How about not using Q but using another capital letter and use a new wiki hosted by Unicode Inc. specifically for the purpose? Then you could have many of the benefits of QID emoji, yet also have very light moderation by an Officer of Unicode Inc. as well. This would also get rid of the issue of whether Unicode Inc. allows that particular QID to be an emoji or not. With your own wiki and your own code space and the light moderation, people can know for certain that if it has been there for more than a few days then it is a permitted emoji. You could also lock a page so that once there it could only be changed, and then only for a minor error or something like that, with the agreement of the moderator: this would provide long term stability. Also, someone could ask the moderator for allocation of a block of code numbers if the proposed emoji would be part of a coherent set. So all of the benefits of a QID emoji system but eliminating the negatives. This system could have much of the freedom that the Private Use Areas provide, yet also have unique encoding for each encoded item and also interoperability from computer to computer across various platforms. William Overington Monday 2 March 2020
Date/Time: Wed Apr 15 05:09:57 CDT 2020
Name: Jonathan Kew
Report Type: Public Review Issue
Opt Subject: Mozilla Feedback on PRI #408 “QID Emoji”
Note: This feedback refers to document L2/20-110.
Mozilla urges the Unicode Consortium not to adopt the QID Emoji proposal. The proposal provides for a mechanism for minting emoji that bypasses the normal Unicode Consortium processes. We believe this would lead to problematic effects. While the foreseeable problems could be argued to have precedent in the sense that similar problems already exist in Unicode, the precedent should be viewed as problems that shouldn't be made worse and should not be viewed as a license to let the problems proliferate. (This is a summary paragraph only; full writeup submitted to UTC via email.)
Date/Time: Mon Apr 20 13:06:34 CDT 2020
Name: William Overington
Report Type: Public Review Issue
Opt Subject: Public Review 408: QID Emoji
The document to which Mr Kew refers is linked below.
https://www.unicode.org/L2/L2020/20110-qid-emoji.pdf
In that document Mr Sivonen writes as follows.
> Mozilla urges the Unicode Consortium not to adopt the QID Emoji proposal.
Mr Sivonen later writes as follows.
> There doesn’t appear to be a good reason to believe that if QID emoji was implemented, the mechanism would stay scoped to emoji and wouldn’t be used for encoding genuinely textual characters in a way that would circumvent Unicode processes.
If the choices available for the Unicode Technical Committee were to say either 'yes' or 'no' to the proposal, then I suggest that 'no' would be the better decision. Yet the Unicode Technical Committee is not restricted to those two choices. For example, the discussion could be scheduled for the first day and the proposer could be informed that the proposal is not accepted in its proposed form, yet, if a revised proposal is submitted for consideration later in the week, after some discussions in ad hoc meetings, then the revised proposal will be considered as if it is a fresh proposal and a decision reached. So, if the idea of the QID wiki being used is dropped and a database under the control of Unicode Inc. is used instead, where there is usually very light moderation, but firm moderation is always possible, and maybe a few other changes are made if the Unicode Technical Committee opines that they are necessary or desirable, then the main intent of the proposal can become implemented, yet in a way that avoids the good name of Unicode Inc. being potentially dragged through the gutter by something that is put in a wiki over which Unicode Inc. has no control. I opine that the use of a base character and a sequence of tag characters is sound, and could be extended to other items beyond emoji, and could be a catalyst for a renaissance of creativity using information technology in an interoperable manner, yet not filling up the Unicode character map. 
The big problem with the original proposal is linking it to an external wiki. The Unicode Technical Committee has the opportunity to allow progress to flourish into the future by using the good parts of the original proposal in a rigorously controlled manner. William Overington Monday 20 April 2020
Date/Time: Mon Apr 20 14:31:58 CDT 2020
Name: Denny Vrandecic
Report Type: Public Review Issue
Opt Subject: PRI 408 QID Emoji - Feedback
This is formal feedback to PRI issue #408 regarding the proposal of QID Emoji. Wikidata is a Wikimedia project and follows the principles of open knowledge creation and curation that have led Wikipedia to be the project it is today. Wikidata’s goal is to allow everyone to share in an open knowledge graph that anyone can edit and use. Wikidata has more than 25,000 monthly contributors, and has seen more than 1.1 billion edits, creating more than 80 million Items. Each of these Items is identified by what we call a QID (short, for Q-Identifier, as the identifiers are starting with the letter Q and followed by a number). These QIDs are meant to be quite stable: a QID can get discontinued when an Item is deleted, but the QID then never gets reused, thus not leading to ambiguity. A QID can also be forwarded to another QID when two Items are merged, but in this case the QID and their relation is recorded. Deletions happen rarely, and by definition only for Items that are not notable. The QIDs for almost all Items of wider interest have remained stable since their creation. Wikidata provides a service to resolve QIDs and get back human- and machine-readable names and descriptions of the Items of interest. Wikidata has become a major authority hub for identity. Not because of complex processes and selective contribution requirements, but on the contrary, because of the ease of contributing and its adherence to Wikipedia’s principles of openness and inclusion. Wikidata links together several thousand databases and authority files, allowing to swiftly join data indexed with ICD identifiers and Dewey Decimal Classification codes. This has led to Wikidata being described as a crystallization point of identifiers, as an authority file of authority files, or as a modern Stone of Rosetta. Even more importantly, although Wikidata only launched a few years ago, it is already being used by a growing number of institutions as an important authority file. 
These institutions include, but are not limited to: the US Library of Congress, the German National Library, the Virtual International Authority File (VIAF), The New York Times, Google, the Museum of Modern Art, iNaturalist, Carnegie Hall, MusicBrainz, OpenStreetMap, Schema.org, Quora, OCLC WorldCat, and many more. Given that these and other authorities already rely on and trust Wikidata and its open processes to curate a comprehensive and current catalogue of identifiers, we are humbled and pleased to learn about the proposal to the Unicode Consortium to consider using Wikidata QIDs as an additional approach to identify the meaning of an emoji. We understand that this would allow stakeholders to expediently introduce new emojis, to measure their real-world adoption, and to provide unambiguous and stable emoji tag sequences. We think that this is a great application of Wikidata as an identifier catalogue, and we fully support this proposal.
Lydia Pintscher, Wikimedia Deutschland, Product Manager Wikidata
Denny Vrandečić, Founder Wikidata
Joint statement
P.S.: If of interest, the Wikidata community already records a few thousand Unicode characters as being identified with a given QID. We think that this kind of mapping could be useful to stakeholders, for example for some form of normalization or fallback. As of the time of writing, there are 9,913 such mappings using the Property P487 (see https://w.wiki/NRB for a current list).
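The QID shape described in the statement above (the letter Q followed by a number) can be checked mechanically, and the resulting identifier can be spelled out in tag characters. The following is a minimal sketch, not the normative definition from the PRI: the function names are hypothetical, the "no leading zeros" rule is an assumption, and the sequence layout (U+1F194 SQUARED ID as base, one tag character per QID character, then U+E007F CANCEL TAG) is assumed for illustration from how existing emoji tag sequences are built.

```python
import re

# Assumption: a QID is "Q" followed by a positive integer without leading zeros.
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")

TAG_OFFSET = 0xE0000  # tag characters U+E0020..U+E007E mirror ASCII 0x20..0x7E
CANCEL_TAG = 0xE007F  # U+E007F CANCEL TAG terminates a tag sequence
BASE_ID = 0x1F194     # U+1F194 SQUARED ID, assumed here as the base emoji


def is_qid(text: str) -> bool:
    """Return True if text has the shape of a QID, e.g. "Q42"."""
    return QID_PATTERN.fullmatch(text) is not None


def qid_tag_sequence(qid: str) -> str:
    """Spell a QID in tag characters after the base, ending with CANCEL TAG."""
    if not is_qid(qid):
        raise ValueError(f"not a QID: {qid!r}")
    tags = "".join(chr(TAG_OFFSET + ord(ch)) for ch in qid)
    return chr(BASE_ID) + tags + chr(CANCEL_TAG)


if __name__ == "__main__":
    seq = qid_tag_sequence("Q42")
    # Print the code points rather than the (mostly invisible) characters.
    print(" ".join(f"U+{ord(c):04X}" for c in seq))
```

A sketch like this only checks the form of an identifier; whether a given QID still exists, has been deleted, or has been forwarded to another QID would have to be looked up in Wikidata itself.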
Date/Time: Thu Jun 4 15:04:55 CDT 2020
Name: William Overington
Report Type: Public Review Issue
Opt Subject: Public Review 408: QID Emoji
I opine that when considering a new idea it is important to be prepared to suspend disbelief and consider whether any parts of the idea are good, rather than judging only the idea as a whole. I opine that the QID Emoji proposal has some very good aspects but is somewhat unstable as a whole. So, if those in favour of the proposal and those against are each willing to be like the strongest trees and sway in the breeze, then the good parts of the proposal could become available in a stable manner. For example, maybe registration in a Unicode Inc. database, with the option of a cross-reference link to a QID, would mean that only those QIDs for which someone wants an emoji would be in the Unicode Inc. database, and a gentle moderation policy could be used to stop ambiguity and duplication. So maybe shorter codes. What if U+FFF0 is defined, mutatis mutandis, as effectively what would be a ligature of the ID emoji and tag Q in the original proposal, U+FFF8 is defined as the corresponding CANCEL, and circled digits are used? These are all in the Basic Multilingual Plane, so there would be fewer bytes for each such character and a graceful indicative fallback facility built in. I realize that the original proposal can be implemented with existing technology, and that the changes I suggest would require changes to The Unicode Standard and possibly also to software packages, though perhaps no more than the software accepting U+FFF0 and U+FFF8 as valid characters. That could be done in time if there is the will to do so, and whatever solution is implemented is likely to be there for a very long time. Would those two changes both go a long way towards making a solution that is acceptable to everybody? I may not have solved every objection, and what I suggest does change the original. Yet this is research for the future. 
So, if people agree, please say so; if not, then please say what I have missed or got wrong and what needs fixing, and then, as a group effort, maybe we can iterate in a constructive way and achieve a good solution acceptable to everybody. William Overington, Thursday 4 June 2020
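The byte-count point in the comment above can be checked with a little arithmetic. This is a sketch under stated assumptions, not part of either proposal: the helper names are hypothetical, the tag form is assumed to be U+1F194, tag Q, tag digits, then U+E007F CANCEL TAG, and the suggested form is assumed to be U+FFF0, circled digits (U+24EA for zero, U+2460 to U+2468 for one to nine), then U+FFF8. U+FFF0 and U+FFF8 are currently unassigned code points.

```python
# Circled digits: U+24EA is circled zero, U+2460..U+2468 are circled 1..9.
CIRCLED = {"0": 0x24EA, **{str(d): 0x2460 + d - 1 for d in range(1, 10)}}


def tag_form(number: str) -> str:
    """Assumed original layout: ID emoji, tag Q, tag digits, CANCEL TAG."""
    digits = "".join(chr(0xE0000 + ord(d)) for d in number)
    return chr(0x1F194) + chr(0xE0051) + digits + chr(0xE007F)


def bmp_form(number: str) -> str:
    """Assumed suggested layout: U+FFF0, circled digits, U+FFF8."""
    digits = "".join(chr(CIRCLED[d]) for d in number)
    return chr(0xFFF0) + digits + chr(0xFFF8)


def sizes(text: str) -> dict:
    """Encoded length in bytes for UTF-8 and UTF-16 (without a BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


if __name__ == "__main__":
    for name, build in (("tag form", tag_form), ("BMP form", bmp_form)):
        print(name, sizes(build("42")))
```

Every code point in the tag form lies outside the Basic Multilingual Plane and so takes four bytes in both UTF-8 and UTF-16, while each code point in the suggested form takes three bytes in UTF-8 and two in UTF-16; the suggested form also uses one code point fewer because the base and the Q are merged.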
Feedback above this line was reviewed prior-to or during UTC #165.
Date/Time: Wed Jan 6 15:36:25 CST 2021
Name: asmus
Report Type: Public Review Issue
Opt Subject: Wrong closing date for PRI #408
The web page (https://www.unicode.org/review/pri408/) lists an incorrect closing date of 2020.09.15, thereby falsely indicating that comments are no longer possible and making it impossible to know whether one is still in the window for comments. The only feasible remedy is to re-issue the PRI with the correct information and not to take action on it during the current UTC meeting (other than perhaps withdrawing the proposal). A./ PS: I fully endorse the comments by Ms. Buff. I believe the proposal to be fatally flawed for the reasons she articulates so well. It should be withdrawn with prejudice. PPS: Whenever a PRI for substantially the *same issue* is re-issued, all the prior comments must be retained; it should not turn into a war of attrition to see which submitters tire of repeating their comments.
Date/Time: Sat Apr 10 05:39:57 CDT 2021
Contact: wjgo_10009@btinternet.com
Name: William Overington
Report Type: Public Review Issue
Opt Subject: Public Review 408 QID Emoji
May I add some comments in relation to my comment that has the timestamp Thu Nov 7 09:55:28 CST 2019 please? I have recently decided that each localizable sentence should have an associated glyph. I originally had a glyph for each localizable sentence, but later thought that glyphs would not be needed for many localizable sentences. However, research is research and a researcher needs to be willing to change direction if that seems desirable. So, there are already glyphs for many of the localizable sentences that I have encoded for my research project on communication through the language barrier. I have decided that each localizable sentence encoded must have a glyph. This link is to a document from some years ago, yet the content is still valid. http://www.users.globalnet.co.uk/~ngo/locse027.pdf This is a link to a current forum thread where localizable sentence glyphs are used in a poem, and seven new glyphs are introduced. https://forum.affinity.serif.com/index.php?/topic/138654-artwork-for-greetings-cards Yet the poems are just one aspect of the project. An application that I consider important is seeking information through the language barrier about relatives and friends after a disaster. An example of how this could work is available in a PDF slide show. My proposal to encode localizable sentences is with the ISO/TC 37 committee, having been forwarded by the United Kingdom National Body. There was a good possibility that my slide show would be presented by a Member of the United Kingdom delegation, who had kindly offered to do so, at the plenary meeting of ISO/TC 37 that was due to take place in June 2020, but alas the meeting did not take place due to the pandemic. The slide show has been with the ISO/TC 37 committee since 2019. The slide show and some other documents are available from the following web page. 
http://www.users.globalnet.co.uk/~ngo/localizable_sentences_research.htm I know that this idea is not QID Emoji, yet I hope that you will give it serious consideration please as part of the Public Review, as whatever format you decide to recommend for QID emoji, the same format with some small change could be used for localizable sentences. That would make localizable sentences uniquely and unambiguously encoded within regular Unicode and thus be able to be applied throughout the world by anyone who wanted to do so, free of any concerns or issues about intellectual property rights. William Overington Saturday 10 April 2021
Feedback above this line was reviewed prior-to or during UTC #167.
Date/Time: Fri Jun 11 13:02:36 CDT 2021
Name: Rick McGowan
Report Type: Public Review Issue
Opt Subject: PRI #408 - other feedback/discussion documents
The following UTC documents also contain significant feedback and/or discussion for this PRI:
L2/19-082R: QID Emoji Proposal https://www.unicode.org/L2/L2019/19082r-qid-emoji.pdf
L2/20-110: Mozilla Feedback on PRI #408 “QID Emoji” https://www.unicode.org/L2/L2020/20110-qid-emoji.pdf
L2/21-078: Future Unicode Emoji Options https://www.unicode.org/L2/L2021/21078-emoji-future.pdf
L2/21-099: On Accumulated Feedback on QID https://www.unicode.org/L2/L2021/21099-qid-feedback.pdf