This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.
Date/Time: Tue Mar 7 05:59:41 CST 2017
Name: William Overington
Report Type: Public Review Issue
Opt Subject: PRI #348 Length of Tag Sequences
>> Review Note: The UTC is considering limiting the total length of possible valid
>> tag sequences to some value such as 32. It would appreciate feedback on whether
>> the length of tag sequences should be limited, and if so, to what length.

In ED-14a there is the following.

>> Though tag_spec includes the values U+E0041 TAG LATIN CAPITAL LETTER A .. U+E005A
>> TAG LATIN CAPITAL LETTER Z, they are not used currently and are reserved for future
>> extensions.

It would be possible to specify that the length of a particular tag sequence is limited to a particular value such as 32 unless the first character of tag_spec is a TAG LATIN CAPITAL LETTER that has the effect of increasing the limit.

That would mean that for many emoji characters there would be a limit, thus perhaps helping in the implementation of display technology, yet that restriction would not prevent future expansion of the system so that, for example, a vector glyph in a platform-independent colour-font-style contour format could be expressed using tag characters.

By having a method so that longer tag sequences are available it will be possible in the future, if that is what people choose to do, to include in a Unicode character stream such things as a three-dimensional model of a biochemical molecule that an end user may rotate so as to observe the model from various angles, without a general absolute restriction of tag sequence length stopping such a future development.

I appreciate that such display possibilities are perhaps not within the scope of Unicode at the present time, yet they and other new ideas might possibly become part of the everyday use of Unicode in the future. Unicode is being built to last into the future, so I opine that it is important to provide the infrastructure such that new ideas can flourish.

William Overington
Tuesday 7 March 2017
Date/Time: Fri Mar 3 15:51:34 CST 2017
Name: Anonymous
Report Type: Public Review Issue
Opt Subject: PRI #348: editorial correction
Annex B says: "While the syntax of an emoji *emoji flag sequence* is defined in *ED-14*, ..." I suggest removing the first repetition of "emoji" (the one not in bold italics). Please consider this an anonymous submission, i.e. do not add my name to the Acknowledgments section.
Date/Time: Fri Mar 17 05:22:50 CDT 2017
Name: Christoph Päper
Report Type: Public Review Issue
Opt Subject: PRI#348 Formal Definitions
Some definitions in [section 1.4](http://www.unicode.org/reports/tr51/proposed.html#Definitions) lack a formal BNF-like specification. It may be helpful to have one for all of them. They also do not follow the notation given in Appendix A of TUS exactly, e.g. `\x{ABCDEF}` instead of `\uABCD` and `\U00ABCDEF` (or `U+ABCD` and `U-00ABCDEF`) for code points.

I also believe that a definition for something like _emoji presentation base_ would be good, so it could be used in ED-8a and 9a instead of the more generic _emoji character_. It may also be helpful to define terms for the columns in <http://unicode.org/emoji/charts-beta/text-style.html>.

Here is my attempt at a complete grammar:

    ; Emoji Characters
    emoji_character := \p{Emoji}

    ; Emoji Presentation
    default_emoji_presentation_character := \p{Emoji_Presentation} ;= `+EP`
    default_text_presentation_character := [\p{Emoji} - \p{Emoji_Presentation}]
      ;= emoji_character - default_emoji_presentation_character ;= `-EP`

    ; Emoji and Text Presentation Sequences
    text_presentation_selector := \uFE0E
    text_presentation_sequence := emoji_presentation_base text_variation_selector
    emoji_presentation_selector := \uFE0F
    emoji_presentation_sequence := emoji_presentation_base emoji_variation_selector
    emoji_presentation_base ;= `+EPSq`
      := [ \u0023 \u002A \u0030-\u0039 \u00A9 \u00AE \u203C \u2049 \u2122 \u2139
           \u2194-\u2199 \u21A9-\u21AA \u231A-\u231B \u2328 \u23CF \u23E9-\u23EA
           \u23ED-\u23EF \u23F1-\u23F3 \u23F8-\u23F9 \u23FA \u24C2 \u25AA-\u25AB
           \u25B6 \u25C0 \u25FB-\u25FE \u2600-\u2604 \u260E \u2611 \u2614-\u2615
           \u2618 \u261D \u2620 \u2622-\u2623 \u2626 \u262A \u262E-\u262F
           \u2638-\u263A \u2640 \u2642 \u2648-\u2653 \u2660 \u2663 \u2665-\u2666
           \u2668 \u267B \u267F \u2692-\u2697 \u2699 \u269B-\u269C \u26A0-\u26A1
           \u26AA-\u26AB \u26B0-\u26B1 \u26BD-\u26BE \u26C4-\u26C5 \u26C8 \u26CF
           \u26D1 \u26D3-\u26D4 \u26E9-\u26EA \u26F0-\u26F5 \u26F7-\u26F9 \u26FA
           \u26FD \u2702 \u2708-\u2709 \u270C-\u270D \u270F \u2712 \u2714 \u2716
           \u271D \u2721 \u2733-\u2734 \u2744 \u2747 \u2753 \u2757 \u2763-\u2764
           \u27A1 \u2934-\u2935 \u2B05-\u2B07 \u2B1B-\u2B1C \u2B50 \u2B55 \u3030
           \u303D \u3297 \u3299
           \U0001F004 \U0001F170 \U0001F171 \U0001F17E-\U0001F17F \U0001F202
           \U0001F21A \U0001F22F \U0001F237 \U0001F30D-\U0001F30F \U0001F315
           \U0001F31C \U0001F321 \U0001F324-\U0001F32C \U0001F336 \U0001F378
           \U0001F37D \U0001F393 \U0001F396-\U0001F397 \U0001F399
           \U0001F39A-\U0001F39B \U0001F39E-\U0001F39F \U0001F3A7
           \U0001F3AC-\U0001F3AE \U0001F3C2 \U0001F3C4 \U0001F3C6
           \U0001F3CA-\U0001F3CE \U0001F3D4-\U0001F3E0 \U0001F3ED \U0001F3F3
           \U0001F3F5 \U0001F3F7 \U0001F408 \U0001F415 \U0001F41F \U0001F426
           \U0001F43F \U0001F441 \U0001F442 \U0001F446-\U0001F449
           \U0001F44D-\U0001F44E \U0001F453 \U0001F46A \U0001F47D \U0001F4A3
           \U0001F4B0 \U0001F4B3 \U0001F4BB \U0001F4BF \U0001F4CB \U0001F4DA
           \U0001F4DF \U0001F4E4-\U0001F4E6 \U0001F4EA-\U0001F4ED \U0001F4F7
           \U0001F4F9 \U0001F4FA-\U0001F4FB \U0001F4FD \U0001F508 \U0001F50D
           \U0001F512-\U0001F513 \U0001F549 \U0001F54A \U0001F550-\U0001F567
           \U0001F56F \U0001F570 \U0001F573-\U0001F579 \U0001F587
           \U0001F58A-\U0001F58D \U0001F590 \U0001F5A5 \U0001F5A8
           \U0001F5B1-\U0001F5B2 \U0001F5BC \U0001F5C2-\U0001F5C4
           \U0001F5D1-\U0001F5D3 \U0001F5DC-\U0001F5DE \U0001F5E1 \U0001F5E3
           \U0001F5E8 \U0001F5EF \U0001F5F3 \U0001F5FA \U0001F610 \U0001F687
           \U0001F68D \U0001F691 \U0001F694 \U0001F698 \U0001F6AD \U0001F6B2
           \U0001F6B9 \U0001F6BA \U0001F6BC \U0001F6CB \U0001F6CD-\U0001F6CF
           \U0001F6E0-\U0001F6E5 \U0001F6E9 \U0001F6F0 \U0001F6F3 ]

    ; Text vs. Emoji
    emoji_only_character := default_emoji_presentation_character - emoji_presentation_base
      ;= `+EP -EPSq` ;~ emoji_character - emoji_presentation_base
    emoji_opt-out_character := emoji_presentation_base - default_text_presentation_character
      ;= `+EP +EPSq`
    emoji_opt-in_character := emoji_presentation_base - default_emoji_presentation_character
      ;= `-EP +EPSq`

    ; Emoji Modifiers
    emoji_modifier := \p{Emoji_Modifier}
    emoji_modifier_base
      := [ \u261D \u26F9 \u270A-\u270D \U0001F385 \U0001F3C2-\U0001F3C4 \U0001F3C7
           \U0001F3CA-\U0001F3CC \U0001F442-\U0001F443 \U0001F446-\U0001F450
           \U0001F466-\U0001F478 \U0001F47C \U0001F481-\U0001F483
           \U0001F485-\U0001F487 \U0001F4AA \U0001F574-\U0001F575 \U0001F57A
           \U0001F590 \U0001F595-\U0001F596 \U0001F645-\U0001F647
           \U0001F64B-\U0001F64F \U0001F6A3 \U0001F6B4-\U0001F6B6 \U0001F6C0
           \U0001F6CC \U0001F918-\U0001F91E \U0001F926 \U0001F930
           \U0001F933-\U0001F939 \U0001F93C-\U0001F93E ]
    emoji_modifier_sequence := emoji_modifier_base emoji_modifier

    ; Emoji Sequences
    emoji_flag_sequence := regional_indicator regional_indicator
    emoji_tag_sequence := tag_base tag_spec tag_term
    tag_base := emoji_character | emoji_modifier_sequence | emoji_presentation_sequence
    tag_spec := [\U000E0020-\U000E007E]+
    tag_term := \U000E007F
    emoji_combining_sequence := ( emoji_character | emoji_presentation_sequence | text_presentation_sequence ) non_spacing_mark*
    emoji_keycap_sequence := emoji_keycap_base emoji_variation_selector combining_keycap
    emoji_keycap_base := [0-9#*]
    combining_keycap := \u20E3
    emoji_core_sequence := emoji_combining_sequence | emoji_modifier_sequence | emoji_flag_sequence
    emoji_zwj_element := emoji_character | emoji_presentation_sequence | emoji_modifier_sequence
    fully-qualified_emoji_zwj_element := default_emoji_presentation_character | emoji_presentation_sequence | emoji_modifier_sequence
    emoji_zwj_sequence := emoji_zwj_element ( ZWJ emoji_zwj_element )+
    ZWJ := \u200D
    emoji_sequence := emoji_core_sequence | emoji_zwj_sequence | emoji_tag_sequence
    fully-qualified_emoji_zwj_sequence := ( fully-qualified_emoji_zwj_element ( ZWJ fully-qualified_emoji_zwj_element )+ ) - text_presentation_selector
    non-fully-qualified_emoji_zwj_sequence := emoji_zwj_sequence - fully-qualified_emoji_zwj_sequence
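As an illustration only (not part of the submission), the `tag_spec` and `tag_term` rules from the grammar above can be sketched in Python. The function name `split_tag_sequence` is hypothetical. The sketch assumes that everything before the trailing tag characters is the candidate `tag_base`, and it does not validate that base, because Python's standard `re` module has no Emoji property support.

```python
import re

# tag_spec := [\U000E0020-\U000E007E]+ and tag_term := \U000E007F,
# anchored at the end of the string.
TAG_TAIL = re.compile("([\U000E0020-\U000E007E]+)\U000E007F\\Z")


def split_tag_sequence(text):
    """Return (base, spec) if text ends in tag_spec tag_term, else None.

    Assumption: the base is whatever precedes the tag characters; it is
    NOT checked against the tag_base rule of the grammar.
    """
    match = TAG_TAIL.search(text)
    if match is None or match.start() == 0:
        # No tag tail, or nothing in front of it to act as a base.
        return None
    base = text[:match.start()]
    # Map each tag character back to its ASCII counterpart (U+E0061 -> "a").
    spec = "".join(chr(ord(c) - 0xE0000) for c in match.group(1))
    return base, spec


# WAVING BLACK FLAG followed by the tag characters for "gbeng" and CANCEL TAG.
sample = "\U0001F3F4\U000E0067\U000E0062\U000E0065\U000E006E\U000E0067\U000E007F"
print(split_tag_sequence(sample))
```

A full check would additionally need Emoji property data to test the base, which is why the sketch stops at the structural split.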
Date/Time: Wed Apr 5 10:22:58 CDT 2017
Name: David Corbett
Report Type: Public Review Issue
Opt Subject: PRI #348: 3-digit flag emoji tag sequences
According to the rules in section C.1 of the proposed update, a 3-digit unicode_region_subtag is only a valid emoji flag tag if its idStatus is equal to "regular" or "deprecated". However, all the 3-digit unicode_region_subtags have an idStatus of "macroregion". Therefore, there are no valid 3-digit unicode_region_subtag flag emoji tag sequences. Defining validity rules for something such that nothing is valid might confuse some implementers, so I suggest clarifying that this is intentional.
Date/Time: Wed Apr 5 10:57:27 CDT 2017
Name: David Corbett
Report Type: Public Review Issue
Opt Subject: PRI #348: 2-letter flag emoji tag sequences
babelstone.co.uk/Fonts/Flags.html says:

>> BabelStone Flags also supports Flag Emoji tag sequences for ISO 3166-1
>> two-letter country codes (i.e. the US flag can be represented either as
>> <1F1FA 1F1F8> or as <1F3F4 E0075 E0073 E007F>). I am unclear from
>> the rather imprecise description of Flag Emoji tag sequences given in Annex C
>> of Unicode Technical Standard #51 whether this is conformant with the Unicode
>> Standard or not; but it seems logical to support all geopolitical flags as tag sequences.

Consider adding a note to clarify the validity of such sequences.
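To make the two representations in the quotation concrete, here is a minimal Python sketch that builds both code point sequences from a two-letter code. The function names are hypothetical. The sketch only constructs the sequences; whether the tag-sequence form is valid for a two-letter code is exactly the open question raised above.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A
BLACK_FLAG = "\U0001F3F4"       # WAVING BLACK FLAG, used as the tag base
CANCEL_TAG = "\U000E007F"       # tag_term
TAG_OFFSET = 0xE0000            # tag character = ASCII code + 0xE0000


def _check_two_letters(code):
    if len(code) != 2 or not all("a" <= c <= "z" for c in code.lower()):
        raise ValueError("expected two ASCII letters")


def flag_as_regional_indicators(code):
    """Encode a two-letter code as a pair of regional indicator symbols."""
    _check_two_letters(code)
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(c) - ord("A")) for c in code.upper()
    )


def flag_as_tag_sequence(code):
    """Encode a two-letter code as base + lowercase tag letters + CANCEL TAG."""
    _check_two_letters(code)
    tags = "".join(chr(TAG_OFFSET + ord(c)) for c in code.lower())
    return BLACK_FLAG + tags + CANCEL_TAG


def code_points(text):
    """Format a string as space-separated hexadecimal code points."""
    return " ".join(f"{ord(c):04X}" for c in text)


print(code_points(flag_as_regional_indicators("US")))  # 1F1FA 1F1F8
print(code_points(flag_as_tag_sequence("US")))         # 1F3F4 E0075 E0073 E007F
```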
Date/Time: Wed Apr 19 20:55:07 CDT 2017
Name: cketti
Report Type: Public Review Issue
Opt Subject: PRI #348: Error in section C.1.3
In the table in section C.1.3 the third content row uses <BLACK FLAG> as base character in the 'Sequence' column, but uses "A" in the 'Rec. Images' and 'TU' columns. <BLACK FLAG> should be used in those columns. Furthermore, the sample sequence uses an incorrect subdivision identifier. As a result, the row no longer isolates the one particular condition it is meant to demonstrate as making a tag sequence ill-formed.
Date/Time: Wed Apr 26 16:18:17 CDT 2017
Name: Doug Ewell
Report Type: Public Review Issue
Opt Subject: PRI #348
I oppose the classification in C.1.1 and corresponding data files of three valid emoji tag sequences as "standard" and of thousands of other valid sequences, including four listed in the table, as "not standard." A potentially useful mechanism is made much less useful by this exclusion.
Date/Time: Fri Apr 28 06:46:18 CDT 2017
Name: William Overington
Report Type: Public Review Issue
Opt Subject: PRI #348 Length of Tag Sequences
Yesterday the document was changed. There is now the following.

>> Review Note: The following constraint is proposed to limit the length of tag sequences, to prevent parsers from having to deal with unbounded sequences. The UTC would appreciate feedback on this.
>> There is one common constraint on valid emoji tag sequences: the tag_spec must not be longer than 32 code points.

I oppose that absolute limit. I opine that introducing such an absolute limit could stop progress and development. I opine that the Unicode Technical Committee should encourage new ideas for the future.

In the Review Note the constraint is proposed "to prevent parsers from having to deal with unbounded sequences." There is a big difference between a tag sequence that is more than 32 code points long and a tag sequence of unbounded length.

There could be a rule that there is often a limit of 32 code points, and that any sequence of tag code points longer than 32 code points starts with a tag character indicating that it is longer: a U+E007C TAG VERTICAL LINE as the first character of the sequence. Such a sequence could start by indicating its total length using a U+E007C TAG VERTICAL LINE character followed by a sequence of tag digit characters followed by another U+E007C TAG VERTICAL LINE.

Making such a rule now would indicate to implementers the idea of including in their software a way to check whether a tag sequence starts with a U+E007C TAG VERTICAL LINE. That would be easy for UTC to do now, would be straightforward for developers now, and would provide an infrastructure for the future.

For example, in my earlier feedback I mentioned the possibility that a vector glyph in a platform-independent colour-font-style contour format could be expressed using tag characters. I later elaborated on that in two mailing list posts.

http://www.unicode.org/mail-arch/unicode-ml/y2017-m04/0034.html
http://www.unicode.org/mail-arch/unicode-ml/y2017-m04/0080.html

Limiting the length of all tag sequences to 32 code points would stop that being implemented.

As another example, in my earlier feedback I mentioned the possibility of including in a Unicode character stream, in the future, if that is what people choose to do, such things as a three-dimensional model of a biochemical molecule that an end user may rotate so as to observe the model from various angles. Recently an emoji for a molecule has been proposed. Suppose that it becomes encoded as an emoji and could be used as the base character for a tag sequence defining a particular molecule. If such a sequence starts with a U+E007C TAG VERTICAL LINE character, then some tag digit characters to specify the total length of the tag sequence, then another U+E007C TAG VERTICAL LINE character, such developments would be possible in a straightforward manner without disrupting the usual limit of 32 tag characters in a tag sequence.

So, if UTC chooses to often, or even usually, have a 32 code point limit on the length of a tag sequence, then fine, yet while setting that limit please specify the U+E007C TAG VERTICAL LINE method suggested above, so that the Unicode Technical Committee encourages progress by implementing an infrastructure that allows futuristic developments to take place.

William Overington
Friday 28 April 2017
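As an illustration only, the length rule suggested above might be checked along the following lines. This is a sketch under stated assumptions, not a specified algorithm: the function names are hypothetical, and because the submission speaks of a "total length" without fixing what is counted, the sketch assumes the declared number counts only the tag characters that follow the second TAG VERTICAL LINE.

```python
TAG_OFFSET = 0xE0000
TAG_VERTICAL_LINE = "\U000E007C"
DEFAULT_LIMIT = 32  # the limit proposed in the Review Note


def to_tags(ascii_text):
    """Helper for examples: map printable ASCII to tag characters."""
    return "".join(chr(TAG_OFFSET + ord(c)) for c in ascii_text)


def tag_spec_length_ok(spec):
    """Check a tag_spec against the default limit or its declared length.

    Assumption: a spec starting with TAG VERTICAL LINE carries a prefix of
    the form |digits| and the digits give the length of the remainder.
    """
    if not spec or not all(0xE0020 <= ord(c) <= 0xE007E for c in spec):
        return False  # not a well-formed tag_spec at all
    if not spec.startswith(TAG_VERTICAL_LINE):
        return len(spec) <= DEFAULT_LIMIT
    end = spec.find(TAG_VERTICAL_LINE, 1)
    if end == -1:
        return False  # opening vertical line without a closing one
    digits = "".join(chr(ord(c) - TAG_OFFSET) for c in spec[1:end])
    if not (digits.isascii() and digits.isdigit()):
        return False  # empty or non-numeric length field
    payload = spec[end + 1:]
    return len(payload) == int(digits)


print(tag_spec_length_ok(to_tags("gbeng")))
print(tag_spec_length_ok(to_tags("a" * 33)))
print(tag_spec_length_ok(to_tags("|40|" + "a" * 40)))
```

Under these assumptions a parser reads at most the short prefix before it knows how many tag characters to expect, so a sequence longer than 32 code points is still bounded.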
Feedback above this line was reviewed by the Emoji Subcommittee prior to UTC #151.