Accumulated Feedback on PRI #430

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.


Date/Time: Sun Jun 27 12:38:08 CDT 2021
Name: Alexei Chimendez
Report Type: Error Report
Opt Subject: Use of CANCEL TAG in emoji flags

(Note: This report has also been added to general feedback for the Editorial Committee.)

UTS #51 allows for the interchange of various flags through "emoji tag
sequences", specified as: an emoji character or sequence, followed by one
or more component characters from the Tags block, and terminated with the
character CANCEL TAG.
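
The structure just described can be sketched in a few lines of Python. This is only an illustration, not a conformant UTS #51 parser: the function name `split_emoji_tag_sequence` is hypothetical, the base is assumed to be a single code point (real bases may be longer sequences), and the component range U+E0020..U+E007E is taken from the tag_spec definition in UTS #51.

```python
# Tag component characters mirror printable ASCII at an offset of 0xE0000.
TAG_OFFSET = 0xE0000
TAG_SPEC_FIRST = 0xE0020   # TAG SPACE
TAG_SPEC_LAST = 0xE007E    # TAG TILDE
CANCEL_TAG = 0xE007F       # CANCEL TAG


def is_tag_component(cp):
    """True if cp is a tag component character (assumed range, see lead-in)."""
    return TAG_SPEC_FIRST <= cp <= TAG_SPEC_LAST


def split_emoji_tag_sequence(text):
    """Return (base, tag_spec_as_ascii) if text has the shape
    base + one or more tag components + CANCEL TAG, otherwise None.

    Simplifying assumption: the base is exactly one code point.
    """
    cps = [ord(ch) for ch in text]
    if len(cps) < 3 or cps[-1] != CANCEL_TAG:
        return None
    base, spec = cps[0], cps[1:-1]
    if is_tag_component(base) or base == CANCEL_TAG:
        return None
    if not all(is_tag_component(cp) for cp in spec):
        return None
    return chr(base), "".join(chr(cp - TAG_OFFSET) for cp in spec)


# Usage: the flag of England is U+1F3F4 followed by the tags for "gbeng".
england = "\U0001F3F4" + "".join(chr(TAG_OFFSET + ord(c)) for c in "gbeng") + chr(CANCEL_TAG)
print(split_emoji_tag_sequence(england))
```

Note that the terminating CANCEL TAG here is preceded only by tag component characters, which is the situation the rest of this report is concerned with.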

In the Unicode Standard, sec. 23.9 reads:

> There are two uses of cancel tag. To cancel a tag value of a particular
 type, prefix the cancel tag character with the tag identification
 character of the appropriate type. [...] To cancel any tag values of any
 type that may be in effect, use cancel tag without a prefixed tag
 identification character.

Continuing, it specifies:

> Inserting a bare cancel tag in places where only the language tag needs
 to be canceled could lead to unanticipated side effects if this text were
 to be inserted in the future into a text that supports more than one tag
 type.

However, the use of CANCEL TAG in flags is, in effect, a "bare cancel tag",
because it is not preceded by a tag identification character (it is only
preceded by tag component characters). The presence of an emoji flag in a
text may thus inadvertently cause the canceling of all applicable tags.

While the Standard currently only specifies one kind of tag (the language
tag, which is "strongly discouraged"), the use of CANCEL TAG in emoji flags
may cause issues if other kinds of tags are introduced in the future, or
for applications or protocols that make use of "private use" tags to signal
in-band information.

The simplest solution is to change the wording in sec. 23.9 to read:

> To cancel any tag values of any type that may be in effect, use cancel
 tag without a prefixed tag identification character or other tag
 character.

With this change, the CANCEL TAG character in the sequence

> U+1F3F4 U+E0066 U+E006F U+E006F U+E007F

has no effect and is ignored, while in the sequence

> U+1F3F4 U+0066 U+006F U+006F U+E007F

the CANCEL TAG character will cancel all tags, because it is not
immediately preceded by a tag character. This change prevents the
inadvertent canceling behavior of emoji tag sequences described above.
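
As a rough sketch of how the proposed wording might be applied, the following Python classifies each CANCEL TAG by the character immediately before it. The function name `cancel_tag_effects` and the result labels are hypothetical, and treating U+E0020..U+E007E as the "other tag characters" is one reading of the proposal rather than anything the Standard specifies.

```python
LANGUAGE_TAG = 0xE0001     # the only tag identification character currently defined
TAG_SPEC_FIRST = 0xE0020   # TAG SPACE
TAG_SPEC_LAST = 0xE007E    # TAG TILDE
CANCEL_TAG = 0xE007F       # CANCEL TAG


def cancel_tag_effects(text):
    """Return one label per CANCEL TAG in text, under the proposed wording.

    "cancel-language": prefixed by the language tag identification character
    "ignored":         prefixed by another tag character (assumed range)
    "cancel-all":      bare, i.e. not prefixed by any tag character
    """
    effects = []
    prev = None  # code point of the previous character, if any
    for ch in text:
        cp = ord(ch)
        if cp == CANCEL_TAG:
            if prev == LANGUAGE_TAG:
                effects.append("cancel-language")
            elif prev is not None and TAG_SPEC_FIRST <= prev <= TAG_SPEC_LAST:
                effects.append("ignored")
            else:
                effects.append("cancel-all")
        prev = cp
    return effects


# The two sequences from the report, written as code points.
tagged = "".join(map(chr, [0x1F3F4, 0xE0066, 0xE006F, 0xE006F, 0xE007F]))
untagged = "".join(map(chr, [0x1F3F4, 0x0066, 0x006F, 0x006F, 0xE007F]))
print(cancel_tag_effects(tagged), cancel_tag_effects(untagged))
```

Under the current wording, by contrast, both sequences would fall into the bare case, since neither CANCEL TAG is prefixed by a tag identification character.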