[Unicode]   Common Locale Data Repository : Bug Tracking

CLDR Ticket #11119 (accepted datatest)

Opened 3 months ago

Last modified 3 months ago

Uncaught emoji collisions

Reported by: mark
Owned by: mark
Component: emoji
Data Locale:
Phase: dvet
Review:
Weeks:
Data Xpath:


While finishing off the constructed names for hair styles as part of https://unicode.org/cldr/trac/ticket/10997, I ran into the following.

CheckCollisions isn't thorough enough for emoji. Unlike the names covered by the other collision tests, many of the emoji names (flags, people with skin tones, etc.) don't exist as items in the Survey Tool (ST) but are constructed instead, so they are not checked for collisions in the ST.

When I fleshed out a test, I ran into a small number of such cases that slipped into the release.

Error: (TestAnnotations.java:202) Duplicate name in km: “ជប៉ុន” for 🇯🇵 & 🈹
Error: (TestAnnotations.java:202) Duplicate name in ne: “टर्की” for 🇹🇷 & 🦃
Error: (TestAnnotations.java:202) Duplicate name in sd: “ترڪي” for 🇹🇷 & 🦃
Error: (TestAnnotations.java:202) Duplicate name in sw: “Uingereza” for 🇬🇧 & 🏴󠁧󠁢󠁥󠁮󠁧󠁿
Error: (TestAnnotations.java:202) Duplicate name in ta: “அமெரிக்கா” for 🇺🇸 & 🌎
Error: (TestAnnotations.java:202) Duplicate name in zh_Hant: “日本” for 🇯🇵 & 🗾
Error: (TestAnnotations.java:202) Duplicate name in zh_Hant_HK: “日本” for 🇯🇵 & 🗾

So, for example, Chinese (Traditional) gives the Japan flag and the map of Japan the same name. "ta" gives the US flag and the globe showing the Americas the same name. And so on.
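A minimal sketch of the kind of check that surfaces these duplicates, using the sw case above (the UK flag and the England flag both named "Uingereza"). The class `DuplicateNameFinder`, its methods, and the placeholder third entry are hypothetical and are not the actual TestAnnotations code; the emoji are built from code points only to keep the source ASCII-only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not the actual TestAnnotations code. */
final class DuplicateNameFinder {

    /** Builds a string from code points, keeping this source file ASCII-only. */
    static String cp(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    /** Returns name -> emoji list for every name used by two or more emoji. */
    static Map<String, List<String>> findDuplicates(Map<String, String> emojiToName) {
        Map<String, List<String>> nameToEmoji = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : emojiToName.entrySet()) {
            nameToEmoji.computeIfAbsent(entry.getValue(), k -> new ArrayList<>())
                       .add(entry.getKey());
        }
        // Keep only the names that are shared.
        nameToEmoji.values().removeIf(emoji -> emoji.size() < 2);
        return nameToEmoji;
    }

    public static void main(String[] args) {
        // The map must hold explicit AND constructed names for one locale.
        Map<String, String> sw = new LinkedHashMap<>();
        // UK flag: regional indicators G + B (name constructed from the country name).
        sw.put(cp(0x1F1EC, 0x1F1E7), "Uingereza");
        // England flag: black flag + tag sequence "gbeng" + cancel tag
        // (name constructed from the subdivision name).
        sw.put(cp(0x1F3F4, 0xE0067, 0xE0062, 0xE0065, 0xE006E, 0xE0067, 0xE007F), "Uingereza");
        // Map of Japan with a placeholder name (hypothetical, not real sw data).
        sw.put(cp(0x1F5FE), "placeholder-name");

        findDuplicates(sw).forEach((name, emoji) ->
            System.out.println("Duplicate name in sw: \"" + name + "\" for "
                + String.join(" & ", emoji)));
    }
}
```

The check only helps if the constructed names are materialized first; feeding it just the names that exist in the ST reproduces the gap described above.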

I think the other constructed names should be OK, so I propose:

  1. just adding a special check that no emoji name is the same as the country names or (the 3) subdivision names.
  2. reenabling and improving a full unit test for collisions that was commented out.
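Proposal 1 could look roughly like the following sketch: compare each explicit emoji name against the locale's country and subdivision names and report any match. The class `TerritoryNameCollisionCheck`, the method `findCollisions`, and the shape of the two input maps are assumptions made for illustration and do not reflect the existing CheckCollisions API. The sample data mirrors the zh_Hant case, where the map of Japan carries the same name as the country.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of proposal 1; not the real CheckCollisions code. */
final class TerritoryNameCollisionCheck {

    /**
     * Returns emoji -> territory code for every explicit emoji name that is
     * identical to a country or subdivision name in the same locale.
     *
     * @param emojiNames     emoji -> explicit (non-constructed) name
     * @param territoryNames country or subdivision code -> localized name
     */
    static Map<String, String> findCollisions(Map<String, String> emojiNames,
                                              Map<String, String> territoryNames) {
        // Invert the territory names so each emoji name is a single lookup.
        Map<String, String> nameToCode = new HashMap<>();
        territoryNames.forEach((code, name) -> nameToCode.putIfAbsent(name, code));

        Map<String, String> collisions = new LinkedHashMap<>();
        emojiNames.forEach((emoji, name) -> {
            String code = nameToCode.get(name);
            if (code != null) {
                collisions.put(emoji, code);
            }
        });
        return collisions;
    }

    public static void main(String[] args) {
        String japan = "\u65E5\u672C"; // the name reported in the zh_Hant error lines

        Map<String, String> emojiNames = new LinkedHashMap<>();
        emojiNames.put(new String(Character.toChars(0x1F5FE)), japan); // map of Japan

        Map<String, String> territoryNames = new LinkedHashMap<>();
        territoryNames.put("JP", japan);
        // The three subdivisions; the names here are placeholders, not real data.
        territoryNames.put("gbeng", "placeholder-england");
        territoryNames.put("gbsct", "placeholder-scotland");
        territoryNames.put("gbwls", "placeholder-wales");

        findCollisions(emojiNames, territoryNames).forEach((emoji, code) ->
            System.out.println("Emoji " + emoji + " has the same name as territory " + code));
    }
}
```

This covers only collisions between an explicit name and a territory name. A case like sw, where two constructed flag names collide with each other, would still need the fuller unit test from proposal 2.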

I don't think this rises to the level of needing to be listed in the Known Issues.


Change History

comment:1 Changed 3 months ago by pedberg

  • Status changed from new to accepted
  • Cc pedberg added
  • Component changed from unknown to emoji
  • Priority changed from assess to major
  • Phase changed from dsub to dvet
  • Milestone changed from UNSCH to 34
  • Owner changed from anybody to mark
  • Type changed from unknown to datatest
