
CLDR Ticket #11472(reviewing)

Opened 6 weeks ago

Last modified 2 days ago

Evaluate uniqueness of Unit part of Unit Identifiers

Reported by: shane Owned by: shane
Component: docs-spec Data Locale:
Phase: dsub Review: mark
Weeks: .3 Data Xpath:
Xref:

Description

UTS#35 defines "unit identifier" (https://unicode.org/cldr/trac/changeset/14503) and implies that the unit identifier is unique, but it does not discuss whether the unit part on its own is unique.

The unit uniqueness is implied by:

  • All units are unique among all existing data (CLDR and spec examples).
  • 6.1 per Unit patterns algorithms [1].

Recommended spec updates:

Basically:

  1. Add paragraph "Implementations can use either the <em>unit identifier</em> or its <em>unit</em> part (e.g., <code>day</code>) to identify a unit." after unit identifier definition.
  2. Clarify existing unit examples on 6.1 per Unit patterns algorithms:

ii.a. kilometer-per-hour for the case where a pattern for N/D is already available.
ii.b. kilogram-per-second for the "otherwise" case.

Thanks

1: Details why 6.1 per Unit patterns implies unit uniqueness:

  • Which D (denominator) should be used if no compound form for N/D is available? The spec text defines how to format a compound form, which is essentially "use a direct match, or generate it yourself by picking the parts individually and composing them with a certain pattern". For example, let's format throughput-megabyte-per-second. CLDR provides digital-megabyte and duration-second, but no precomputed form for megabyte per second. We need to pick a second for D, and therefore we can pick duration-second. What if unit parts aren't guaranteed to be unique? Suppose we also had a foo-second. Which of the two would we pick? Conclusion: duplicate unit parts lead to a nondeterministic result here.
  • Still looking at the above example, all types (throughput, digital and duration) are completely irrelevant. They are not helpful for identifying the N/D parts.
  • How to name a unit id? Note that I made up throughput in the example above; CLDR doesn't have any throughput example. Nevertheless, the example is completely valid and works fine with existing CLDR data. If only the full unit identifier is accepted, the type is required. How would a user guess the type of something that is not even documented? The unit megabyte-per-second is unambiguous on its own, but figuring out its type could be a challenge. Is it throughput? Is it bandwidth? Note that this problem is not particular to "custom" compound units. Take meter-per-second: if I use speed-meter-per-second, I get a direct match; if I use velocity-meter-per-second (generated by mistake), the output could be different.
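The lookup problem described in the points above can be sketched in a few lines of Python. This is only an illustration, not CLDR or ICU code: the identifier list is a hand-picked sample, the function names (index_by_unit, resolve, lookup_parts) are hypothetical, and it assumes the type is always the first hyphen-separated subtag.

```python
from typing import Dict, List

# Hypothetical sample of typed unit identifiers; a real table would come from CLDR data.
SAMPLE_IDS = [
    "digital-megabyte",
    "duration-second",
    "length-meter",
    "speed-meter-per-second",
]


def index_by_unit(identifiers: List[str]) -> Dict[str, List[str]]:
    """Group identifiers by unit part, assuming the type is the first subtag."""
    index: Dict[str, List[str]] = {}
    for identifier in identifiers:
        _type, _, unit = identifier.partition("-")
        index.setdefault(unit, []).append(identifier)
    return index


def resolve(unit: str, index: Dict[str, List[str]]) -> str:
    """Return the single identifier carrying this unit part, or raise."""
    matches = index.get(unit, [])
    if len(matches) != 1:
        raise LookupError(f"{unit!r} matches {len(matches)} identifiers: {matches}")
    return matches[0]


def lookup_parts(compound: str, index: Dict[str, List[str]]) -> List[str]:
    """Use a direct match if one exists, else resolve N and D separately."""
    if compound in index:
        return [resolve(compound, index)]
    numerator, sep, denominator = compound.partition("-per-")
    if not sep:
        raise LookupError(f"{compound!r} is neither known nor of the form N-per-D")
    return [resolve(numerator, index), resolve(denominator, index)]
```

With the sample data, meter-per-second is a direct match, while megabyte-per-second falls back to digital-megabyte plus duration-second. Adding a made-up foo-second to the list makes the fallback raise, which is the ambiguity the ticket is about.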

Change History

comment:1 Changed 6 weeks ago by mark

Added comment for Rafael:

An implication of the above is that the only unit that uses the term generic is temperature-generic. No other type will be able to use generic as a unit part. That being said, I cannot think of any unit that should be named generic anyway.

comment:2 Changed 3 weeks ago by mark

  • Component changed from units to docs-spec

comment:3 Changed 3 weeks ago by shane

I made a few additional revisions based on Rafael's revisions to specify exactly what a compound unit string should look like and the cases when compound unit strings are supported by CLDR data.

https://gist.github.com/sffc/8098d939ee597dd3eed87ac768a62075/revisions

comment:4 Changed 3 weeks ago by shane

I was also thinking that while we are working on the syntax for units and unit identifiers, I'd also like to specify syntax for "private use" units.

I suggest that they look something like this:

pu-type := "x-" type
pu-simple-unit := "x-" simple-unit

Furthermore, the substring "x-" should be forbidden from appearing in any unit other than private use units.

This would allow you to construct units such as:

length-x-jupiter-radius

and

x-bitrate-x-megabit-per-second

where x-jupiter-radius and x-megabit are custom simple units, x-bitrate is a custom unit type, and x-megabit-per-second combines a custom simple unit with an "official" simple unit to make a compound unit.
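To make the proposal concrete, here is a rough Python sketch of how an implementation might pick the private-use pieces out of such an identifier. It is not part of the proposal: the function names are hypothetical, and it assumes that official types are a single subtag, that simple units are joined by "-per-", and that "x" is treated as a whole subtag rather than a raw substring.

```python
from typing import List, Tuple


def split_identifier(identifier: str) -> Tuple[str, List[str]]:
    """Split a unit identifier into (type, simple units).

    Assumes an official type is one subtag and a private-use type is "x-" plus one subtag.
    """
    subtags = identifier.split("-")
    type_len = 2 if subtags[0] == "x" else 1
    if len(subtags) <= type_len:
        raise ValueError(f"{identifier!r} has no unit after the type")
    unit_type = "-".join(subtags[:type_len])
    unit = "-".join(subtags[type_len:])
    simple_units = unit.split("-per-")
    for simple in simple_units:
        # The "x" subtag may only lead a simple unit, never appear inside it.
        if "x" in simple.split("-")[1:]:
            raise ValueError(f"misplaced 'x' subtag in {simple!r}")
    return unit_type, simple_units


def private_use_parts(identifier: str) -> List[str]:
    """List the type and simple units that carry the private-use prefix."""
    unit_type, simple_units = split_identifier(identifier)
    return [part for part in [unit_type] + simple_units if part.startswith("x-")]
```

For the two examples above, private_use_parts returns only x-jupiter-radius for the first, and x-bitrate and x-megabit for the second; the official second in the denominator is left out.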

comment:5 Changed 3 weeks ago by shane

  • Cc mark added

comment:6 Changed 2 weeks ago by shane

  • Keywords ecmascript added
  • Owner changed from anybody to shane
  • Weeks set to .3
  • Status changed from new to accepted
  • Priority changed from assess to major

@Mark: What is the correct milestone?

comment:7 Changed 2 weeks ago by shane

  • Status changed from accepted to reviewing
  • Review set to mark

comment:8 Changed 6 days ago by mark

  • Status changed from reviewing to closed
  • Resolution set to fixed

comment:9 Changed 5 days ago by shane

  • Status changed from closed to reviewfeedback

According to the notes from the CLDR meeting on 11/14/2018, it was agreed to rename the unit part of the unit identifier to "core unit identifier". I will make this change.

comment:10 Changed 5 days ago by shane

  • Milestone changed from UNSCH to 35

comment:11 Changed 4 days ago by shane

  • Status changed from reviewfeedback to reviewing
  • Resolution fixed deleted

As the CLDR committee agreed, I changed the spec to say "core unit identifier" for the untyped identifier.

comment:12 Changed 2 days ago by shane

I added these changes to the "Modifications" section.
