Common Locale Data Repository: Bug Tracking

CLDR Ticket #7408 (accepted dtd)

Opened 3 years ago

Last modified 5 weeks ago

Add ordinal dates

Reported by: mark
Owned by: mark
Component: datetime
Phase: dsub

Description (last modified by mark) (diff)

In some languages, it is often customary to use ordinals in dates, such as "May 1st".

We could support that if we allowed an ordinal parameter (analogous to count):

<dateFormatItem id="MMMd">MMM d</dateFormatItem>
<dateFormatItem id="MMMd" ordinal="one">MMM d'st'</dateFormatItem>
<dateFormatItem id="MMMd" ordinal="two">MMM d'nd'</dateFormatItem>
<dateFormatItem id="MMMd" ordinal="few">MMM d'rd'</dateFormatItem>
<dateFormatItem id="MMMd" ordinal="other">MMM d'th'</dateFormatItem>

We'd only allow this on skeletons that contained 'd' and 'M'.

Now, unlike other uses of count, we'd only want to expose this if we knew that the language used ordinal dates. So I think we'd want to keep an internal list of selected locales (like English and French) that we'd expose this for.

Note that we can't just mechanically add the ordinals, because there may be other changes in the pattern, and the ordinals often inflect.

The patterns would be:

MMM d
E, MMM d
MMMM d
MMM d, y
E, MMM d, y
MMM d, y G
E, MMM d, y G
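
As an illustration only, the lookup this description proposes could be sketched as follows for English. The dictionary layout, the function names, and the `use_ordinal` switch (standing in for the internal list of selected locales) are assumptions made for this sketch, not part of CLDR or ICU. The category tests follow the English ordinal plural rules, and the toy renderer handles only runs of M (always rendered as an abbreviated month), runs of d, and quoted literals.

```python
import re

# Hypothetical in-memory form of the proposed data:
# (skeleton id, ordinal category) -> pattern. Category None is the existing item.
DATE_FORMAT_ITEMS = {
    ("MMMd", None): "MMM d",
    ("MMMd", "one"): "MMM d'st'",
    ("MMMd", "two"): "MMM d'nd'",
    ("MMMd", "few"): "MMM d'rd'",
    ("MMMd", "other"): "MMM d'th'",
}

MONTHS_ABBR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Quoted literal, or a run of M or d, or any other single character.
TOKEN = re.compile(r"'(?P<lit>[^']*)'|(?P<field>M+|d+)|(?P<other>.)", re.DOTALL)


def english_ordinal_category(n):
    """Ordinal category for English (1st, 2nd, 3rd, 4th, 11th, 21st, ...)."""
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    if n % 10 == 2 and n % 100 != 12:
        return "two"
    if n % 10 == 3 and n % 100 != 13:
        return "few"
    return "other"


def render(pattern, month, day):
    """Toy renderer: supports only M runs, d runs, and quoted literals."""
    parts = []
    for m in TOKEN.finditer(pattern):
        if m.group("lit") is not None:
            parts.append(m.group("lit"))
        elif m.group("field") is not None:
            if m.group("field")[0] == "M":
                parts.append(MONTHS_ABBR[month - 1])
            else:
                parts.append(str(day))
        else:
            parts.append(m.group("other"))
    return "".join(parts)


def format_month_day(month, day, use_ordinal=True):
    """Pick the ordinal variant if enabled, else fall back to the plain item."""
    category = english_ordinal_category(day) if use_ordinal else None
    fallback = DATE_FORMAT_ITEMS[("MMMd", None)]
    pattern = DATE_FORMAT_ITEMS.get(("MMMd", category), fallback)
    return render(pattern, month, day)
```

Because the whole pattern is selected by category, a locale can change word order or inflection per category rather than just appending a suffix, which is the point made above about not adding ordinals mechanically.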

Change History

comment:1 Changed 3 years ago by pedberg

Another possible approach is the following, though it may not support the necessary inflections:

  • Define a number system keyword for ordinals (we might want to do this anyway)
  • Allow number system override strings to be attached to availableFormats items, as they can be for standard date formats (this is needed for other issues anyway, such as in Chinese cal date formats)
Last edited 3 years ago by pedberg (previous) (diff)

comment:2 Changed 3 years ago by deborah

  • Cc deborah added

comment:3 Changed 3 years ago by grhoten

  • Cc grhoten added

comment:4 follow-up: ↓ 17 Changed 3 years ago by grhoten

FYI RBNF digits already does some of this type of stuff for ordinal digits. Also some date formats would desire non-decimal forms, like Roman numerals and Chinese digits.

comment:5 Changed 3 years ago by srl

comment:6 Changed 3 years ago by emmons

  • Status changed from new to assigned
  • Cc pedberg added
  • Component changed from unknown to data
  • Priority changed from assess to major
  • Milestone changed from UNSCH to 27
  • Owner changed from anybody to mark
  • Type changed from unknown to enhancement

comment:7 Changed 3 years ago by mark

  • Milestone changed from 27 to 27dsub

comment:8 follow-up: ↓ 18 Changed 3 years ago by markus

Is it only ever the day number for which we would use "selectordinal"? What about a numeric month? (Like we read aloud German short dates, although they are simply written with the '.' for the ordinal notation.)

comment:9 Changed 3 years ago by markus

  • Phase set to dsub
  • Milestone changed from 27dsub to 27

comment:10 Changed 2 years ago by mark

  • Milestone changed from 27 to 28

comment:11 Changed 2 years ago by mark

  • Component changed from data-main to dtd

comment:12 Changed 2 years ago by mark

  • Milestone changed from 28 to 29

comment:13 Changed 2 years ago by markus

  • Type changed from enhancement to dtd
  • Component changed from dtd to unknown

comment:14 Changed 2 years ago by srl

  • Status changed from assigned to accepted

comment:15 Changed 22 months ago by emmons

  • Milestone changed from 29 to upcoming

Automatic move of all 29 -> upcoming

comment:16 Changed 16 months ago by mark

  • Keywords Review in 30 added
  • Milestone changed from upcoming to 30

comment:17 in reply to: ↑ 4 ; follow-up: ↓ 19 Changed 16 months ago by mark

  • Component changed from unknown to datetime
  • Description modified (diff)

Replying to grhoten:

FYI RBNF digits already does some of this type of stuff for ordinal digits.

Yes, but would it work for all 80ish languages? With inflections and gender?

Also some date formats would desire non-decimal forms, like Roman numerals and Chinese digits.

Agreed that that would require something else. Any thoughts? For the dates we allow a number system, like the following:
<pattern numbers="hebr">EEEE, d בMMMM y</pattern>

comment:18 in reply to: ↑ 8 Changed 16 months ago by mark

Replying to markus:

Is it only ever the day number for which we would use "selectordinal"? What about a numeric month? (Like we read aloud German short dates, although they are simply written with the '.' for the ordinal notation.)

Our focus is on written dates, and thankfully German uses just the ".". Spoken gets tricky...

comment:19 in reply to: ↑ 17 Changed 16 months ago by grhoten

Replying to mark:

Replying to grhoten:

FYI RBNF digits already does some of this type of stuff for ordinal digits.

Yes, but would it work for all 80ish languages? With inflections and gender?

Yes. For the ordinal digits, we don't really cover the spoken form because it's considered another shorthand form. So grammatical case is generally ignored for those. The gender is what matters the most. The ordinal digits like in this ticket should already be covered. Though additional vetting wouldn't hurt.

Now if we're lumping the spoken form into this mix, I thought there was another ticket covering spoken dates and times, but I'm having difficulty finding it. With the spoken form, grammatical case needs to be taken into account. ICU and CLDR support such use cases poorly. Within Siri, we had to invent our own spoken formatting to take context in a sentence into account. If this ticket isn't focused on only the written form, then a lot more is needed.

Also some date formats would desire non-decimal forms, like Roman numerals and Chinese digits.

Agreed that that would require something else. Any thoughts? For the dates we allow a number system, like the following:
<pattern numbers="hebr">EEEE, d בMMMM y</pattern>

Due to the limitations of ICU formatting, I think your only viable solution is to use choice format with RBNF in some way. Though maybe the d is mapped in some way to a specific RBNF rule name and rule type. That may allow enough flexibility when your focus is on the written form and not the spoken form.

comment:20 Changed 16 months ago by mark

  • Description modified (diff)

comment:21 Changed 16 months ago by mark

I just don't understand what you are saying. We have ordinal rule support, and it is designed precisely for this kind of task. The rules are not the same as the plural rules. And we no longer ever use the ICU choice format.

If the ordinal value is "one", then <dateFormatItem id="MMMd" ordinal="one">MMM d'st'</dateFormatItem> would be used, so we would get (for English) "April 1st".

I don't see how RBNF would know which inflexion to use for the day form, since there is no way currently to pick which RBNF format to use. If you really think it can, please describe precisely what the dateFormatItem would look like, and also verify that RBNF has ordinal formats for all the "modern coverage" languages.

comment:22 Changed 16 months ago by grhoten

It sounds like there will be a meeting next Wednesday when I'm back from vacation. I'll attend that one.

In the meantime, maybe you can clarify in what context you plan to use this data, and why it focuses only on ordinal digits instead of including other numbering systems used for dates, like those Peter mentioned earlier. Within the context of spoken dates and multiple calendar support, my experience has been that this is a very narrow, English-centric solution. I've already worked on a framework that does this, and it has to ignore this CLDR data already. My code can use any arbitrary implementation from DecimalFormat or RuleBasedNumberFormat. It's entirely possible that you have a different goal and context than me.

comment:23 follow-up: ↓ 26 Changed 15 months ago by mark

We do have some information on this. Apparently it is limited to the following languages. So if we have the necessary RBNF support for those languages (with the inflections needed for dates), then that would work.

fr: 1er octobre 2015 (ordinal date for the first day of the month only); 10 octobre 2015 (all other days from 2 to 31)
ms: 10hb Oktober
en: 10th October, October 10th
bn: ২০ শে অক্টোবর
cy: 10fed Hydref
gu: 10 મી ઑક્ટોબર
fil: Ika-10 ng Oktubre

comment:24 Changed 11 months ago by mark

  • Milestone changed from 30 to 31

Changing to next release, will have to gather data manually

comment:25 Changed 8 months ago by mark

Thought about this a bit more: we could support French by having the ordinal form for anything but 1 simply omit the suffix.

<dateFormatItem id="MMMd" ordinal="one">d'er' MMM</dateFormatItem>
<dateFormatItem id="MMMd" ordinal="other">d MMM</dateFormatItem>

Note that we'd have to add the ordinal forms to any skeleton containing d and M+ for these languages.
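
A minimal sketch of the two pieces described in this comment, under stated assumptions: `needs_ordinal_variants` is a hypothetical helper that tests whether a skeleton contains a day-of-month field and a month field, and `fr_pattern_for_day` picks between the two French patterns above using the rule that only the value 1 falls in the "one" ordinal category for French. Neither name comes from CLDR or ICU.

```python
# The two French items from this comment, keyed by ordinal category.
FR_MMMD_ITEMS = {
    "one": "d'er' MMM",
    "other": "d MMM",
}


def needs_ordinal_variants(skeleton):
    """True if the skeleton has both a day-of-month (d) and a month (M) field.

    The test is case-sensitive on purpose: 'D' (day of year) and 'm' (minute)
    are different fields and must not match.
    """
    return "d" in skeleton and "M" in skeleton


def fr_ordinal_category(day):
    """French ordinal category: only 1 ('1er') is 'one'; 21, 31 are 'other'."""
    return "one" if day == 1 else "other"


def fr_pattern_for_day(day):
    """Select the MMMd pattern for a given day of the month."""
    return FR_MMMD_ITEMS[fr_ordinal_category(day)]
```

For example, `needs_ordinal_variants("yMMMEd")` is true while `needs_ordinal_variants("yMMM")` is not, and `fr_pattern_for_day(21)` returns the suffix-free pattern.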

comment:26 in reply to: ↑ 23 ; follow-up: ↓ 29 Changed 8 months ago by fredrik

Replying to mark:

We do have some information on this. Apparently it is limited to the following languages. So if we have the necessary RBNF support for those languages (with the inflections needed for dates), then that would work.

...

Actually, Swedish can be added to the list. In the long format we have omitted the ordinal ending and used a plain numeral, since the ending varies depending on the numeral, but ideally we should use the ordinal ending:
den 1:a oktober 2015
den 2:a oktober 2015
den 3:e oktober 2015
den 10:e oktober 2015
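
The four forms above fit a two-category rule, sketched here for illustration. The rule used (numbers ending in 1 or 2, except those ending in 11 or 12, take ":a"; everything else takes ":e") is an assumption that matches these examples and the Swedish ordinal plural categories; the function names are hypothetical.

```python
# Suffix per ordinal category, following the examples in this comment.
SV_SUFFIX = {"one": ":a", "other": ":e"}


def sv_ordinal_category(n):
    """Assumed Swedish ordinal rule: 1, 2, 21, 22, 31 -> 'one'; 11, 12 and the rest -> 'other'."""
    if n % 10 in (1, 2) and n % 100 not in (11, 12):
        return "one"
    return "other"


def sv_long_date(day, month_name, year):
    """Build a long date such as 'den 3:e oktober 2015'."""
    suffix = SV_SUFFIX[sv_ordinal_category(day)]
    return f"den {day}{suffix} {month_name} {year}"
```

Since usage is not uniform among Swedish speakers, the suffix table would be the part to vet before any such data were added.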

comment:27 follow-up: ↓ 28 Changed 8 months ago by mark

Would that be the case for Danish, Norwegian, etc?

comment:28 in reply to: ↑ 27 Changed 8 months ago by kent.karlsson14@…

Replying to mark:

Would that be the case for Danish, Norwegian, etc?

Danish and Norwegian (like many other languages) use "." to mark ordinality after a number written with digits. I think this is already handled in the format strings (a "." after the "d" or "dd").

comment:29 in reply to: ↑ 26 Changed 8 months ago by kent.karlsson14@…

Replying to fredrik:

Replying to mark:

We do have some information on this. Apparently it is limited to the following languages. So if we have the necessary RBNF support for those languages (with the inflections needed for dates), then that would work.

...

Actually, Swedish can be added to the list. In the long format we have omitted the ordinal ending and used a plain numeral, since the ending varies depending on the numeral, but ideally we should use the ordinal ending:
den 1:a oktober 2015
den 2:a oktober 2015
den 3:e oktober 2015
den 10:e oktober 2015

Except that I (and many others, I think) would say and write:

den 1:e oktober 2015
den 2:e oktober 2015

(which we had included in the date format itself earlier).

I find using "a" at the end in these cases "unnatural" (i.e. goes against my language sense), regardless of what grammar books may say.

comment:30 Changed 7 months ago by mark

  • Keywords Review in 30 removed
  • Milestone changed from 31 to 32

comment:31 Changed 5 weeks ago by mark

  • Priority changed from major to critical
  • Milestone changed from 32 to 33

We didn't get enough time to add this; bumping to critical for next time.
