[Unicode] Common Locale Data Repository: Bug Tracking

CLDR Ticket #9859 (accepted survey)

Opened 6 months ago

Last modified 7 weeks ago

Arabic decimal short format for million not correct

Reported by: malte.wedel@…
Owned by: srl
Component: survey
Data Locale: ar
Phase: dsub
Review:
Weeks:
Data Xpath: numbers/decimalFormats/decimalFormatLength[type=short]/decimalFormat
Xref:

Description

Hello,

We got a complaint from a customer that the short format for million is not correct. Our language expert confirmed that the current values for million (as well as billion and higher) are missing a character compared with the long format and are not correct. He said the long format cannot be shortened and should also be used as the short format.

Version 30, downloaded from
http://unicode.org/Public/cldr/30/core.zip

File
common/main/ar.xml

Data

<decimalFormats numberSystem="latn">

...
<decimalFormatLength type="short">

<decimalFormat>

The following lines

<pattern type="1000000" count="zero">0 مليو</pattern>
...
<pattern type="100000000000000" count="other">000 ترليو</pattern>

should be as in the long pattern

<pattern type="1000000" count="zero">0 مليون</pattern>
...
<pattern type="100000000000000" count="other">000 تريليون</pattern>

Can you please check whether the current short formats for million and higher are wrong and correct them?

Best Regards,
Malte Wedel
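
A minimal sketch of how the differing entries could be listed mechanically. It assumes Python's standard `xml.etree.ElementTree` and uses a fragment built only from the four patterns quoted above; the real ar.xml has many more `<pattern>` elements, and the helper names `patterns_by_length` and `differing_patterns` are illustrative, not part of any CLDR tooling.

```python
import xml.etree.ElementTree as ET

# Fragment assembled from the patterns quoted in this ticket only.
FRAGMENT = """
<decimalFormats numberSystem="latn">
  <decimalFormatLength type="long">
    <decimalFormat>
      <pattern type="1000000" count="zero">0 مليون</pattern>
      <pattern type="100000000000000" count="other">000 تريليون</pattern>
    </decimalFormat>
  </decimalFormatLength>
  <decimalFormatLength type="short">
    <decimalFormat>
      <pattern type="1000000" count="zero">0 مليو</pattern>
      <pattern type="100000000000000" count="other">000 ترليو</pattern>
    </decimalFormat>
  </decimalFormatLength>
</decimalFormats>
"""


def patterns_by_length(root, length):
    """Map (type, count) to the pattern text for one decimalFormatLength."""
    path = f"decimalFormatLength[@type='{length}']/decimalFormat/pattern"
    return {(p.get("type"), p.get("count")): p.text for p in root.findall(path)}


def differing_patterns(xml_text):
    """Return {(type, count): (short, long)} where short and long differ."""
    root = ET.fromstring(xml_text)
    long_patterns = patterns_by_length(root, "long")
    short_patterns = patterns_by_length(root, "short")
    return {
        key: (short, long_patterns[key])
        for key, short in short_patterns.items()
        if key in long_patterns and short != long_patterns[key]
    }


if __name__ == "__main__":
    for (ptype, count), (short, long_) in differing_patterns(FRAGMENT).items():
        print(ptype, count, repr(short), repr(long_))
```

Run against the full file, the same comparison would show every magnitude at which the short pattern is not identical to the long one.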


Change History

comment:1 Changed 5 months ago by emmons

  • Status changed from new to accepted
  • Data Locale changed from ar_SA to ar
  • Priority changed from assess to major
  • Phase changed from dsub to rc
  • Milestone changed from UNSCH to 30.0.1
  • Owner changed from anybody to fredrik

comment:2 Changed 5 months ago by pedberg

  • Cc fredrik, mark added
  • Phase changed from rc to dsub
  • Milestone changed from 30.0.1 to 31

Some questions about this. Moving to 31

comment:3 Changed 5 months ago by fredrik

Based on the feedback I received, the short form is basically omitting the last "n" in million, billion, etc. There is no natural way of abbreviating the word, but this way at least we have a shorter form, whereas if we follow the suggestion here, the long and short form would be the same and there is no flexibility.

Suggest we seek input from other linguists before we proceed on this.

comment:4 Changed 5 months ago by malte.wedel@…

I have talked to two native Arabic speakers from our translation group. Both said this is not a valid abbreviation; it is like using "millio" in English as an abbreviation for million, which would obviously be considered wrong as well. I understand that it would be useful to have a shorter form, but if it just doesn't exist in this language, you shouldn't invent one.

comment:5 Changed 5 months ago by fredrik

Per our Arabic linguist:
===
Arabic does not usually have abbreviations. And in this specific case, there's no correct abbreviation that can be used.
It's true that the abbreviation used currently is not actually a proper abbreviation, and not a word that every Arabic speaker will be familiar with. However, this is the best we can do to shorten it, even if it’s a one-character shortening. All other shorter options (for example, "م", "مل", "ملي") will not work for "million" or "billion" as they're already used for other terms like "meter," "mile," "milli-liter," and "millimeter".
I always advise to use the long format rather than the short one, but we add this short version just to have one.
===
I can see the reporter's point about not inventing a form, but note that a CLDR client can also opt for the complete long form. By maintaining both the long form and the (invalid) abbreviation, we provide an option. The alternative is to provide only the long form for both, which could cause clipping in software and obscure the meaning further.
It's a tricky issue. Perhaps we could keep the abbreviation, but include a note about it in the release notes or errata?

comment:6 Changed 5 months ago by malte.wedel@…

Thanks for looking into this. Of course there is a long form which can be used instead, but since we want the short form for all other languages, we would need to create a mapping table of the languages for which we take the long form instead of the short one, even if the application requested the short one.
While I see the good intention, I see no value in having a short form that is unknown to and unexpected by users. They consider it a bug in the software: in their perception a letter is missing or the word is cut off, and they will report this as an error (as they did in our case).
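
To illustrate the kind of mapping table meant here, a minimal sketch in Python. The table name `USE_LONG_INSTEAD_OF_SHORT` and the function `effective_length` are hypothetical, and "ar" is the only entry supported by this ticket; any other locale would have to be verified and added by hand.

```python
# Hypothetical client-side override table: base languages whose short
# compact decimal forms should be replaced by the long forms.
# Only "ar" comes from this ticket.
USE_LONG_INSTEAD_OF_SHORT = {"ar"}


def effective_length(locale, requested):
    """Return the decimalFormatLength type to actually look up.

    `locale` is an identifier such as "ar", "ar_SA" or "ar-SA";
    `requested` is "short" or "long".
    """
    base_language = locale.replace("-", "_").split("_")[0].lower()
    if requested == "short" and base_language in USE_LONG_INSTEAD_OF_SHORT:
        return "long"
    return requested
```

Every consumer of the data would have to maintain such a table separately, which is the overhead described above.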

comment:7 Changed 5 months ago by fredrik

OK, we'll bring this up at the next committee meeting.

comment:8 Changed 5 months ago by mark

I propose that we add information that lets the translators indicate the status of the short form.

  1. The short form is customary and well understood.
  2. The short form would be recognized in context, even though it is not standardized
  3. There is no understandable short form (in which case, the type="short" can be skipped)

<unitLength type="short">

<unitStatus status="customary">
<compoundUnit type="per"> …

This could be done with the ST, or could be done outside of the ST (like we do 12 vs 24 hour preferences).
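
A rough sketch of how a client might act on such a status, in Python. Only the value "customary" appears in the XML above; the names "recognized" and "none", the `strict` flag, and the function `choose_length` are assumptions made for illustration and are not part of CLDR.

```python
from enum import Enum


class ShortFormStatus(Enum):
    CUSTOMARY = "customary"    # value shown in the proposed XML (option 1)
    RECOGNIZED = "recognized"  # hypothetical name for option 2
    NONE = "none"              # hypothetical name for option 3


def choose_length(requested, status, strict=False):
    """Pick "short" or "long" given the requested length and the status.

    With strict=True, a short form that is merely recognizable in
    context is also replaced by the long form.
    """
    if requested != "short":
        return requested
    if status is ShortFormStatus.NONE:
        return "long"
    if status is ShortFormStatus.RECOGNIZED and strict:
        return "long"
    return "short"
```

Under this sketch the Arabic million pattern would fall under option 3 according to the reporter, so a request for the short form would resolve to the long one.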


comment:9 Changed 2 months ago by fredrik

  • Phase changed from dsub to rc

comment:10 Changed 7 weeks ago by fredrik

  • Phase changed from rc to dsub
  • Type changed from data to survey
  • Component changed from numbers to survey
  • Milestone changed from 31 to 32

This seems more like a structure change, then, or something done solely in the Survey Tool. Either way, it should be pushed to 32.

comment:11 Changed 7 weeks ago by fredrik

  • Owner changed from fredrik to srl

Putting on srl
