CLDR Ticket #10290 (closed: fixed)
RBNF for Indian English
Reported by: | grhoten | Owned by: | grhoten |
---|---|---|---|
Component: | numbers-rbnf | Data Locale: | |
Phase: | rc | Review: | sascha |
Weeks: | | Data Xpath: | |
Xref: | | | |
Description
When numbers are spelled out in Indian English, most of the words are the usual English ones from the base RBNF data, but a few Indian (Hindi/Urdu) words are used for larger numbers. I've been told that all of India uses these words for cardinal numbers, so it would be ideal if this data were labeled as en_IN. While a locale like en-t-hi-h0-hybrid could technically be used, this case is closer to how Swiss French and the French of France differ in their number pronunciation.
Below are the proposed rules. The main difference is that lakh and crore are used instead of million and billion in the cardinal and numbering rule sets. Everything else should be the same as base English, including ordinals.
%%lenient-parse:
    &[last primary ignorable ] << ' ' << ',' << '-' << '';
%%2d-year:
    0: hundred;
    1: oh-=%spellout-numbering=;
    10: =%spellout-numbering=;
%spellout-numbering-year:
    -x: minus >>;
    x.x: =#,##0.#=;
    0: =%spellout-numbering=;
    1010/100: << >%%2d-year>; 1100/100: << >%%2d-year>;
    2000: =%spellout-numbering=; 2010/100: << >%%2d-year>; 2100/100: << >%%2d-year>;
    3000: =%spellout-numbering=; 3010/100: << >%%2d-year>; 3100/100: << >%%2d-year>;
    4000: =%spellout-numbering=; 4010/100: << >%%2d-year>; 4100/100: << >%%2d-year>;
    5000: =%spellout-numbering=; 5010/100: << >%%2d-year>; 5100/100: << >%%2d-year>;
    6000: =%spellout-numbering=; 6010/100: << >%%2d-year>; 6100/100: << >%%2d-year>;
    7000: =%spellout-numbering=; 7010/100: << >%%2d-year>; 7100/100: << >%%2d-year>;
    8000: =%spellout-numbering=; 8010/100: << >%%2d-year>; 8100/100: << >%%2d-year>;
    9000: =%spellout-numbering=; 9010/100: << >%%2d-year>; 9100/100: << >%%2d-year>;
    10000: =%spellout-numbering=;
%spellout-numbering:
    -x: minus >>;
    Inf: infinity; NaN: not a number;
    0: =%spellout-cardinal=;
%spellout-numbering-verbose:
    -x: minus >>;
    Inf: infinity; NaN: not a number;
    0: =%spellout-cardinal-verbose=;
%spellout-cardinal:
    -x: minus >>;
    x.x: << point >>;
    Inf: infinite; NaN: not a number;
    0: zero; 1: one; 2: two; 3: three; 4: four; 5: five; 6: six; 7: seven; 8: eight; 9: nine;
    10: ten; 11: eleven; 12: twelve; 13: thirteen; 14: fourteen; 15: fifteen; 16: sixteen; 17: seventeen; 18: eighteen; 19: nineteen;
    20: twenty[->>]; 30: thirty[->>]; 40: forty[->>]; 50: fifty[->>];
    60: sixty[->>]; 70: seventy[->>]; 80: eighty[->>]; 90: ninety[->>];
    100: << hundred[ >>];
    1000: << thousand[ >>];
    100000: << lakh[ >>];
    10000000: << crore[ >>];
    1000000000000: << trillion[ >>];
    1000000000000000: << quadrillion[ >>];
    1000000000000000000: =#,##0=;
%%and:
    1: ' and =%spellout-cardinal-verbose=;
    100: ' =%spellout-cardinal-verbose=;
%%commas:
    1: ' and =%spellout-cardinal-verbose=;
    100: , =%spellout-cardinal-verbose=;
    1000: , <%spellout-cardinal-verbose< thousand[>%%commas>];
    1000000: , =%spellout-cardinal-verbose=;
%spellout-cardinal-verbose:
    -x: minus >>;
    x.x: << point >>;
    Inf: infinite; NaN: not a number;
    0: =%spellout-numbering=;
    100: << hundred[>%%and>];
    1000: << thousand[>%%and>];
    100000: << lakh[>%%commas>];
    10000000: << crore[>%%commas>];
    1000000000000: << trillion[>%%commas>];
    1000000000000000: << quadrillion[>%%commas>];
    1000000000000000000: =#,##0=;
%%tieth:
    0: tieth;
    1: ty-=%spellout-ordinal=;
%%th:
    0: th;
    1: ' =%spellout-ordinal=;
%spellout-ordinal:
    -x: minus >>;
    x.x: =#,##0.#=;
    Inf: infinitieth;
    0: zeroth; 1: first; 2: second; 3: third; 4: fourth; 5: fifth; 6: sixth; 7: seventh; 8: eighth; 9: ninth;
    10: tenth; 11: eleventh; 12: twelfth;
    13: =%spellout-numbering=th;
    20: twen>%%tieth>; 30: thir>%%tieth>; 40: for>%%tieth>; 50: fif>%%tieth>;
    60: six>%%tieth>; 70: seven>%%tieth>; 80: eigh>%%tieth>; 90: nine>%%tieth>;
    100: <%spellout-numbering< hundred>%%th>;
    1000: <%spellout-numbering< thousand>%%th>;
    1000000: <%spellout-numbering< million>%%th>;
    1000000000: <%spellout-numbering< billion>%%th>;
    1000000000000: <%spellout-numbering< trillion>%%th>;
    1000000000000000: <%spellout-numbering< quadrillion>%%th>;
    1000000000000000000: =#,##0=.;
%%and-o:
    0: th;
    1: ' and =%spellout-ordinal-verbose=;
    100: ' =%spellout-ordinal-verbose=;
%%commas-o:
    0: th;
    1: ' and =%spellout-ordinal-verbose=;
    100: , =%spellout-ordinal-verbose=;
    1000: , <%spellout-cardinal-verbose< thousand>%%commas-o>;
    1000000: , =%spellout-ordinal-verbose=;
%spellout-ordinal-verbose:
    -x: minus >>;
    x.x: =#,##0.#=;
    Inf: infinitieth;
    0: =%spellout-ordinal=;
    100: <%spellout-numbering-verbose< hundred>%%and-o>;
    1000: <%spellout-numbering-verbose< thousand>%%and-o>;
    100000/1000: <%spellout-numbering-verbose< thousand>%%commas-o>;
    1000000: <%spellout-numbering-verbose< million>%%commas-o>;
    1000000000: <%spellout-numbering-verbose< billion>%%commas-o>;
    1000000000000: <%spellout-numbering-verbose< trillion>%%commas-o>;
    1000000000000000: <%spellout-numbering-verbose< quadrillion>%%commas-o>;
    1000000000000000000: =#,##0=.;
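To see what the proposed %spellout-cardinal rules would produce, here is a minimal Python sketch that mimics them for integers. It is not ICU's RBNF engine: the function name `spell_cardinal_en_in` is hypothetical, and it assumes each scale rule divides by its own base value (which holds here because every base value is a power of ten and no explicit radix is given). Fractions, Inf and NaN are left out.

```python
# Words taken from the proposed %spellout-cardinal rule set.
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty",
        6: "sixty", 7: "seventy", 8: "eighty", 9: "ninety"}

# (base value, word), largest first. Note there is no million or billion.
SCALES = [
    (10**15, "quadrillion"),
    (10**12, "trillion"),
    (10**7, "crore"),
    (10**5, "lakh"),
    (10**3, "thousand"),
    (10**2, "hundred"),
]


def spell_cardinal_en_in(n: int) -> str:
    """Spell out an integer following the proposed cardinal rules (sketch)."""
    if n < 0:
        return "minus " + spell_cardinal_en_in(-n)      # -x: minus >>;
    if n >= 10**18:
        return f"{n:,}"                                 # fallback: =#,##0=;
    for base, word in SCALES:
        if n >= base:
            quotient, remainder = divmod(n, base)       # << and >>
            text = f"{spell_cardinal_en_in(quotient)} {word}"
            if remainder:                               # [ >>] is optional
                text += " " + spell_cardinal_en_in(remainder)
            return text
    if n >= 20:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    return ONES[n]


if __name__ == "__main__":
    for value in (100000, 1234567, 10**7, 10**9, 10**12):
        print(value, "->", spell_cardinal_en_in(value))
```

Under these assumptions 1234567 comes out as "twelve lakh thirty-four thousand five hundred sixty-seven", and 10^9 comes out as "one hundred crore" rather than "one billion".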
Change History
comment:1 Changed 21 months ago by pedberg
- Owner changed from anybody to grhoten
- Phase changed from dsub to rc
- Priority changed from assess to major
- Status changed from new to accepted
- Milestone changed from UNSCH to 32
comment:3 Changed 21 months ago by grhoten
- Status changed from accepted to reviewing
- Review set to sascha
comment:4 Changed 20 months ago by pedberg
I am integrating CLDR trunk into ICU, and these changes cause a rule parse failure (we end up with what appears to be a duplicate rule for value="100000"):
The old rules had
<ruleset type="spellout-cardinal-verbose">
...
<rbnfrule value="100000" radix="1000">←← thousand[→%%commas→];</rbnfrule>
<rbnfrule value="1000000">←← million[→%%commas→];</rbnfrule>
...
but then the latter rule above was changed as follows, with a different value:
<rbnfrule value="100000">←← lakh[→%%commas→];</rbnfrule>
which appears to conflict with the previous rule for <rbnfrule value="100000" radix="1000"> (now two rules for value="100000"). I think we should delete the previous rule:
<rbnfrule value="100000" radix="1000">←← thousand[→%%commas→];</rbnfrule>
correct?
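A conflict like the one described above could be caught before integration with a small check. The following Python sketch is only an illustration: the helper name `duplicate_base_values` is hypothetical, and the sample ruleset is a hand-built fragment based on the rules quoted in this comment, not the actual CLDR file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-built fragment for illustration; it reproduces the conflicting pair.
SAMPLE = """<ruleset type="spellout-cardinal-verbose">
    <rbnfrule value="100000" radix="1000">←← thousand[→%%commas→];</rbnfrule>
    <rbnfrule value="100000">←← lakh[→%%commas→];</rbnfrule>
    <rbnfrule value="10000000">←← crore[→%%commas→];</rbnfrule>
</ruleset>"""


def duplicate_base_values(ruleset_xml: str) -> list[str]:
    """Return the base values that occur on more than one <rbnfrule>."""
    ruleset = ET.fromstring(ruleset_xml)
    # The radix attribute is ignored on purpose: two rules with the same
    # value still collide even when only one of them specifies a radix.
    counts = Counter(rule.get("value") for rule in ruleset.iter("rbnfrule"))
    return sorted(value for value, count in counts.items() if count > 1)


if __name__ == "__main__":
    print(duplicate_base_values(SAMPLE))
```

For the sample fragment the helper reports the base value "100000", which is the pair of rules in question.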
comment:5 Changed 20 months ago by kent.karlsson14@…
Yeah...
But are you entirely sure that the "billion" rules should be removed? I'd triple-check that...
comment:6 Changed 20 months ago by grhoten
India uses lakh and crore. Billions are not normally used.
comment:7 Changed 20 months ago by kent.karlsson14@…
A little web search (I know the numbers aren't very reliable; still):
- "crore" site:.in: about 4,070,000 results
- "billion" site:.in: about 2,530,000 results (but quite a lot of false positives)
- "hundred crore" site:.in: about 11,000 results (incl. some false positives)
- "thousand crore" site:.in: about 16,200 results
- "trillion" site:.in: about 456,000 results
- "lakh crore" site:.in: about 185,000 results
OK, presumably not all of these hits are "Indian English", despite the "site:.in" restriction; some are in "ordinary" English.
Anyhow, it still seems to me that "... billion" (very common, even discounting false positives) may be preferred over "hundred crore" and "thousand crore", both of which seem comparatively rare, even in "Indian English".
"lakh crore" (same as "trillion") is surprisingly common though; but not covered by the rules above.
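The equivalences behind these search terms can be checked with plain arithmetic. This short Python sketch assumes the usual values of lakh (10^5) and crore (10^7); the names are chosen for illustration only.

```python
LAKH = 10**5
CRORE = 10**7

# Compound expressions from the search terms above and their values.
equivalents = {
    "hundred crore": 100 * CRORE,    # one billion
    "thousand crore": 1000 * CRORE,  # ten billion
    "lakh crore": LAKH * CRORE,      # one trillion
}

if __name__ == "__main__":
    for phrase, value in equivalents.items():
        print(f"{phrase}: {value:,}")
```

With the proposed rules, 10^12 is handled by the trillion rule, so a "lakh crore" spelling would need an additional rule at that base value.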