
CLDR Ticket #10290 (closed data: fixed)

Opened 2 months ago

Last modified 3 weeks ago

RBNF for Indian English

Reported by: grhoten
Owned by: grhoten
Component: rbnf
Phase: rc
Review: sascha

Description

When numbers are spelled out in Indian English, most of the words are the same as in base English, but a few Indian (Hindi/Urdu) words are used for larger numbers. I've been told that these words are used for cardinal numbers throughout India, so it would be ideal to label this RBNF data as en_IN. While a locale like en-t-hi-h0-hybrid could technically be used, the situation is closer to how Swiss French and French in France differ in their number pronunciation.

Below are the proposed rules. The main difference is that lakh and crore are used instead of million and billion in the cardinal and numbering variants. Everything else should be the same as base English, including ordinals.

%%lenient-parse:
&[last primary ignorable ] << ' ' << ',' << '-' << '­';
%%2d-year:
0: hundred;
1: oh-=%spellout-numbering=;
10: =%spellout-numbering=;
%spellout-numbering-year:
-x: minus >>;
x.x: =#,##0.#=;
0: =%spellout-numbering=;
1010/100: << >%%2d-year>;
1100/100: << >%%2d-year>;
2000: =%spellout-numbering=;
2010/100: << >%%2d-year>;
2100/100: << >%%2d-year>;
3000: =%spellout-numbering=;
3010/100: << >%%2d-year>;
3100/100: << >%%2d-year>;
4000: =%spellout-numbering=;
4010/100: << >%%2d-year>;
4100/100: << >%%2d-year>;
5000: =%spellout-numbering=;
5010/100: << >%%2d-year>;
5100/100: << >%%2d-year>;
6000: =%spellout-numbering=;
6010/100: << >%%2d-year>;
6100/100: << >%%2d-year>;
7000: =%spellout-numbering=;
7010/100: << >%%2d-year>;
7100/100: << >%%2d-year>;
8000: =%spellout-numbering=;
8010/100: << >%%2d-year>;
8100/100: << >%%2d-year>;
9000: =%spellout-numbering=;
9010/100: << >%%2d-year>;
9100/100: << >%%2d-year>;
10000: =%spellout-numbering=;
%spellout-numbering:
-x: minus >>;
Inf: infinity;
NaN: not a number;
0: =%spellout-cardinal=;
%spellout-numbering-verbose:
-x: minus >>;
Inf: infinity;
NaN: not a number;
0: =%spellout-cardinal-verbose=;
%spellout-cardinal:
-x: minus >>;
x.x: << point >>;
Inf: infinite;
NaN: not a number;
0: zero;
1: one;
2: two;
3: three;
4: four;
5: five;
6: six;
7: seven;
8: eight;
9: nine;
10: ten;
11: eleven;
12: twelve;
13: thirteen;
14: fourteen;
15: fifteen;
16: sixteen;
17: seventeen;
18: eighteen;
19: nineteen;
20: twenty[->>];
30: thirty[->>];
40: forty[->>];
50: fifty[->>];
60: sixty[->>];
70: seventy[->>];
80: eighty[->>];
90: ninety[->>];
100: << hundred[ >>];
1000: << thousand[ >>];
100000:  << lakh[ >>];
10000000: << crore[ >>];
1000000000000: << trillion[ >>];
1000000000000000: << quadrillion[ >>];
1000000000000000000: =#,##0=;
%%and:
1: ' and =%spellout-cardinal-verbose=;
100: ' =%spellout-cardinal-verbose=;
%%commas:
1: ' and =%spellout-cardinal-verbose=;
100: , =%spellout-cardinal-verbose=;
1000: , <%spellout-cardinal-verbose< thousand[>%%commas>];
1000000: , =%spellout-cardinal-verbose=;
%spellout-cardinal-verbose:
-x: minus >>;
x.x: << point >>;
Inf: infinite;
NaN: not a number;
0: =%spellout-numbering=;
100: << hundred[>%%and>];
1000: << thousand[>%%and>];
100000:  << lakh[>%%commas>];
10000000: << crore[>%%commas>];
1000000000000: << trillion[>%%commas>];
1000000000000000: << quadrillion[>%%commas>];
1000000000000000000: =#,##0=;
%%tieth:
0: tieth;
1: ty-=%spellout-ordinal=;
%%th:
0: th;
1: ' =%spellout-ordinal=;
%spellout-ordinal:
-x: minus >>;
x.x: =#,##0.#=;
Inf: infinitieth;
0: zeroth;
1: first;
2: second;
3: third;
4: fourth;
5: fifth;
6: sixth;
7: seventh;
8: eighth;
9: ninth;
10: tenth;
11: eleventh;
12: twelfth;
13: =%spellout-numbering=th;
20: twen>%%tieth>;
30: thir>%%tieth>;
40: for>%%tieth>;
50: fif>%%tieth>;
60: six>%%tieth>;
70: seven>%%tieth>;
80: eigh>%%tieth>;
90: nine>%%tieth>;
100: <%spellout-numbering< hundred>%%th>;
1000: <%spellout-numbering< thousand>%%th>;
1000000: <%spellout-numbering< million>%%th>;
1000000000: <%spellout-numbering< billion>%%th>;
1000000000000: <%spellout-numbering< trillion>%%th>;
1000000000000000: <%spellout-numbering< quadrillion>%%th>;
1000000000000000000: =#,##0=.;
%%and-o:
0: th;
1: ' and =%spellout-ordinal-verbose=;
100: ' =%spellout-ordinal-verbose=;
%%commas-o:
0: th;
1: ' and =%spellout-ordinal-verbose=;
100: , =%spellout-ordinal-verbose=;
1000: , <%spellout-cardinal-verbose< thousand>%%commas-o>;
1000000: , =%spellout-ordinal-verbose=;
%spellout-ordinal-verbose:
-x: minus >>;
x.x: =#,##0.#=;
Inf: infinitieth;
0: =%spellout-ordinal=;
100: <%spellout-numbering-verbose< hundred>%%and-o>;
1000: <%spellout-numbering-verbose< thousand>%%and-o>;
100000/1000: <%spellout-numbering-verbose< thousand>%%commas-o>;
1000000: <%spellout-numbering-verbose< million>%%commas-o>;
1000000000: <%spellout-numbering-verbose< billion>%%commas-o>;
1000000000000: <%spellout-numbering-verbose< trillion>%%commas-o>;
1000000000000000: <%spellout-numbering-verbose< quadrillion>%%commas-o>;
1000000000000000000: =#,##0=.;
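
To make the effect of the proposed %spellout-cardinal rules easier to see, here is a minimal Python sketch that mimics how they would be applied. It is not the ICU engine: it assumes the default radix of 10, covers integers only (no x.x, Inf, or NaN rules), and the names RULES and spell are made up for this illustration.

```python
import bisect

# (base value, rule text) pairs transcribed from the proposed
# %spellout-cardinal rules above. This is a simplified sketch, not ICU.
ONES = ("zero one two three four five six seven eight nine ten eleven twelve "
        "thirteen fourteen fifteen sixteen seventeen eighteen nineteen").split()
TENS = "twenty thirty forty fifty sixty seventy eighty ninety".split()

RULES = [(i, word) for i, word in enumerate(ONES)]
RULES += [(20 + 10 * i, word + "[->>]") for i, word in enumerate(TENS)]
RULES += [
    (100, "<< hundred[ >>]"),
    (1000, "<< thousand[ >>]"),
    (100000, "<< lakh[ >>]"),
    (10000000, "<< crore[ >>]"),
    (10**12, "<< trillion[ >>]"),
    (10**15, "<< quadrillion[ >>]"),
]
BASES = [base for base, _ in RULES]


def spell(n):
    """Spell out an integer using the rule table above (hypothetical helper)."""
    if n < 0:
        return "minus " + spell(-n)          # -x: minus >>;
    if n >= 10**18:
        return f"{n:,}"                      # =#,##0=; falls back to digits
    # Pick the rule with the largest base value that does not exceed n.
    base, text = RULES[bisect.bisect_right(BASES, n) - 1]
    # With radix 10, the divisor is the largest power of ten <= base.
    divisor = 10 ** (len(str(base)) - 1) if base else 1
    quotient, remainder = divmod(n, divisor)
    # Bracketed text is emitted only when there is a non-zero remainder.
    if "[" in text:
        head, rest = text.split("[", 1)
        optional, tail = rest.split("]", 1)
        text = head + (optional if remainder else "") + tail
    if "<<" in text:
        text = text.replace("<<", spell(quotient))
    if ">>" in text:
        text = text.replace(">>", spell(remainder))
    return text


for sample in (21, 100000, 12345678, 1000000000):
    print(sample, "->", spell(sample))
```

In this sketch, 12345678 comes out as "one crore twenty-three lakh forty-five thousand six hundred seventy-eight", and 1000000000 as "one hundred crore", because no rule sits between crore and trillion.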

Change History

comment:1 Changed 2 months ago by pedberg

  • Owner changed from anybody to grhoten
  • Phase changed from dsub to rc
  • Priority changed from assess to major
  • Status changed from new to accepted
  • Milestone changed from UNSCH to 32

comment:2 Changed 2 months ago by fredrik

  • Cc fredrik added

comment:3 Changed 2 months ago by grhoten

  • Status changed from accepted to reviewing
  • Review set to sascha

comment:4 Changed 5 weeks ago by pedberg

I am integrating CLDR trunk into ICU, and these changes cause a rule parse failure (we end up with what appears to be a duplicate rule for value="100000"):

The old rules had
    <ruleset type="spellout-cardinal-verbose">
        ...
        <rbnfrule value="100000" radix="1000">←← thousand[→%%commas→];</rbnfrule>
        <rbnfrule value="1000000">←← million[→%%commas→];</rbnfrule>
        ...
but then the latter rule above was changed as follows, with a different value:
        <rbnfrule value="100000">←← lakh[→%%commas→];</rbnfrule>
This appears to conflict with the previous rule for <rbnfrule value="100000" radix="1000">, so there are now two rules for value="100000". I think we should delete the previous rule:
        <rbnfrule value="100000" radix="1000">←← thousand[→%%commas→];</rbnfrule>
Correct?
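
As a rough illustration of the conflict described in this comment, the following Python sketch scans a ruleset for base values that occur more than once. The SAMPLE fragment is assembled from the rules quoted in this ticket and is only illustrative; duplicate_values is a hypothetical helper, not part of any CLDR or ICU tooling.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Illustrative fragment only: the first two rules are the ones quoted in
# this comment; the crore rule is the XML form of the rule in the description.
SAMPLE = """<ruleset type="spellout-cardinal-verbose">
    <rbnfrule value="100000" radix="1000">←← thousand[→%%commas→];</rbnfrule>
    <rbnfrule value="100000">←← lakh[→%%commas→];</rbnfrule>
    <rbnfrule value="10000000">←← crore[→%%commas→];</rbnfrule>
</ruleset>"""


def duplicate_values(ruleset_xml):
    """Return the value attributes that appear on more than one rbnfrule."""
    root = ET.fromstring(ruleset_xml)
    counts = Counter(rule.get("value") for rule in root.iter("rbnfrule"))
    return sorted(value for value, count in counts.items() if count > 1)


print(duplicate_values(SAMPLE))
```

A check like this could be run over each ruleset before integration, so that a leftover rule with the same base value is noticed early.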

comment:5 Changed 5 weeks ago by kent.karlsson14@…

Yeah...

But: Are you entirely sure that the "billion" rules should be removed? I'd triple-check that...

comment:6 Changed 5 weeks ago by grhoten

India uses lakh and crore. Billions are not normally used.

comment:7 Changed 4 weeks ago by kent.karlsson14@…

A little web search (I know the numbers aren't very reliable; still):

"crore" site:.in             About 4,070,000 results
"billion" site:.in           About 2,530,000 results (but quite a lot of false positives)
"hundred crore" site:.in     About    11,000 results (incl. some false positives)
"thousand crore" site:.in    About    16,200 results
"trillion" site:.in          About   456,000 results
"lakh crore" site:.in        About   185,000 results

OK, presumably not all of these hits are in "Indian English", despite the "site:.in"; some are in "ordinary" English.

Anyhow, it still seems to me that "... billion" (very common, even discounting false positives) may be preferred over "hundred crore"/"thousand crore", both of which seem comparatively rare, even in "Indian English".

"lakh crore" (same as "trillion") is surprisingly common, though it is not covered by the rules above.
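
For reference, the arithmetic behind the compound terms compared above can be checked with a short Python sketch. It assumes the usual values lakh = 10^5 and crore = 10^7, and the short-scale billion and trillion; the variable names are illustrative only.

```python
# Indian-system units expressed as powers of ten.
LAKH = 10**5
CRORE = 10**7

# Short-scale units used by base English.
BILLION = 10**9
TRILLION = 10**12

# The compound terms from the search comparison above.
compounds = {
    "hundred crore": 100 * CRORE,
    "thousand crore": 1000 * CRORE,
    "lakh crore": LAKH * CRORE,
}

for name, value in compounds.items():
    print(f"{name:15} = {value:,}")
```

So "hundred crore" is one billion, "thousand crore" is ten billion, and "lakh crore" is one trillion. Under the proposed %spellout-cardinal rules, 10^9 falls under the crore rule, while 10^12 has its own trillion rule.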

comment:8 Changed 3 weeks ago by sascha

  • Status changed from reviewing to closed
  • Resolution set to fixed