CLDR Ticket #7833 (accepted data)

Opened 3 years ago

Last modified 2 years ago

Redundant plural rules for Tagalog and Filipino?

Reported by: verdy_p@…
Owned by: mark
Component: plurals
Data Locale: tl, fil
Phase: dsub
Review:
Weeks:
Data Xpath:
Xref:

Description

The current plural rules for Tagalog and Filipino are unnecessarily redundant:

one:

v = 0 and i = 1,2,3 or
v = 0 and i % 10 != 4,6,9 or
v != 0 and f % 10 != 4,6,9

Whenever the first alternative (integers 1, 2, 3) is true, the second alternative (integers whose units digit is not 4, 6, or 9) is also true.

The LDML specification says that "plural rules must be mutually exclusive" (to be "self-contained and not depend on the ordering" of the syntax); this does not hold for the first two alternatives.

The first alternative is not necessary at all! The rules are equivalent to

one:

v = 0 and i % 10 != 4,6,9 or
v != 0 and f % 10 != 4,6,9;
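
As a rough check of this equivalence, the Python sketch below evaluates both versions of the rule over a range of formatted numbers. The helper names (operands, one_current, one_simplified) are illustrative only and are not part of CLDR or ICU; the parsing assumes plain non-negative decimal strings with "." as the separator.

```python
def operands(formatted):
    """Return (i, v, f) for a formatted decimal string such as "1.40"."""
    int_part, _, frac_part = formatted.partition(".")
    i = int(int_part)                       # integer digits
    v = len(frac_part)                      # number of visible fraction digits
    f = int(frac_part) if frac_part else 0  # visible fraction digits, with trailing zeros
    return i, v, f


def one_current(formatted):
    """The "one" rule as quoted in this ticket, with all three alternatives."""
    i, v, f = operands(formatted)
    return ((v == 0 and i in (1, 2, 3))
            or (v == 0 and i % 10 not in (4, 6, 9))
            or (v != 0 and f % 10 not in (4, 6, 9)))


def one_simplified(formatted):
    """The same rule with the first alternative dropped."""
    i, v, f = operands(formatted)
    return ((v == 0 and i % 10 not in (4, 6, 9))
            or (v != 0 and f % 10 not in (4, 6, 9)))


# Integers 0..999 plus decimals with 1 to 3 visible fraction digits.
samples = [str(n) for n in range(1000)]
samples += [f"{n}.{d:0{w}d}" for n in range(30)
            for w in (1, 2, 3) for d in range(10 ** w)]

mismatches = [s for s in samples if one_current(s) != one_simplified(s)]
print("mismatches:", mismatches)
```

If the first alternative is redundant, as argued above, the mismatches list stays empty for every sample tried.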

Now the only distinction is whether there are visible fraction digits, in which case the last visible digit of the fraction is used instead of the units digit. In summary, all that matters is the last digit displayed, whether in the integer part or in the fraction: the number is singular (one) when that digit is 0, 1, 2, 3, 5, 7, or 8, and plural (other) when it is 4, 6, or 9:

1.4 is plural (last digit is 4)
1.40 is singular (last digit is 0)

Maybe we could have a new integer operand "u" containing the value of the last displayed component (the integer part when there's no fraction, otherwise the visible fraction digits):

displayed : operand
"0" : u=0 (singular)
"0.00" : u=0 (singular)
"0.5" : u=5 (singular)
"0.5004" : u=5004 (plural)
"1" : u=1 (singular)
"4" : u=4 (plural)
"1.4" : u=4 (plural)
"1.40" : u=40 (singular)

In that case the Tagalog/Filipino rule reduces to a single condition (followed below by its samples):

one:

u % 10 != 4, 6, 9
@integer 0~3, 5, 7, 8, 10~13, 15, 17, 18, 20, 21, 100, 1000, 10000, 100000, 1000000, ...
@decimal 0.0~0.3, 0.5, 0.7, 0.8, 1.0~1.3, 1.5, 1.7, 1.8, 2.0, 2.1, 10.90, 100.01, 1000.002, 10000.0003, 100000.00005, 1000000.000007, ...;

other:

@integer 4, 6, 9, 14, 16, 19, 24, 104, 1006, 10009, 100004, 1000006, ...
@decimal 0.4, 0.6, 0.9, 1.4, 1.6, 1.9, 2.4, 2.6, 10.9, 100.04, 1000.006, 10000.0009, 100000.000004, 1000000.0000006, ...
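
A minimal Python sketch of the proposed operand, for illustration only: "u" does not exist in LDML, and the function names u_operand and tl_category are made up here. It assumes plain non-negative decimal strings with "." as the separator, reproduces the operand table above, and checks a selection of the sample values listed under "one" and "other".

```python
def u_operand(formatted):
    """Hypothetical operand u: the last displayed component, read as an integer."""
    int_part, _, frac_part = formatted.partition(".")
    return int(frac_part) if frac_part else int(int_part)


def tl_category(formatted):
    """Proposed single-condition rule: "one" if u % 10 != 4, 6, 9, else "other"."""
    return "other" if u_operand(formatted) % 10 in (4, 6, 9) else "one"


# Rows of the operand table above: displayed string -> expected u.
TABLE = {"0": 0, "0.00": 0, "0.5": 5, "0.5004": 5004,
         "1": 1, "4": 4, "1.4": 4, "1.40": 40}

# A selection of the @integer / @decimal samples from the proposed rule.
ONE_SAMPLES = ["0", "3", "5", "7", "8", "10", "13", "21", "1000",
               "0.0", "0.3", "1.5", "2.1", "10.90", "100.01",
               "1000.002", "10000.0003", "100000.00005", "1000000.000007"]
OTHER_SAMPLES = ["4", "6", "9", "14", "24", "104", "1006", "10009",
                 "0.4", "1.6", "2.4", "10.9", "100.04", "1000.006",
                 "10000.0009", "100000.000004", "1000000.0000006"]

table_ok = all(u_operand(s) == u for s, u in TABLE.items())
samples_ok = (all(tl_category(s) == "one" for s in ONE_SAMPLES)
              and all(tl_category(s) == "other" for s in OTHER_SAMPLES))
print(table_ok, samples_ok)
```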

I suspect this is more complex than that: either there's a missing "and" clause in the 1st condition, or "v=0" in the 1st alternative should have been dropped (to match independently of the presence of visible fractions).

Change History

comment:1 Changed 3 years ago by mark

  • Owner changed from anybody to mark
  • Status changed from new to assigned
  • Milestone changed from UNSCH to 27dvet

comment:2 Changed 3 years ago by markus

  • Phase set to dvet
  • Milestone changed from 27dvet to 27

comment:3 Changed 3 years ago by Eemeli Aro <eemeli@…>

Other locales for which the symbol u would simplify the rules include at least bs/hr/sh/sr, dsb/hsb, lv/prg, and mk.

That's in fact every use of the symbol f, with the exception of lt's "many", which could equivalently be expressed using the symbol t.

Here are some of the simplifications that could be made with u:

bs/hr/sh/sr now:

"one": "v = 0 and i % 10 = 1 and i % 100 != 11 or f % 10 = 1 and f % 100 != 11"
"few": "v = 0 and i % 10 = 2..4 and i % 100 != 12..14 or f % 10 = 2..4 and f % 100 != 12..14"

bs/hr/sh/sr with u:

"one": "u % 10 = 1 and u % 100 != 11"
"few": "u % 10 = 2..4 and u % 100 != 12..14"

dsb/hsb now:

"one": "v = 0 and i % 100 = 1 or f % 100 = 1"
"two": "v = 0 and i % 100 = 2 or f % 100 = 2"
"few": "v = 0 and i % 100 = 3..4 or f % 100 = 3..4"

dsb/hsb with u:

"one": "u % 100 = 1"
"two": "u % 100 = 2"
"few": "u % 100 = 3..4"

mk now:

"one": "v = 0 and i % 10 = 1 or f % 10 = 1"

mk with u:

"one": "u % 10 = 1"

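The rewrites above can be compared mechanically. The Python sketch below transcribes each rule pair by hand from this comment and looks for formatted numbers on which the two forms disagree. The operand "u" is the hypothetical one proposed in this ticket, the names operands and RULES are illustrative, and the parsing assumes plain non-negative decimal strings with "." as the separator.

```python
def operands(formatted):
    """Return (i, v, f, u) for a non-negative decimal string such as "10.90"."""
    int_part, _, frac_part = formatted.partition(".")
    i = int(int_part)
    v = len(frac_part)                      # number of visible fraction digits
    f = int(frac_part) if frac_part else 0
    u = f if v else i                       # hypothetical operand from this ticket
    return i, v, f, u


# name -> (rule as currently written, rule rewritten with u)
RULES = {
    "bs/hr/sh/sr one": (
        lambda i, v, f, u: (v == 0 and i % 10 == 1 and i % 100 != 11)
                           or (f % 10 == 1 and f % 100 != 11),
        lambda i, v, f, u: u % 10 == 1 and u % 100 != 11,
    ),
    "bs/hr/sh/sr few": (
        lambda i, v, f, u: (v == 0 and 2 <= i % 10 <= 4 and not 12 <= i % 100 <= 14)
                           or (2 <= f % 10 <= 4 and not 12 <= f % 100 <= 14),
        lambda i, v, f, u: 2 <= u % 10 <= 4 and not 12 <= u % 100 <= 14,
    ),
    "dsb/hsb one": (
        lambda i, v, f, u: (v == 0 and i % 100 == 1) or f % 100 == 1,
        lambda i, v, f, u: u % 100 == 1,
    ),
    "dsb/hsb two": (
        lambda i, v, f, u: (v == 0 and i % 100 == 2) or f % 100 == 2,
        lambda i, v, f, u: u % 100 == 2,
    ),
    "dsb/hsb few": (
        lambda i, v, f, u: (v == 0 and 3 <= i % 100 <= 4) or 3 <= f % 100 <= 4,
        lambda i, v, f, u: 3 <= u % 100 <= 4,
    ),
    "mk one": (
        lambda i, v, f, u: (v == 0 and i % 10 == 1) or f % 10 == 1,
        lambda i, v, f, u: u % 10 == 1,
    ),
}

# Integers 0..999 plus decimals with 1 to 3 visible fraction digits.
samples = [str(n) for n in range(1000)]
samples += [f"{n}.{d:0{w}d}" for n in (0, 1, 2, 11, 12, 21, 111)
            for w in (1, 2, 3) for d in range(10 ** w)]

disagreements = {
    name: [s for s in samples
           if bool(now(*operands(s))) != bool(with_u(*operands(s)))]
    for name, (now, with_u) in RULES.items()
}
print(disagreements)
```

The two forms agree because f is 0 whenever v = 0, so the f-based half of each current rule can only fire when fraction digits are visible, which is exactly when u equals f.
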
comment:4 Changed 3 years ago by mark

  • Priority changed from assess to medium

comment:5 Changed 3 years ago by verdy_p@…

The reason for this "strange" behavior, where the same value can be singular or plural, is the way numbers are spoken when there are visible fractions: the integer part has its own singular/plural rule (omitted when digits are written, because the decimal separator is just an invariant symbol), then the decimal separator is pronounced, then the fractional part, followed by the unit, which takes its plural form separately.

In other words, when fractions are written (even if they are just zeroes), these fractions take priority.

For this reason, a number displayed as "10" may be plural in some languages, while "10.0" could be singular because only "0" is considered; likewise "10.1" and "10.10" can have different plurals (just discard the "10." part and consider "1" and "10").

This has a side effect when numbers can be formatted with a variable precision. In those languages you need to set the precision for fractions explicitly if the word for the unit following the number is fixed and does not depend on the value. But ideally the plural rules in CLDR should allow choosing the correct word based on the formatted number (independently of its initial internal binary value before formatting it to a string, because formatting can introduce rounding).
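
To illustrate the dependence on formatting, here is a small Python sketch that formats one value at several precisions and applies the Tagalog/Filipino condition from the ticket description to the resulting string. The function names are illustrative only, and Python's decimal module stands in for a real locale-aware number formatter.

```python
from decimal import Decimal


def tl_category(formatted):
    """"other" if the last displayed digit is 4, 6 or 9, else "one"."""
    int_part, _, frac_part = formatted.partition(".")
    last_component = int(frac_part) if frac_part else int(int_part)
    return "other" if last_component % 10 in (4, 6, 9) else "one"


def category_at_precision(value, digits):
    """Format value with a fixed number of fraction digits, then classify the string."""
    formatted = format(Decimal(value), f".{digits}f")
    return formatted, tl_category(formatted)


for value, digits in [("1.4", 1), ("1.4", 2), ("1.96", 1), ("1.96", 2)]:
    print(value, digits, category_at_precision(value, digits))
```

The same stored value lands in different categories depending on the precision chosen, and rounding ("1.96" shown as "2.0") can change the category as well, which is why the rule has to be evaluated on the formatted string.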

comment:6 Changed 3 years ago by mark

  • Phase changed from dvet to dsub
  • Milestone changed from 27 to 28

comment:7 Changed 3 years ago by markus

  • Type set to data

comment:8 Changed 3 years ago by srl

  • Status changed from assigned to accepted

comment:9 Changed 2 years ago by mark

  • Milestone changed from 28 to 29

After data submission, the priority for these drops, so moving to the start of the next cycle.

comment:10 Changed 2 years ago by verdyp@…

This bug was submitted long before the data submission and has already
been postponed several times. Apparently you don't seem to take the initial
request seriously, and each time you delay it to another future branch it is
forgotten.
Is it so complex to handle?

comment:11 Changed 2 years ago by emmons

  • Milestone changed from 29 to upcoming

Auto move of all 29 -> upcoming
