CLDR Ticket #11217 (accepted)

Opened 5 months ago

Last modified 3 days ago

change thousands separator from NBSP to NNBSP

Reported by: Stanislav Brabec <sbrabec@…>
Owned by: anybody
Component: numbers
Data Locale:
Phase: dsub
Review:
Weeks:
Data Xpath:
Xref:

Description

CLDR version 33 proposes NBSP (U+00A0 NO-BREAK SPACE) as the thousands separator in many locales [1]. Although this is acceptable, it is visually wrong: the wide gap can make a single grouped number look like a series of separate three-digit numbers.

That is why typography textbooks, as well as ISO 31-0 [2], the International Bureau of Weights and Measures, the International Union of Pure and Applied Chemistry, and the American Medical Association [3], recommend using thin/narrow/small spaces instead of standard spaces.

NNBSP (U+202F NARROW NO-BREAK SPACE) is the only Unicode character that matches a thin non-breaking space.

That is why I am proposing to change the "By-Type Chart: Numbers:Symbols group" section [1] from NBSP to NNBSP for the locales mentioned.
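
To make the difference concrete, here is a minimal Python sketch (not CLDR or ICU code) that groups the digits of a number with either separator. The helper name `group_digits` is hypothetical and exists only for this illustration; the two code points, U+00A0 and U+202F, are the characters under discussion.

```python
import unicodedata

NBSP = "\u00a0"   # NO-BREAK SPACE
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE


def group_digits(value: int, separator: str) -> str:
    """Group the digits of an integer in threes using `separator`.

    Hypothetical helper for illustration only; real locale-aware
    formatting should go through ICU or the C library.
    """
    # Python's "," format spec groups by three; swap in the separator.
    return f"{value:,}".replace(",", separator)


# Print the same number with each separator, labelled by code point.
for sep in (NBSP, NNBSP):
    label = f"U+{ord(sep):04X} {unicodedata.name(sep)}"
    print(f"{label}: {group_digits(1234567, sep)}")
```

How different the two lines look depends on the font and renderer; in a monospaced terminal both separators may occupy a full cell.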

References:
[1] https://www.unicode.org/cldr/charts/33/by_type/numbers.symbols.html#a1ef41eaeb6982d
[2] https://en.wikipedia.org/wiki/ISO_31-0#Numbers
[3] https://en.wikipedia.org/wiki/Decimal_separator#Digit_grouping

Change History

comment:1 Changed 5 months ago by mark

  • Owner changed from anybody to discuss
  • Status changed from new to accepted

Need to check that these are parse variants. Typographically the right approach. For discussion.
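
One reading of "parse variants" is that a lenient parser should accept NBSP, NNBSP, and similar space characters interchangeably as grouping separators, so that changing the formatted output does not break parsing of existing text. The following Python sketch shows that idea only; the set of accepted characters and the function name `parse_grouped_int` are assumptions made for illustration, not CLDR's actual data or ICU's API.

```python
import re

# Space-like characters accepted as grouping separators when parsing.
# This set is an assumption for illustration, not CLDR's actual list.
GROUPING_VARIANTS = {
    "\u0020",  # SPACE
    "\u00a0",  # NO-BREAK SPACE (NBSP)
    "\u2009",  # THIN SPACE
    "\u202f",  # NARROW NO-BREAK SPACE (NNBSP)
}


def parse_grouped_int(text: str) -> int:
    """Parse an integer whose digit groups may use any variant above."""
    # Drop every accepted separator, then require plain ASCII digits.
    cleaned = "".join(ch for ch in text if ch not in GROUPING_VARIANTS)
    if not re.fullmatch(r"-?[0-9]+", cleaned):
        raise ValueError(f"not a grouped integer: {text!r}")
    return int(cleaned)


# Both spellings parse to the same value.
print(parse_grouped_int("1\u00a0234\u00a0567") == parse_grouped_int("1\u202f234\u202f567"))
```

This sketch does not check that the groups are well formed (for example, exactly three digits each); a real parser would likely be stricter.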

comment:2 Changed 5 months ago by jukkakk@…

The ISO 31 series of standards has been superseded, but the current standard ISO 80000-1 mentions “small space”, in clause 7.3.1: “To facilitate the reading of numbers with many digits, these may be separated into groups of three, counting from the decimal sign towards the left and the right. [...] Where such separation into groups of three is used, the groups shall be separated by a small space and not by a point or a comma or by any other means.”

However, this is just ISO reference notation for numbers, and the separators are optional (“may”). As we know, many languages, including English, do not use that notation. In the PDF form of the standard, the separator used in the examples is SPACE. I do not think “small space” is meant to specify a space character narrower than SPACE; rather, it is an informal expression for spacing between characters, described as “small” in some sense.

So I don’t think the standard requires or recommends the use of any particular space character. Neither does it exclude the use of specific space characters.

I think the issue is primarily typographic. For typographic reasons, the spacing between digits (when used) should be narrower than SPACE and non-breakable, though the non-breakability (as well as the amount of spacing) can be accomplished at higher protocol levels. So I think that as far as localization is concerned, the crucial issue is whether the grouping (thousands) separator is a space character or some visible character. Using U+202F is an interesting possibility, though.

comment:3 Changed 3 months ago by Carlos O'Donell <carlos@…>

If group separation is used for digits, the standard recommends a "small space", and that is directly in line with typographical conventions. I agree nothing is required by the standard at this point.

My reason for responding on this ticket is to discuss harmonization between glibc locales and CLDR data which feeds into libicu.

While I agree that digit grouping can be accomplished at higher-level protocols, it is often easier to implement it directly as the text is being generated. For example, consider the common case of a website generated by PHP: there is no higher-level protocol beyond PHP itself (the browser has no rendering markup for numbers, though it could), and the low-level implementation is either glibc's locales or ICU's libraries. Harmonizing the two has a lot of value, so that both interfaces produce similar formatting, e.g. money_format() -> strfmon, or NumberFormatter::formatCurrency -> libicu.
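
As a rough illustration of why harmonization matters, the Python sketch below compares two strings that differ only in the grouping separator. The two strings are made-up stand-ins for the output of two formatting stacks, not captured output from glibc or ICU, and `same_number_text` is a hypothetical helper.

```python
NBSP = "\u00a0"   # NO-BREAK SPACE
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE

# Stand-ins for the output of two formatting stacks (values invented
# for illustration; not real glibc or ICU output).
from_library_a = f"1{NBSP}234{NBSP}567"
from_library_b = f"1{NNBSP}234{NNBSP}567"

# The strings look alike on screen but are not equal.
print(from_library_a == from_library_b)

# The UTF-8 encodings of the two separators also differ in length.
print(NBSP.encode("utf-8").hex(), NNBSP.encode("utf-8").hex())


def same_number_text(a: str, b: str) -> bool:
    """Compare two formatted numbers, treating NBSP and NNBSP as equal."""
    return a.replace(NBSP, NNBSP) == b.replace(NBSP, NNBSP)


print(same_number_text(from_library_a, from_library_b))
```

Code that compares, caches, or tests formatted output across the two stacks would need a workaround of this kind until both use the same separator.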

We are already deploying NNBSP in glibc's locales, and we are looking for input from CLDR experts on why NBSP was chosen there and whether we could harmonize on NNBSP.

Is there any reason not to use NNBSP?

comment:4 Changed 3 weeks ago by mark

  • type changed from charts to unknown

comment:5 Changed 3 days ago by mark

  • Owner changed from discuss to anybody
  • Milestone changed from UNSCH to discuss