Re: NNBSP

From: Marcel Schneider via Unicode <unicode_at_unicode.org>
Date: Sat, 19 Jan 2019 01:05:31 +0100

On 18/01/2019 23:46, Shawn Steele wrote:
>
> >> Keeping these applications outdated has no other benefit than providing a handy lobbying tool against support of NNBSP.
>
> I believe you’ll find that there are some French banks and other institutions that depend on such obsolete applications (unfortunately).
>
If they are obsolete apps, they don’t use CLDR / ICU, as these are designed for up-to-date and fully localized apps. So one hassle is off the table.
>
> Additionally, I believe you’ll find that there are many scenarios where older applications and newer applications need to exchange data.  Either across the network, the web, or even on the same machine.  One app expecting NNBSP and another expecting NBSP on the same machine will likely lead to confusion.
>
I didn’t look into these data interchanges, but I suspect they won’t use any thousands separator at all to interchange data. The group separator is only for display and print, and there you may wish to use a compat library for obsolete apps and a newer library for apps with Unicode support. An app that is so obsolete will keep working without new data from ICU.
>
> This could be something a “new” app running with the latest & greatest locale data and trying to import the legacy data users had saved on that app.  Or exchanging data with an application using the system settings which are perhaps older.
>
Again, I don’t believe that apps are storing numbers with thousands separators in them. Not even spreadsheet software does that. I say “not even” because these are high-end apps that are expected to carry the latest locale data.
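
As a rough sketch of the distinction drawn here (Python; the helper names to_interchange, to_display and parse_display are hypothetical and not taken from ICU or CLDR), interchange can carry plain digits while the group separator is applied only at display time. The lenient parser is an added assumption, covering the case where displayed text does get re-imported:

```python
NBSP = "\u00a0"   # U+00A0 NO-BREAK SPACE
NNBSP = "\u202f"  # U+202F NARROW NO-BREAK SPACE


def to_interchange(value: int) -> str:
    """Serialize for storage or exchange: plain digits, no group separator."""
    return str(value)


def to_display(value: int, group_sep: str = NNBSP) -> str:
    """Format for display or print, grouping digits by three."""
    # Python's "," format spec groups by thousands; swap in the wanted separator.
    return f"{value:,}".replace(",", group_sep)


def parse_display(text: str) -> int:
    """Leniently parse a displayed integer, accepting NBSP or NNBSP as separator."""
    return int(text.replace(NBSP, "").replace(NNBSP, ""))


amount = 1234567
stored = to_interchange(amount)        # what goes over the wire or to disk
modern = to_display(amount)            # newer data: NNBSP between groups
legacy = to_display(amount, NBSP)      # compat data: NBSP between groups
```

Under these assumptions both display forms parse back to the same integer, so the choice of space only matters on screen and on paper.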

Sorry, you skipped this one:

>> What are all these expected to do while localized with scripts outside Windows code pages?

Indeed, that is the paradox: Tirhuta users are entitled to correct display with the newest data, while Latin users are stuck indefinitely with old data and legacy display.
>
> >> Also when you need those apps, just tailor your French accordingly.
>
> Having the user attempt to “correct” their settings may not be sufficient to resolve these discrepancies because not all applications or frameworks properly consider the user overrides on all platforms.
>
Not the user. I’m addressing your concerns as coming from the developer side. I meant that you should use the data as appropriate, and if a character is not supported, just replace it for convenience.
>
> >> That should not impact all other users out there interested in a civilized layout.
>
> I’m not sure that the choice of the word “civilized” adds value to the conversation.
>
That word is meant to sum up in English what user feedback is or can be, even if not all the time. Users complain that quotation marks are spaced off too far when typeset with NBSP, as Word does. It’s really ugly, they say. NBSP is a character with a precise usage; it’s not one-size-fits-all. BTW, as you are in the job, why does Word not provide a checkbox option letting the user set the space as desired, NBSP or NNBSP?
>
>   We have pretty much zero feedback that the OS’s French formatting is “uncivilized” or that the NNBSP is required for correct support.
>
That is because at some point users stop submitting feedback, once they see how little use it is to spend time posting it. From the “pretty much zero” you may wish to pick out the one or two reports you do get, and assume that for each of them there are one thousand other users out there with the same feedback who are not submitting it. One thousand or one million, it’s hard to be precise…
>
> >> As long as SegoeUI has NNBSP support, no worries, that’s what CLDR data is for.
>
> For compatibility, I’d actually much prefer that CLDR have an alt “best practice” field that maintained the existing U+00A0 behavior for compatibility, yet allowed applications wanting the newer typographic experience to opt-in to the “best practice” alternative data.  As applications became used to the idea of an alternative for U+00A0, then maybe that could be flip-flopped and put U+00A0 into a “legacy” alt form in a few years.
>
You don’t need that field in CLDR. Here’s how it works: take the locale data, search-and-replace all NNBSP with NBSP, and there is the library you’ll use.
Replace all of them, because NNBSP is not only in the group separator. I’d suggest downloading common/main/fr.xml and checking all instances of NNBSP. The legacy apps you’re referring to surely don’t use that data. That data is for fine high-end apps and for the user interfaces of Windows and any other OS. If you want your employer to be well served, you should prefer the correct data, not legacy fallbacks.
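
The search-and-replace step could look roughly like this in Python. This is a minimal sketch: the function names are hypothetical, the XML fragment is made up for illustration and is not copied from the real fr.xml, and it assumes the spaces appear in the data as literal characters rather than as escapes.

```python
NBSP = "\u00a0"   # U+00A0 NO-BREAK SPACE
NNBSP = "\u202f"  # U+202F NARROW NO-BREAK SPACE


def count_nnbsp(locale_xml: str) -> int:
    """Count every NNBSP in the data, not only the one in the group separator."""
    return locale_xml.count(NNBSP)


def to_legacy(locale_xml: str) -> str:
    """Return a compat copy of the data with every NNBSP replaced by NBSP."""
    return locale_xml.replace(NNBSP, NBSP)


# Illustrative fragment only; element contents are invented for the example.
sample = (
    "<numbers>"
    f"<group>{NNBSP}</group>"
    f"<pattern>#,##0{NNBSP}%</pattern>"
    "</numbers>"
)

before = count_nnbsp(sample)
legacy = to_legacy(sample)
after = count_nnbsp(legacy)
```

For a real file the same two functions would be applied to the text of a downloaded copy, for example the result of pathlib.Path("common/main/fr.xml").read_text(encoding="utf-8"), and the output written to a separate compat copy so that the correct data stays untouched.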
>
> Normally I’m all for having the “best” data in CLDR, and there are many locales that have data with limited support for whatever reasons.  U+00A0 is pretty exceptional in my view though, developers have been hard-coding dependencies on that value for ½ a century without even realizing there might be other types of non-breaking spaces.  Sure, that’s not really the best practice, particularly in modern computing, but I suspect you’ll still find it taught in CS classes with little regard to things like NNBSP.
>
There have been threads about Unicode in CS curricula. I don’t believe that teachers would be doing their students any good by training them to ignore Unicode. Such teachers would be irresponsible in not preparing their students for real life. But I won’t base any statements on mere suspicions.

BTW, Latin-1 did not exist 50 years ago. As a rough guess it came up in the early eighties, and NBSP with it, but I may be wrong.

The point in sticking with old charsets is, again, to deny Unicode support to one third of mankind. I don’t think that this is doing any good.

Marcel
Received on Fri Jan 18 2019 - 18:05:57 CST
