
CLDR Ticket #11235(accepted)

Opened 5 months ago

Last modified 32 hours ago

Various whitespaces not marked up while bidi controls are

Reported by: Marcel Schneider <charupdate@…> Owned by: anybody
Component: survey-backend Data Locale: fr
Phase: dsub Review:
Weeks: Data Xpath:
Xref:

ticket:11173

ticket:7451

Description

CLDR ST forum post http://st.unicode.org/cldr-apps/v#forum/fr//27008 brought up that whitespace should be marked up in ST, as already bidi controls are marked up (e.g. <RLM>).

The case in ticket 11173 illustrates how important it is to be able to easily check presence of proper whitespace characters.

In the cited forum thread, Philippe Verdy suggests using colored boxes with identifier inside.

Maybe a color code would make it possible to combine standard text rendering with space identification, since we commonly need only U+202F with certain punctuation marks and between numbers and measurement units, and U+00A0 in some other cases such as a number followed by an associated word.

Please take this as a feature request.

Attachments

Change History

comment:1 Changed 5 months ago by Marcel Schneider <charupdate@…>

Given that ST already displays both the standard appearance (line 1) and marked-up invisible characters (line 2 in related items), marking up spaces could be implemented with little effort by adding gc=Zs to the handled categories, if any (where gc=Cf would then already be present for the bidi controls), and adding U+202F = "NNBSP" and U+00A0 = "NBSP" to the handled characters and short forms.
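
A minimal sketch of the category-based idea, assuming a JavaScript engine that supports Unicode property escapes (ES2018): `\p{Zs}` matches every character whose General_Category is Space_Separator, so U+00A0 and U+202F are caught without listing them one by one. The table `SHORT_NAMES` and the function `markSpaceSeparators` are hypothetical names for illustration and are not part of survey.js.

```javascript
// Hypothetical short forms; anything not listed falls back to a U+XXXX label.
var SHORT_NAMES = {
    "\u00A0": "NBSP",
    "\u202F": "NNBSP"
};

// Replace every gc=Zs character except the ordinary space U+0020
// with a visible <NAME> or <U+XXXX> label (plain text, not HTML).
function markSpaceSeparators(value) {
    return value.replace(/\p{Zs}/gu, function (ch) {
        if (ch === " ") {
            return ch; // leave ordinary spaces untouched
        }
        var hex = ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
        return "<" + (SHORT_NAMES[ch] || "U+" + hex) + ">";
    });
}
```

Characters without a short form, such as U+2009 THIN SPACE, would show up by code point, which avoids inventing abbreviations.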

comment:2 follow-up: ↓ 4 Changed 4 months ago by Marcel Schneider <charupdate@…>

Would you please disclose which of the files contains the list of displayable invisible characters.

http://unicode.org/cldr/trac/browser/branches/surveytool2/c/genldml/genldml.h

We should add U+202F to be displayed as <NNBSP>, and U+00A0 as <NBSP>.
If I know the file, I may do the edit, and submit it here for inclusion in SURVEY TOOL.

Thanks in advance.

comment:3 Changed 4 months ago by srl

  • Data Locale changed from French and others to fr
  • Data Xpath https://unicode.org/cldr/trac/ticket/11173 deleted
  • Xref set to 11173 7451

comment:4 in reply to: ↑ 2 ; follow-up: ↓ 6 Changed 4 months ago by srl

Replying to Marcel Schneider <charupdate@…>:

Would you please disclose which of the files contains the list of displayable invisible characters.

http://unicode.org/cldr/trac/browser/branches/surveytool2/c/genldml/genldml.h

We should add U+202F to be displayed as <NNBSP>, and U+00A0 as <NBSP>.
If I know the file, I may do the edit, and submit it here for inclusion in SURVEY TOOL.

Thanks in advance.

it's function checkLRMarker in survey.js - something like https://unicode.org/cldr/trac/browser/trunk/tools/cldr-apps/WebContent/js/survey.js#L2574

this was implemented in ticket:7451 if you want to read the history there.

comment:5 follow-ups: ↓ 7 ↓ 8 Changed 4 months ago by mark

The committee agreed that this would be a good idea. If you want to submit a patch, that would help.

comment:6 in reply to: ↑ 4 Changed 4 months ago by Marcel Schneider <charupdate@…>

Replying to srl:

it's function checkLRMarker in survey.js - something like https://unicode.org/cldr/trac/browser/trunk/tools/cldr-apps/WebContent/js/survey.js#L2574

this was implemented in ticket:7451 if you want to read the history there.

Thank you.
We might simply extend this to encompass NBSP and NNBSP. Another option, which I tried first, is to add a separate function; that would be cleaner but would significantly complicate the implementation. If the same function is extended, the additional couple of characters are supported automatically.

comment:7 in reply to: ↑ 5 Changed 4 months ago by Marcel Schneider <charupdate@…>

Replying to mark:

The committee agreed that this would be a good idea. If you want to submit a patch, that would help.

Thank you for accepting the scheme. I’ve tried to embed the addition directly into the existing function for a hopefully most streamlined implementation without any additional code elsewhere:

/**
 * Check if we need to display LRM/RLM marker, or to disambiguate NBSP/NNBSP
 * @param field: choice field to append if needed
 * @param dir:   direction of current locale (control float direction)
 * @param value: the value of votes (check &lrm; &rlm;)
 */
function checkLRmarker(field, dir, value){
        if (value) {
                if ( value.indexOf("\u200E") > -1 ||  value.indexOf("\u200F") > -1 ||  value.indexOf("\u00A0") > -1 ||  value.indexOf("\u202F") > -1 ) {
                        value = value.replace(/\u200E/g, "<span class=\"visible-mark\">&lt;LRM&gt;</span>")
                                     .replace(/\u200F/g, "<span class=\"visible-mark\">&lt;RLM&gt;</span>")
                                     .replace(/\u00A0/g, "<span class=\"visible-mark\">&lt;NBSP&gt;</span>")
                                     .replace(/\u202F/g, "<span class=\"visible-mark\">&lt;NNBSP&gt;</span>");
                        var lrm = document.createElement("div");
                        lrm.className = "lrmarker-container";
                        lrm.innerHTML = value;
                        field.appendChild(lrm);
                }
        }
}

Thanks for checking whether it solves the issue. If you prefer an extra function instead, I do have a draft, but at this point I’m afraid it won’t be useful.

comment:8 in reply to: ↑ 5 ; follow-up: ↓ 11 Changed 4 months ago by Marcel Schneider <charupdate@…>

Replying to mark:

If you want to submit a patch

I’m sorry not to be familiar with submitting pull requests (to SVN). Setting up the environment would take me a lot of time and I’d prefer using the deadline of July 10 for submitting more data via ST.

http://cldr.unicode.org/development/new-cldr-developers

comment:9 follow-up: ↓ 10 Changed 4 months ago by Marcel Schneider <charupdate@…>

After signing the CLA, I need to post the snippet again.

/**
 * Check if we need to display LRM/RLM marker, or to disambiguate NBSP/NNBSP
 * @param field: choice field to append if needed
 * @param dir:   direction of current locale (control float direction)
 * @param value: the value of votes (check &lrm; &rlm;)
 */
function checkLRmarker(field, dir, value){
        if (value) {
                if ( value.indexOf("\u200E") > -1 ||  value.indexOf("\u200F") > -1 ||  value.indexOf("\u00A0") > -1 ||  value.indexOf("\u202F") > -1 ) {
                        value = value.replace(/\u200E/g, "<span class=\"visible-mark\">&lt;LRM&gt;</span>")
                                     .replace(/\u200F/g, "<span class=\"visible-mark\">&lt;RLM&gt;</span>")
                                     .replace(/\u00A0/g, "<span class=\"visible-mark\">&lt;NBSP&gt;</span>")
                                     .replace(/\u202F/g, "<span class=\"visible-mark\">&lt;NNBSP&gt;</span>");
                        var lrm = document.createElement("div");
                        lrm.className = "lrmarker-container";
                        lrm.innerHTML = value;
                        field.appendChild(lrm);
                }
        }
}

comment:10 in reply to: ↑ 9 ; follow-up: ↓ 12 Changed 4 months ago by srl

Replying to Marcel Schneider <charupdate@…>:

After signing the CLA, I need to post the snippet again.

Can you mention your github ID here so we can find the right CLA? thanks!

comment:11 in reply to: ↑ 8 ; follow-up: ↓ 13 Changed 4 months ago by srl

Replying to Marcel Schneider <charupdate@…>:

Replying to mark:

If you want to submit a patch

I’m sorry not to be familiar with submitting pull requests (to SVN)

No, there's no pull request possible via SVN. Attaching the file (as you did) is the right path.

comment:12 in reply to: ↑ 10 Changed 4 months ago by Marcel Schneider <charupdate@…>

Replying to srl:

Can you mention your github ID here so we can find the right CLA? thanks!

My GitHub ID is "dispoclavier". On my dispoclavier account you’ll find a "charupdate" repo, but "charupdate" is outdated and needs to be discarded (too pretentious). Propositions for English will then be at "charakoe".
Sorry for not mentioning my ID sooner.

comment:13 in reply to: ↑ 11 Changed 4 months ago by Marcel Schneider <charupdate@…>

Replying to srl:

Attaching the file (as you did) is the right path.

I prefer that, as I don’t use cloud repositories for code. And I confess I’m unable to test this js code either.
But that doesn’t prevent extending it, and now I’d suggest including more confusables: apostrophes and hyphens.
This snippet supersedes the previous one:

/**
 * Check if we need to display LRM/RLM marker, or to disambiguate confusables.
 * @param field: choice field to append if needed;
 * @param dir:   direction of current locale (control float direction);
 * @param value: the value of votes (check &lrm; &rlm;).
 */
function checkLRmarker(field, dir, value){
        if (value) {
                if (  value.indexOf("\u200E") > -1
                   || value.indexOf("\u200F") > -1
                   || value.indexOf("\u00A0") > -1
                   || value.indexOf("\u202F") > -1
                   || value.indexOf("\u02BB") > -1
                   || value.indexOf("\u02BC") > -1
                   || value.indexOf("\u02BD") > -1
                   || value.indexOf("\u2010") > -1
                   || value.indexOf("\u2011") > -1
                   || value.indexOf("\u2212") > -1
                   ) {
                        // One statement: only the last .replace() ends with a semicolon.
                        value = value.replace(/\u200E/g, "<span class=\"visible-mark\">&lt;LRM&gt;</span>")
                                     .replace(/\u200F/g, "<span class=\"visible-mark\">&lt;RLM&gt;</span>")
                                     .replace(/\u00A0/g, "<span class=\"visible-mark\">&lt;NBSP&gt;</span>")
                                     .replace(/\u202F/g, "<span class=\"visible-mark\">&lt;NNBSP&gt;</span>")
                                     .replace(/\u02BB/g, "<span class=\"visible-mark\">&lt;02BB&gt;</span>")
                                     .replace(/\u02BC/g, "<span class=\"visible-mark\">&lt;02BC&gt;</span>")
                                     .replace(/\u02BD/g, "<span class=\"visible-mark\">&lt;02BD&gt;</span>")
                                     .replace(/\u2010/g, "<span class=\"visible-mark\">&lt;HY&gt;</span>")
                                     .replace(/\u2011/g, "<span class=\"visible-mark\">&lt;NBHY&gt;</span>")
                                     .replace(/\u2212/g, "<span class=\"visible-mark\">&lt;MINUS&gt;</span>");
                        var lrm = document.createElement("div");
                        lrm.className = "lrmarker-container";
                        lrm.innerHTML = value;
                        field.appendChild(lrm);
                }
        }
}

Comments:

  • The non-breakable hyphen is mandatory in names like "St‑Michel", in case any such orthographies occur in CLDR.
  • The hyphen U+2010 is an encoding error. As far as I can see, this character is not used by industry, the hyphen glyph is normally mapped to U+002D (and even the character, as shown in confusables.txt), but depending on input methods it is a risk for CLDR data, and its accidental presence must trigger a warning.
  • The minus sign is included so that its presence can be verified, even when font family and size make it confusable with U+002D.
  • The letter apostrophes because of their confusability with quotation marks, as they are mandatory in many names in CLDR.
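
The detection chain and the replacement chain can be kept in sync by driving both from a single table. The following is only a sketch under assumptions: `MARKS`, `MARK_PATTERN`, `escapeHtml` and `markConfusables` are hypothetical names, the function returns an HTML string instead of touching the DOM, and it escapes the value before adding markup, which the snippet above does not do.

```javascript
// Hypothetical table: one entry per character to flag, with its label.
var MARKS = {
    "\u200E": "LRM",    "\u200F": "RLM",
    "\u00A0": "NBSP",   "\u202F": "NNBSP",
    "\u02BB": "U+02BB", "\u02BC": "U+02BC", "\u02BD": "U+02BD",
    "\u2010": "HY",     "\u2011": "NBHY",   "\u2212": "MINUS"
};

// Character class built once from the table keys.
var MARK_PATTERN = new RegExp("[" + Object.keys(MARKS).join("") + "]", "g");

// Escape the characters that are special in HTML text content.
function escapeHtml(text) {
    return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Return marked-up HTML, or null when the value contains nothing to flag.
function markConfusables(value) {
    if (!value || value.search(MARK_PATTERN) === -1) {
        return null;
    }
    return escapeHtml(value).replace(MARK_PATTERN, function (ch) {
        return "<span class=\"visible-mark\">&lt;" + MARKS[ch] + "&gt;</span>";
    });
}
```

In checkLRmarker, a non-null result could then be assigned to `lrm.innerHTML` in place of the chained replacements; adding a character would mean adding one table entry.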

comment:14 Changed 4 months ago by Marcel Schneider <charupdate@…>

Edit: The last point is not clear:

  • The letter apostrophes because of their confusability with quotation marks. U+02BB MODIFIER LETTER TURNED COMMA, U+02BC MODIFIER LETTER APOSTROPHE, or U+02BD MODIFIER LETTER REVERSED COMMA, are mandatory in many names in CLDR.

If there is still a problem, please let us know. You can also write up your own code and edit survey.js on the fly. Anyhow that should get fixed before the other submitters start voting, so that they see what they vote for.

comment:15 Changed 4 months ago by Marcel Schneider <charupdate@…>

Display of letter apostrophes

In an attempt to propose compact display, I’ve omitted "U+". Prefixed:

                                     .replace(/\u02BB/g, "<span class=\"visible-mark\">&lt;U+02BB&gt;</span>")
                                     .replace(/\u02BC/g, "<span class=\"visible-mark\">&lt;U+02BC&gt;</span>")
                                     .replace(/\u02BD/g, "<span class=\"visible-mark\">&lt;U+02BD&gt;</span>")

One could also invent short names like what is done for directional marks:

                                     .replace(/\u02BB/g, "<span class=\"visible-mark\">&lt;TLA&gt;</span>")
                                     .replace(/\u02BC/g, "<span class=\"visible-mark\">&lt;LA&gt;</span>")
                                     .replace(/\u02BD/g, "<span class=\"visible-mark\">&lt;RLA&gt;</span>")

That however grows increasingly confusing. Perhaps "M" for MODIFIER should be the first letter:

                                     .replace(/\u02BB/g, "<span class=\"visible-mark\">&lt;MLTC&gt;</span>")
                                     .replace(/\u02BC/g, "<span class=\"visible-mark\">&lt;MLA&gt;</span>")
                                     .replace(/\u02BD/g, "<span class=\"visible-mark\">&lt;MLRC&gt;</span>")

But since all these abbreviations seem to be non-standard, sticking with code points would be safest. I’d thought likewise about HYPHEN, but having "SHY" but neither "HY" nor "NBHY" must be due to an omission, or I’m unaware of those abbreviations.

comment:16 Changed 4 months ago by Marcel Schneider <charupdate@…>

Acknowledgement

All the above javascript code is based on an already present code section in survey.js.

comment:17 Changed 4 months ago by Marcel Schneider <charupdate@…>

Does anybody object to making changes to survey.js?

comment:18 Changed 4 months ago by Marcel Schneider <charupdate@…>

Pleading for high priority

There is no opposition, the problem is just that we need to make a case for high priority.

This particular feature of SURVEY TOOL is needed for submitters of French data to check whether the correct non-breaking whitespace is present in the data they will now vote for in a vetting cycle that is to run from July 11 (tomorrow Pacific Time) until July 24, 2018.

French uses NNBSP with a number of punctuation marks, and this needs to be updated in the patterns. Furthermore, NNBSP is used between a number and the measurement unit, and this would be mandatory in English too if practice had been updated in the Unicode era (but sadly it scarcely has been). French professionals are ahead because they are aware of non-breaking whitespaces in Unicode due to their typesetting of spaced punctuation. Due to this strong craftsmanship, French typesetters and Computer Aided Publishing specialists know about U+202F and about the difference between a justifying no-break space (U+00A0) and a non-justifying narrow one, the latter being U+202F NARROW NO-BREAK SPACE.

All submitters must be able to check readily which space is present, and a great deal of data needs to be corrected right now. That brings the need to display these characters, along with some others like letter apostrophes, in order to avoid time-consuming use of the browser console with "$0.innerText.charCodeAt(index).toString(16)".
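
For reference, the console check quoted above inspects one index at a time; a small helper could list every code point of a value at once. This is a sketch only, and `listCodePoints` is a hypothetical name, not part of SurveyTool.

```javascript
// Return the code points of a string as "U+XXXX" labels, in order.
// Array.from iterates by code point, so surrogate pairs stay together.
function listCodePoints(text) {
    return Array.from(text, function (ch) {
        var hex = ch.codePointAt(0).toString(16).toUpperCase();
        return "U+" + hex.padStart(4, "0");
    });
}
```

Pasted into the browser console, it could be called as `listCodePoints($0.innerText).join(" ")` on the selected element, which still remains a workaround compared with markup shown in the tool itself.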

Please be aware that there is a huge mass of data to vote for one by one with hundreds of whitespace-containing items to review.

Please be so kind as to put the above or below code into survey.js in place of the function already present.

Thanks a lot in advance.

comment:19 Changed 4 months ago by Marcel Schneider <charupdate@…>

In the same vein we could add support for more whitespaces:

/**
 * Check if we need to display LRM/RLM marker, or to disambiguate confusables.
 * @param field: choice field to append if needed;
 * @param dir:   direction of current locale (control float direction);
 * @param value: the value of votes (check &lrm; &rlm;).
 */
function checkLRmarker(field, dir, value){
        if (value) {
                if (  value.indexOf("\u200E") > -1
                   || value.indexOf("\u200F") > -1
                   || value.indexOf("\u00A0") > -1
                   || value.indexOf("\u202F") > -1
                   || value.indexOf("\u02BB") > -1
                   || value.indexOf("\u02BC") > -1
                   || value.indexOf("\u02BD") > -1
                   || value.indexOf("\u2010") > -1
                   || value.indexOf("\u2011") > -1
                   || value.indexOf("\u2212") > -1
                   || value.indexOf("\u2002") > -1
                   || value.indexOf("\u2003") > -1
                   || value.indexOf("\u2004") > -1
                   || value.indexOf("\u2005") > -1
                   || value.indexOf("\u2006") > -1
                   || value.indexOf("\u2007") > -1
                   || value.indexOf("\u2008") > -1
                   || value.indexOf("\u2009") > -1
                   || value.indexOf("\u200A") > -1
                   ) {
                        // One statement: only the last .replace() ends with a semicolon.
                        value = value.replace(/\u200E/g, "<span class=\"visible-mark\">&lt;LRM&gt;</span>")
                                     .replace(/\u200F/g, "<span class=\"visible-mark\">&lt;RLM&gt;</span>")
                                     .replace(/\u00A0/g, "<span class=\"visible-mark\">&lt;NBSP&gt;</span>")
                                     .replace(/\u202F/g, "<span class=\"visible-mark\">&lt;NNBSP&gt;</span>")
                                     .replace(/\u02BB/g, "<span class=\"visible-mark\">&lt;U+02BB&gt;</span>")
                                     .replace(/\u02BC/g, "<span class=\"visible-mark\">&lt;U+02BC&gt;</span>")
                                     .replace(/\u02BD/g, "<span class=\"visible-mark\">&lt;U+02BD&gt;</span>")
                                     .replace(/\u2010/g, "<span class=\"visible-mark\">&lt;HY&gt;</span>")
                                     .replace(/\u2011/g, "<span class=\"visible-mark\">&lt;NBHY&gt;</span>")
                                     .replace(/\u2212/g, "<span class=\"visible-mark\">&lt;MINUS&gt;</span>")
                                     .replace(/\u2002/g, "<span class=\"visible-mark\">&lt;ENSP&gt;</span>")
                                     .replace(/\u2003/g, "<span class=\"visible-mark\">&lt;EMSP&gt;</span>")
                                     .replace(/\u2004/g, "<span class=\"visible-mark\">&lt;3/MSP&gt;</span>")
                                     .replace(/\u2005/g, "<span class=\"visible-mark\">&lt;4/MSP&gt;</span>")
                                     .replace(/\u2006/g, "<span class=\"visible-mark\">&lt;6/MSP&gt;</span>")
                                     .replace(/\u2007/g, "<span class=\"visible-mark\">&lt;FSP&gt;</span>")
                                     .replace(/\u2008/g, "<span class=\"visible-mark\">&lt;PSP&gt;</span>")
                                     .replace(/\u2009/g, "<span class=\"visible-mark\">&lt;THSP&gt;</span>")
                                     .replace(/\u200A/g, "<span class=\"visible-mark\">&lt;HSP&gt;</span>");
                        var lrm = document.createElement("div");
                        lrm.className = "lrmarker-container";
                        lrm.innerHTML = value;
                        field.appendChild(lrm);
                }
        }
}

comment:20 follow-up: ↓ 21 Changed 4 months ago by mark

  • Owner changed from anybody to Future
  • Status changed from new to accepted
  • Milestone changed from UNSCH to upcoming

comment:21 in reply to: ↑ 20 Changed 4 months ago by Marcel Schneider <charupdate@…>

Replying to mark:

  • Owner changed from anybody to Future

[…]

  • Milestone changed from UNSCH to upcoming

When deprioritizing this easy-to-implement feature, please consider also the discussion (of yet uncertain result) in ticket:11255.

comment:22 Changed 4 months ago by Marcel Schneider <charupdate@…>

Replying to Marcel Schneider <charupdate@…>:

Pleading for high priority

There is no opposition, the problem is just that we need to make a case for high priority.

That is completely wrong. I wrote that after having been informed in response to private e-mail that it is not a question of opposing, just of prioritizing the feature and getting a ticket owner.

The reality is that a CLDR TC chair has accepted space display as a good idea and asked for a patch, and when the patch was ready to use in survey.js, dismissed the feature to an undefined future (ticket:11235#comment:20).

He could not have changed his mind on his own behalf, since that is incompatible with high intelligence. The feature was recognized as desirable, and I was given the opportunity to propose the way (or several ways) to display the problematic confusables, and care was taken to ask me to sign the CLA. And when all was ready to copy-paste into survey.js, following through was prohibited.

So let me speculate, with apologies to Doug Ewell, who has urged me repeatedly not to see malfeasance. Due to its skill-demanding punctuation typesetting scheme, the French language is taken hostage by a conspiracy of certain vendors, led presumably by Adobe. They lobby to unsupport any narrow form of a no-break space in plain-text-based representations of French text, so that they can keep monopolizing this as a non-Unicode-encoded typographic custom feature in high-end software, especially InDesign. That would then be why U+2008 PUNCTUATION SPACE has been prevented (hindered, discarded) from being defined as non-breaking in Unicode, unlike U+2007 FIGURE SPACE used in exactly the same environment with nearly similar semantics (holding the place of non-present digits instead of that of the non-present decimal separator in old-style table layout, according to TUS).

By doing that, there was nothing in Unicode to fill the gap in a straightforward digital representation of French, until a decade later Unicode encoded U+202F NARROW NO-BREAK SPACE for Mongolian in 1999. This character was then ripped off by the graphic industry (other than the above-mentioned) to fill that gap, compromising the plan of the conspiracy. This would then be why by any means NNBSP is marginalized as special and academic (a fate it shares with NBSP), excluded from mainstream keyboards (however its usage in French is mentioned in TUS only since 2014), and deliberately deprived of any automated support in word processors (https://wiki.openoffice.org/wiki/Non_Breaking_Spaces_Before_Punctuation_In_French_(espaces_ins%C3%A9cables)#Exclusion_of_the_NARROW_NO-BREAK_SPACE_.28U.2B202F.29 makes several wrong statements we’d need to correct).

CLDR presumably received advice or pressions from the industry not to support NNBSP, hence the swing noted above; see also item http://st.unicode.org/cldr-apps/v#/fr/Locale_Name_Patterns/38d08336cae10c4e unanimously voted for NNBSP, but three subsequent items still not, despite forum thread http://st.unicode.org/cldr-apps/v#forum/fr//27364. Due to this policy of not reflecting typesetting standards of supported locales even when referenced in TUS, CLDR contains a part of fake data.

comment:23 Changed 4 months ago by Marcel Schneider <charupdate@…>

Edit:

CLDR presumably received advice or pressions from the industry not to support NNBSP

CLDR presumably received advice, or is subjected to pressure, from the industry directing it not to support NNBSP

[…] CLDR contains a part of fake data.

[…] part of the data contained in CLDR, or hosted by CLDR, is fake data. / CLDR partly contains fake data.

Note: I’ll be submitting suggestions to clearly flag that data as such, unless the industry will allow (or is allowing) CLDR to rectify that data.

comment:24 Changed 4 months ago by Marcel Schneider <charupdate@…>

User demand

There is a real demand on user side for having ST display sensitive space characters in a distinguishable way.
See French ST forum thread about "Numbers | Symbols | group | group" starting at "[v34] 2018-07-05 14:32" and ending with the wish that in the future, a more user-friendly way of disambiguating spaces will be available than using "$0.innerText.charCodeAt(index).toString(16)" in the browser console.

comment:25 follow-up: ↓ 26 Changed 3 months ago by Marcel Schneider <charupdate@…>

Although the patch above would have saved much trouble in ST 34 if it had been copy-pasted into survey.js in time, the most straightforward solution for whitespace disambiguation seems to be a text-editor-like feature “Show White Space” in Gedit manner where U+00A0 is represented with U+25BE, and U+202F with U+25BF, while ordinary space is middle dot as usual. That markup is in a different color of course.

SurveyTool should then have a [Show Whitespace Markup] / [Hide Whitespace Markup] button in the topbar.

Note, however, that the apostrophe / letter apostrophe disambiguation is not covered by that scheme. As a fallback, letter apostrophes could be colored while whitespace is shown.

I’m not able to deliver the patch. Doing much coding for just one locale doesn’t seem worthwhile, so please consider at least copy-pasting the above for a ready fix prior to ST 35.

comment:26 in reply to: ↑ 25 Changed 3 months ago by Marcel Schneider <charupdate@…>

Correcting comment 25:

a text-editor-like feature “Show White Space” in Gedit manner where U+00A0 is represented with U+25BE, and U+202F with U+25BF

U+00A0 ➔ U+25BF WHITE DOWN-POINTING SMALL TRIANGLE
U+202F ➔ U+25BE BLACK DOWN-POINTING SMALL TRIANGLE
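
The Gedit-style mapping, as corrected here, could be sketched as a pure function that a [Show Whitespace Markup] / [Hide Whitespace Markup] toggle would call. All names (`SPACE_SYMBOLS`, `drawSpaces`) are hypothetical, the middle dot for the ordinary space follows the description in comment 25, and the separate color for the markup is left out of this sketch.

```javascript
// Substitute symbols per the corrected mapping.
var SPACE_SYMBOLS = {
    " ": "\u00B7",      // ordinary space -> MIDDLE DOT
    "\u00A0": "\u25BF", // NBSP  -> WHITE DOWN-POINTING SMALL TRIANGLE
    "\u202F": "\u25BE"  // NNBSP -> BLACK DOWN-POINTING SMALL TRIANGLE
};

// Return the text with spaces drawn as symbols when markup is enabled,
// or unchanged when it is disabled.
function drawSpaces(text, showMarkup) {
    if (!showMarkup) {
        return text;
    }
    return text.replace(/[ \u00A0\u202F]/g, function (ch) {
        return SPACE_SYMBOLS[ch];
    });
}
```

Because the function does not modify the stored value, toggling the button would only change what is displayed, never what is voted on.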

comment:27 follow-up: ↓ 28 Changed 3 months ago by Marcel Schneider <charupdate@…>

Credit

The cited feature in Gedit is brought to us by the Draw Spaces plugin authored by Paolo Borelli, Steve Frécinaux, Ignacio Casal Quinteiro, Jim Campbell:

https://github.com/GNOME/gedit-plugins/blob/master/plugins/drawspaces/
https://github.com/GNOME/gedit-plugins/blob/master/help/C/draw-spaces.page

The Draw Spaces plugin is able to show only non-breakable spaces, out of a variety of whitespace characters. That might be the desirable behavior for SurveyTool.

comment:28 in reply to: ↑ 27 Changed 3 months ago by Marcel Schneider <charupdate@…>

Completing comment 27:

Credit

Igor Gnatenko has also contributed.

Conclusion

Finally, SurveyTool should become able to draw spaces, which brings us back to Philippe Verdy’s initial suggestion.

Additionally, ST should highlight U+02BC MODIFIER LETTER APOSTROPHE, and U+02BB and U+02BD as well.

The coding effort is higher when porting the Gedit plugin (under GNU GPL 2.0), so using the above patch would be more cost-effective, though heavier in UI as it adds a line and displays letter codes.

The following patch focuses on both non-breakable spaces and letter apostrophes, and uses triangles and highlighting. The letter apostrophes are simply replicated, and recognizable through the "visible mark" background color, which suffices to make aware that they are not punctuation marks.

/**
 * Check if we need to display LRM/RLM marker,
 * a non-breakable space or a letter apostrophe.
 * @param field: choice field to append if needed;
 * @param dir:   direction of current locale (control float direction);
 * @param value: the value of votes (check &lrm; &rlm;).
 */
function checkLRmarker(field, dir, value){
        if (value) {
                if (  value.indexOf("\u200E") > -1
                   || value.indexOf("\u200F") > -1
                   || value.indexOf("\u00A0") > -1
                   || value.indexOf("\u202F") > -1
                   || value.indexOf("\u02BB") > -1
                   || value.indexOf("\u02BC") > -1
                   || value.indexOf("\u02BD") > -1
                   ) {
                        // One statement: only the last .replace() ends with a semicolon.
                        value = value.replace(/\u200E/g, "<span class=\"visible-mark\">&lt;LRM&gt;</span>")
                                     .replace(/\u200F/g, "<span class=\"visible-mark\">&lt;RLM&gt;</span>")
                                     .replace(/\u00A0/g, "<span class=\"visible-mark\">\u25BF</span>")
                                     .replace(/\u202F/g, "<span class=\"visible-mark\">\u25BE</span>")
                                     .replace(/\u02BB/g, "<span class=\"visible-mark\">\u02BB</span>")
                                     .replace(/\u02BC/g, "<span class=\"visible-mark\">\u02BC</span>")
                                     .replace(/\u02BD/g, "<span class=\"visible-mark\">\u02BD</span>");
                        var lrm = document.createElement("div");
                        lrm.className = "lrmarker-container";
                        lrm.innerHTML = value;
                        field.appendChild(lrm);
                }
        }
}

comment:29 Changed 2 months ago by Marcel Schneider <charupdate@…>

Xref

Please see Philippe Verdy’s comment on ticket:11423#comment:3.

comment:30 Changed 4 weeks ago by pedberg

  • Milestone changed from upcoming to UNSCH

CLDR 34 BRS closing item, move all upcoming → UNSCH

comment:31 Changed 32 hours ago by mark

  • Owner changed from Future to anybody