
CLDR Ticket #11377(new)

Opened 3 months ago

Last modified 4 days ago

CLDR as a way to correct anti-preformatted-superscript policies; reminder for foundries

Reported by: Marcel Schneider <charupdate@…>
Owned by: anybody
Component: other
Data Locale:
Phase: dsub
Review:
Weeks:
Data Xpath:


Unicode has long discouraged preformatted superscript letters in Latin script, although they are intrinsic to it, having been in widespread use since the Middle Ages. Commercially biased anti-preformatted-superscript policies benefit from a cognate tendency, fueled by a century of typewriting, to deprive (or self-deprive) languages like French of an accurate, interoperable digital representation. While Italian, Portuguese and Spanish were granted a pair of ordinal indicators in Latin-1, French deprived itself even of the letter Œœ by deliberately lobbying against its inclusion, and the gaps were filled with the multiplication and division signs (U+00D7, U+00F7). While that was the action of an industrial standardization manager, our supreme language authority can be identified as the main culprit: it did not give Œœ and Ææ letter status, as Denmark did for Ææ, and it ended up exacerbating terminological confusion by validating an administrative guideline that specifies Ææ and Œœ as equivalent to AE ae and OE oe, while every grammarian, and that very body elsewhere, states otherwise. No wonder that giving ᵉ, ʳ, ᵈ and ˢ the status of ordinal indicators was likewise neglected, making it easy in the aftermath to (deliberately) confuse them with a purely typographic ornament.

That is now to be corrected, as it is no longer decent to leave Unicode partly unusable for French and English, and for a small number of other Latin-script languages that carry on the medieval tradition of abbreviating words by appending superscript end-letters. Specifying the use of higher-level formatting, which is in reality a mere workaround, is somewhat like specifying (as was once considered) that Ææ and Œœ should be generated by fine-tuning inter-letter spacing, or by glyph substitution in what would become the OpenType technology, best triggered (for that purpose) by an application control rather than by ZWJ.

Historically, the ransom-note effect was produced when different typefaces and font sizes were mixed on a per-letter basis, in that sort of criminal abuse of a typewriter, or emulation of DTP ahead of its time, and Apple’s former San Francisco font made it a piece of art. Its present-day avatar discussed here is a deliberate and fully Unicode-non-conformant plot to make superscripts useless in business correspondence and to shoo off unwary end-users. Such fonts are found in mainstream webmail UIs, with Gmail setting the example, while the default font in Microsoft Word (Calibri) has perfectly even preformatted superscripts (see the test run in http://www.unicode.org/mail-arch/unicode-ml/y2017-m01/0093.html ).

Unlike *new* currency symbols, Latin superscripts have a long history in Unicode, so where they are still unsupported, that is probably the effect of a conspiracy that must be broken up. Note that phoneticians do not appreciate either that those letters are uneven in size and vertical alignment. There is no use case in which any of them should fall out of range in size or alignment.

That is why CLDR should now weigh in to counter the adverse effects of almost three decades of biased policies with respect to Latin superscripts, an unprecedented oddity that has no equivalent in any other script.


Change History

comment:1 Changed 3 months ago by Marcel Schneider <charupdate@…>

Focus on Latin superscript small o

While the ordinal indicators have thankfully been admitted to the Others:numbers set of characters in use, the ᵒ of the “numéro” abbreviation “nᵒ” has not yet been. It is currently emulated using the degree sign, shown here next to superscript small o: °ᵒ. The degree sign is usually smaller than the preformatted Latin letter. The risk of a ransom-note effect is mitigated by its being the only one of its sort in this use case, except in rare contexts where the abbreviation takes the plural form “nᵒˢ”, which seldom occurs because when the word is plural, it often isn’t abbreviated anyway.
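
For readers who want to tell these look-alike characters apart, here is a minimal Python sketch using only the standard `unicodedata` module. The helper name `describe` and the choice of characters are illustrative assumptions, not part of any CLDR data or tooling. It prints the code point, the Unicode character name, and the NFKC compatibility form of the degree sign, the superscript small o, and the masculine ordinal indicator from Latin-1.

```python
import unicodedata

# Characters discussed above, written as escapes so the source stays unambiguous:
# the degree sign, the superscript (modifier letter) small o, and the
# masculine ordinal indicator from Latin-1.
CANDIDATES = ["\u00b0", "\u1d52", "\u00ba"]


def describe(ch):
    """Return (code point, Unicode name, NFKC form) for one character.

    `describe` is a hypothetical helper written for this illustration.
    """
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.normalize("NFKC", ch),
    )


for ch in CANDIDATES:
    print(describe(ch))
```

The NFKC column shows that the superscript small o and the ordinal indicator both fold to a plain “o” under compatibility normalization, whereas the degree sign has no such mapping, so text that emulates “nᵒ” with a degree sign is no longer recognizable as the letter o.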

comment:2 Changed 3 weeks ago by mark

  • Component changed from main to other

comment:3 Changed 4 days ago by mark

  • Milestone changed from UNSCH to to-assess
