Iʼm aware that this thread is getting lengthy and (presumably) tiresome.
Therefore, I would not normally have sent this to the List today;
I really wanted to take a break and come back later.
However, given the consequences of the outcome of this issue for millions
of end-users, and the imminence of the French keyboard standardization over these months,
I appreciate being given the opportunity to keep discussing on-list.
***Disclaimer*** Iʼm not a part of the French keyboard standard WG, and
Iʼm speaking on my own behalf, out of civic responsibility.
On Sun, 15 Jan 2017 10:15:33 -0800, Asmus Freytag wrote:
>
[Quoted mail]
>
> Contrary to your assertion about fonts elsewhere, the poor rendering of
> subscripts/superscripts that I reported to you is not based on the fact that
> the characters are missing, but that the glyphs are not laid out as
> running text.
To date, as far as I know, the only domain where superscripts and subscripts are
mandatory in general text is that of abbreviations of numerals, titles, entities,
measurement units, chemical compounds and so on, using Western Arabic digits and
Latin superscript lowercase. Iʼm quite sure that no other script has this
typographical convention, which is part of an old discipline called
“orthotypography.” While I was wrong to conflate it with orthography, the outstanding
importance of these rules for the unambiguous representation of text calls for special
treatment in practice and in the Unicode Standard.
In these ranges, one character is still missing because the UTC has refused to
encode *LATIN SUPERSCRIPT SMALL LETTER Q, aka *MODIFIER LETTER SMALL Q.
This has little impact on general practice.
The main challenge outside Unicode is the availability of the related glyphs in
current fonts, as well as their consistency. To date, almost all webmail services offer
only fonts where these glyphs are designed in an intentionally inconsistent way, supposedly
to make them unusable for accurate display: the 'ⁱ' is always far too high, and
the 'ⁿ' is too bold and has a random vertical alignment. In my opinion, the legacy
status of these two is used as a fake explanation; compare with the inconsistent
design of '⁶' and '⁹' in some fonts, along with that of '⁰', where there is no
excuse of “legacy,” unlike for '¹', '²' and '³', where “legacy” is equally abused
to mess up the typefaces. This applies to most other fonts as well. The only
correct font family Iʼve found so far is Calibri. Fittingly, this is the body
font in the default template of Microsoft Word.
>
> When viewing with monospaced fonts, the separation between glyphs
> corresponds to the spacing of the full-size characters. When using
> formatting (styling) the superscripted text is in a smaller font size,
> reducing the spacing between characters, so that strings of them look
> like ordinary text again and not s p a c e d o u t.
I face this issue when writing drafts in my text editor, where, however, I can
set the font to any value, including Calibri. Displaying this in Calibri makes it
possible to appreciate the consistent, running-text-like display of the superscripts:
// This is ᵒʳᵈⁱⁿᵃʳʸ ᵗᵉˣᵗ ˢᵉᵗ ⁱⁿ ᵁⁿⁱᶜᵒᵈᵉ ᴸᵃᵗⁱⁿ ˢᵘᵖᵉʳˢᶜʳⁱᵖᵗ ˢᵐᵃˡˡ ˡᵉᵗᵗᵉʳˢ ᵃⁿᵈ ᵗʷᵒ ᶜᵃᵖⁱᵗᵃˡ ˡᵉᵗᵗᵉʳˢ
// This is the range: ᵃᵇᶜᵈᵉᶠᵍʰⁱʲᵏˡᵐⁿᵒᵖ ^q_unavailableʳˢᵗᵘᵛʷˣʸᶻ¹²³⁴⁵⁶⁷⁸⁹⁰₁₂₃₄₅₆₇₈₉₀
This is how a complete and Unicode-conformant typeface is supposed to work.
In practice, this turns out to be implemented far, far more widely than U+2044.
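For readers who want to check what the Standard itself records about these characters, here is a minimal Python sketch. It is my own illustration, not something from the UCD documentation, and it assumes only the standard `unicodedata` module; the sample string and the helper name `describe` are arbitrary choices. It prints the compatibility decomposition of a few superscripts and subscripts, and what NFKC normalization folds them to.

```python
import unicodedata

# A small, arbitrary sample taken from the range shown above.
SAMPLE = "ᵉⁱⁿ¹₁"

def describe(ch):
    """Return (code point, compatibility decomposition, NFKC folding) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.decomposition(ch),
        unicodedata.normalize("NFKC", ch),
    )

for ch in SAMPLE:
    print(*describe(ch))
```

The `<super>` and `<sub>` tags are the same ones that appear on the `#` lines of the NamesList snippets at the end of this mail. The exact output depends on the Unicode version bundled with the Python build in use.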
>
> I'm not going to spend much more time on this discussion.
When I launched this discussion on December 28, 2016, I naively believed that
this time, the matter would be quickly settled, and that I could go back to being more
productive in developing the keyboard layouts and documenting them.
Now that this thread still hasnʼt come to an even halfway useful result,
I need to make one more attempt.
The goal is to get Unicode to accept the fact that people use superscript letters
in French, and superscripts and subscripts in vulgar fractions, and have them on their
keyboards, and that these people be considered not as hackers, but as making
a reasonable, thoughtful and responsible use of the Standard.
That is not a matter of “value inversion,” but of correcting a particular
design principle that was misguided and biased under a (hypothetically) strong
influence of *extrinsic* factors from the very beginning (see point 3 below).
Itʼs good to know about the counter-arguments that may be raised, so Iʼm
grateful to all who were so kind as to respond. What bothers me is that there is
still so much persistent opposition; and what makes me fear the worst is that
the arguments raised against the general use of preformatted characters are
so biased and fallacious, unlike any ordinary reasoning:
1) Missing font support as an argument against the use of a character has never,
never been the way Unicode worked, so far as Iʼve had the opportunity
to understand Unicode until now.
2) This missing font support is mostly a consequence of the Unicode strategy on
these characters: by discouraging their use and even intentionally misnaming them
in an inconsistent manner (from an overall point of view), Unicode drove
a significant part of the font designers away from adding them completely and
with a consistent design, and from implementing combining marks support for
these characters.
3) This strategy has been biased from the beginning, as it goes against the user
preferences of countries using the Latin script, while AFAIK all countries
using other scripts are unconcerned because they actually donʼt *use*
superscripting in such an *extensive* way. Please correct me if Iʼm wrong.
Consequently, there would be *nobody* asking for more (except the already
discussed completion of some ranges of Latin script). This strategy of shooing
users (and their developers) away from using preformatted letters and digits
seems to aim at nothing more serious than supporting software vendorsʼ marketing
strategies, even though the software does not need marketing based on poor
character support (and on poor keyboard layouts).
> Using code points
> "against the grain" that is, in contradiction to the way their use was
> intended when they were encoded means that you are going to run into many
> issues based on font vendors and implementers expectations on how users
> would follow the conventions suggested in the Unicode Standard.
Iʼve got the news that Edge still doesnʼt make OpenType fonts work for
U+2044 either. One might wonder, however, whether users should conform to the Standard
literally when even Microsoft doesnʼt. Iʼm not here to post feature requests to
the attention of Microsoft any longer. My actual suggestions are perhaps a bit
more complex than that. I just wish that the Unicode policy with respect to superscripts
would become more user-centered, more user-friendly.
The core issue is the use of these letters in running text in some languages
that need them to apply a typographic convention that is close to orthography.
Superscripting is a far, far stronger requirement than all other formatting
conventions, as it can affect the spelling of the grammatical entity.
Weʼre facing strong demands on the user side, relayed by standards bodies from the
early times on, when ordinal indicators were first encoded as a part of Latin-1.
Today most users still type a degree sign to emulate a superscript o, and the
French NB (that Iʼm not a part of, nor am I a member of the keyboard standard WG)
wishes an ordinal indicator on the keyboard to represent the most common ordinal
indicator in French: "ᵉ".
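To make the degree-sign workaround concrete, here is a small Python sketch. Again it is only an illustration under my own assumptions: it uses the standard `unicodedata` module, and the list `CANDIDATES` and the helper `letter_identity` are names I made up. It compares the degree sign with the Latin-1 masculine ordinal indicator and with the superscript letters o and e, and shows which of them fold back to a plain letter under NFKC.

```python
import unicodedata

# Characters a typist might reach for when writing an ordinal or a "numéro" abbreviation.
CANDIDATES = ["\u00B0", "\u00BA", "\u1D52", "\u1D49"]

def letter_identity(ch):
    """Return the plain letter that NFKC folds ch to, or None if it does not fold to a letter."""
    folded = unicodedata.normalize("NFKC", ch)
    return folded if folded.isalpha() and folded != ch else None

for ch in CANDIDATES:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), "->", letter_identity(ch))
```

In this sketch the degree sign is the only candidate that carries no letter identity at all, which is one way to see why it is a poor stand-in for a superscript o.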
>
> Your discussion of the support of the fraction slash (with regular digits)
> across fonts is potentially more useful -- bringing attention to this issue
> could bring font vendors to perhaps update earlier fonts to support the
> correct conventions for 2044 (which incidentally post dated the design of
> many popular fonts).
This is relatively important, but it is far outweighed by the ordinal indicator
issue, and along with it, the need to stabilize superscript abbreviations.
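For comparison, the two ways of writing a vulgar fraction in plain text can be sketched as follows. This is a minimal illustration in Python with function names of my own choosing: the first form uses regular digits around U+2044 and relies on the font to shape them, the second uses the preformatted superscript and subscript digits.

```python
import unicodedata

FRACTION_SLASH = "\u2044"
# Superscript digits 0-9 (U+2070, U+00B9, U+00B2, U+00B3, U+2074..U+2079).
SUPERSCRIPT_DIGITS = str.maketrans(
    "0123456789", "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079"
)
# Subscript digits 0-9 (U+2080..U+2089).
SUBSCRIPT_DIGITS = str.maketrans(
    "0123456789", "\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089"
)

def fraction_with_plain_digits(numerator, denominator):
    """Regular digits around FRACTION SLASH; the rendering is left to the font."""
    return f"{numerator}{FRACTION_SLASH}{denominator}"

def fraction_with_preformatted_digits(numerator, denominator):
    """Superscript numerator and subscript denominator around FRACTION SLASH."""
    return (
        str(numerator).translate(SUPERSCRIPT_DIGITS)
        + FRACTION_SLASH
        + str(denominator).translate(SUBSCRIPT_DIGITS)
    )

plain = fraction_with_plain_digits(3, 4)
preformatted = fraction_with_preformatted_digits(3, 4)
print(plain, preformatted)
# NFKC folds the preformatted form back to the plain-digit form.
print(unicodedata.normalize("NFKC", preformatted) == plain)
```

The preformatted form looks like a fraction in any font that has the glyphs, while the plain-digit form depends on the rendering support discussed above.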
>
> In other words, there's no need to "fix" the character encoding, but much
> need to make sure that what's in the character encoding (and its associated
> conventions) is actually supported as intended.
Additionally, I now suggest adding an informative alias to each one of the
(intentionally) misnamed characters. This “MODIFIER LETTER” disguise of the true
*LATIN SUPERSCRIPT LETTERs seems to me a twisted trick to make unsuspecting people
believe that this is a thing for insiders that is completely useless to other people.
The truth happens to show up wherever the editorial committee (as well as
anybody else) can afford to feel free to write their own, unbiased language:
[Iʼm highlighting with uppercase]
@ Latin superscript modifier letters
@+ See also SUPERSCRIPT LATIN LETTERS in the Spacing Modifier Letters block starting at 02B0.
1D2C MODIFIER LETTER CAPITAL A
...
I think that the "MODIFIER LETTER" labeling of these characters is not
straightforward enough for a standard that claims that the character names are
mere identifiers. This is an example of how the identifiers were (ab)used as
descriptors, to carry prescriptions and corporate preferences on how to use or
not to use the repertoire.
When Iʼm back writing up some keyboard documentation, I really would like to
be able to deliver a better image of Unicode – and of Microsoft – than that one.
Please help me improve my communication, and make Unicode a user-centered standard.
Below are the proposed additions, which Iʼd like to submit for your kind review
prior to posting them via the Contact Form.
Regards,
Marcel
NamesList snippets with additional informative aliases providing straightforward
character identifiers, and some comment lines:
(Original file:
http://www.unicode.org/Public/UCD/latest/ucd/NamesList.txt
)
@@ 02B0 Spacing Modifier Letters 02FF
@+ Superscript and subscript letters were not intended to replace markup, but they are for use where super/sub scripting is important in plain text, or formatting is inappropriate.
@ Latin superscript modifier letters
@+ "modifier letter small" stands for "latin superscript small letter", and "modifier letter small capital" for "latin letter small capital".
x (superscript latin small letter i - 2071)
x (superscript latin small letter n - 207F)
02B0 MODIFIER LETTER SMALL H
= latin superscript small letter h
* aspiration
# <super> 0068
02B1 MODIFIER LETTER SMALL H WITH HOOK
= latin superscript small letter h with hook
* breathy voiced, murmured
x (latin small letter h with hook - 0266)
x (combining diaeresis below - 0324)
# <super> 0266
02B2 MODIFIER LETTER SMALL J
= latin superscript small letter j
* palatalization
x (combining palatalized hook below - 0321)
# <super> 006A
02B3 MODIFIER LETTER SMALL R
= latin superscript small letter r
# <super> 0072
02B4 MODIFIER LETTER SMALL TURNED R
= latin superscript small letter turned r
x (latin small letter turned r - 0279)
# <super> 0279
02B5 MODIFIER LETTER SMALL TURNED R WITH HOOK
= latin superscript small letter turned r with hook
x (latin small letter turned r with hook - 027B)
# <super> 027B
02B6 MODIFIER LETTER SMALL CAPITAL INVERTED R
= latin letter small capital inverted r
* preceding four used for r-coloring or r-offglides
x (latin letter small capital inverted r - 0281)
# <super> 0281
02B7 MODIFIER LETTER SMALL W
= latin superscript small letter w
* labialization
x (combining inverted double arch below - 032B)
# <super> 0077
02B8 MODIFIER LETTER SMALL Y
= latin superscript small letter y
* palatalization
* common Americanist usage for 02B2
# <super> 0079
[…]
@ Additions based on 1989 IPA
02DE MODIFIER LETTER RHOTIC HOOK
* rhotacization in vowel
* often ligated: 025A = 0259 + 02DE; 025D = 025C + 02DE
02DF MODIFIER LETTER CROSS ACCENT
* Swedish grave accent
02E0 MODIFIER LETTER SMALL GAMMA
= latin superscript small letter gamma
* these modifier letters are occasionally used in transcription of affricates
# <super> 0263
02E1 MODIFIER LETTER SMALL L
= latin superscript small letter l
# <super> 006C
02E2 MODIFIER LETTER SMALL S
= latin superscript small letter s
# <super> 0073
02E3 MODIFIER LETTER SMALL X
= latin superscript small letter x
# <super> 0078
02E4 MODIFIER LETTER SMALL REVERSED GLOTTAL STOP
= latin superscript letter reversed glottal stop
# <super> 0295
[…]
@ Latin superscript modifier letters
@+ See also superscript Latin letters in the Spacing Modifier Letters block starting at 02B0.
1D2C MODIFIER LETTER CAPITAL A
= latin superscript capital letter a
# <super> 0041
1D2D MODIFIER LETTER CAPITAL AE
= latin superscript capital letter ae
# <super> 00C6
1D2E MODIFIER LETTER CAPITAL B
= latin superscript capital letter b
# <super> 0042
1D2F MODIFIER LETTER CAPITAL BARRED B
= latin superscript capital letter barred b
1D30 MODIFIER LETTER CAPITAL D
= latin superscript capital letter d
# <super> 0044
1D31 MODIFIER LETTER CAPITAL E
= latin superscript capital letter e
# <super> 0045
1D32 MODIFIER LETTER CAPITAL REVERSED E
= latin superscript capital letter reversed e
# <super> 018E
1D33 MODIFIER LETTER CAPITAL G
= latin superscript capital letter g
# <super> 0047
1D34 MODIFIER LETTER CAPITAL H
= latin superscript capital letter h
# <super> 0048
1D35 MODIFIER LETTER CAPITAL I
= latin superscript capital letter i
# <super> 0049
1D36 MODIFIER LETTER CAPITAL J
= latin superscript capital letter j
# <super> 004A
1D37 MODIFIER LETTER CAPITAL K
= latin superscript capital letter k
# <super> 004B
1D38 MODIFIER LETTER CAPITAL L
= latin superscript capital letter l
# <super> 004C
1D39 MODIFIER LETTER CAPITAL M
= latin superscript capital letter m
# <super> 004D
1D3A MODIFIER LETTER CAPITAL N
= latin superscript capital letter n
# <super> 004E
1D3B MODIFIER LETTER CAPITAL REVERSED N
= latin superscript capital letter reversed n
1D3C MODIFIER LETTER CAPITAL O
= latin superscript capital letter o
# <super> 004F
1D3D MODIFIER LETTER CAPITAL OU
= latin superscript capital letter ou
# <super> 0222
1D3E MODIFIER LETTER CAPITAL P
= latin superscript capital letter p
# <super> 0050
1D3F MODIFIER LETTER CAPITAL R
= latin superscript capital letter r
# <super> 0052
1D40 MODIFIER LETTER CAPITAL T
= latin superscript capital letter t
# <super> 0054
1D41 MODIFIER LETTER CAPITAL U
= latin superscript capital letter u
# <super> 0055
1D42 MODIFIER LETTER CAPITAL W
= latin superscript capital letter w
# <super> 0057
1D43 MODIFIER LETTER SMALL A
= latin superscript small letter a
# <super> 0061
1D44 MODIFIER LETTER SMALL TURNED A
= latin superscript small letter turned a
# <super> 0250
1D45 MODIFIER LETTER SMALL ALPHA
= latin superscript small letter alpha
# <super> 0251
1D46 MODIFIER LETTER SMALL TURNED AE
= latin superscript small letter turned ae
# <super> 1D02
1D47 MODIFIER LETTER SMALL B
= latin superscript small letter b
# <super> 0062
1D48 MODIFIER LETTER SMALL D
= latin superscript small letter d
# <super> 0064
1D49 MODIFIER LETTER SMALL E
= latin superscript small letter e
# <super> 0065
1D4A MODIFIER LETTER SMALL SCHWA
= latin superscript small letter schwa
# <super> 0259
1D4B MODIFIER LETTER SMALL OPEN E
= latin superscript small letter open e
# <super> 025B
1D4C MODIFIER LETTER SMALL TURNED OPEN E
= latin superscript small letter turned open e
* more appropriate equivalence would be to 1D08
# <super> 025C
1D4D MODIFIER LETTER SMALL G
= latin superscript small letter g
# <super> 0067
1D4E MODIFIER LETTER SMALL TURNED I
= latin superscript small letter turned i
1D4F MODIFIER LETTER SMALL K
= latin superscript small letter k
# <super> 006B
1D50 MODIFIER LETTER SMALL M
= latin superscript small letter m
# <super> 006D
1D51 MODIFIER LETTER SMALL ENG
= latin superscript small letter eng
# <super> 014B
1D52 MODIFIER LETTER SMALL O
= latin superscript small letter o
# <super> 006F
1D53 MODIFIER LETTER SMALL OPEN O
= latin superscript small letter open o
# <super> 0254
1D54 MODIFIER LETTER SMALL TOP HALF O
= latin superscript small letter top half o
# <super> 1D16
1D55 MODIFIER LETTER SMALL BOTTOM HALF O
= latin superscript small letter bottom half o
# <super> 1D17
1D56 MODIFIER LETTER SMALL P
= latin superscript small letter p
# <super> 0070
1D57 MODIFIER LETTER SMALL T
= latin superscript small letter t
# <super> 0074
1D58 MODIFIER LETTER SMALL U
= latin superscript small letter u
# <super> 0075
1D59 MODIFIER LETTER SMALL SIDEWAYS U
= latin superscript small letter sideways u
# <super> 1D1D
1D5A MODIFIER LETTER SMALL TURNED M
= latin superscript small letter turned m
# <super> 026F
1D5B MODIFIER LETTER SMALL V
= latin superscript small letter v
# <super> 0076
1D5C MODIFIER LETTER SMALL AIN // (a misnomer also as it should be MODIFIER LETTER AIN; cf. 1D25 LATIN LETTER AIN, A724 LATIN CAPITAL LETTER EGYPTOLOGICAL AIN, A725 LATIN SMALL LETTER EGYPTOLOGICAL AIN)
= latin superscript letter ain
# <super> 1D25
@ Greek superscript modifier letters
1D5D MODIFIER LETTER SMALL BETA
= greek superscript small letter beta
# <super> 03B2
1D5E MODIFIER LETTER SMALL GREEK GAMMA
= greek superscript small letter gamma
# <super> 03B3
1D5F MODIFIER LETTER SMALL DELTA // (a misnomer also as it should be MODIFIER LETTER SMALL GREEK DELTA, cf. 1E9F LATIN SMALL LETTER DELTA)
= greek superscript small letter delta
# <super> 03B4
1D60 MODIFIER LETTER SMALL GREEK PHI
= greek superscript small letter phi
# <super> 03C6
1D61 MODIFIER LETTER SMALL CHI
= greek superscript small letter chi
# <super> 03C7
@ Latin subscript modifier letters
1D62 LATIN SUBSCRIPT SMALL LETTER I
# <sub> 0069
1D63 LATIN SUBSCRIPT SMALL LETTER R
# <sub> 0072
1D64 LATIN SUBSCRIPT SMALL LETTER U
# <sub> 0075
1D65 LATIN SUBSCRIPT SMALL LETTER V
# <sub> 0076
@ Greek subscript modifier letters
1D66 GREEK SUBSCRIPT SMALL LETTER BETA
# <sub> 03B2
1D67 GREEK SUBSCRIPT SMALL LETTER GAMMA
# <sub> 03B3
1D68 GREEK SUBSCRIPT SMALL LETTER RHO
# <sub> 03C1
1D69 GREEK SUBSCRIPT SMALL LETTER PHI
# <sub> 03C6
1D6A GREEK SUBSCRIPT SMALL LETTER CHI
# <sub> 03C7
[…]
@ Modifier letters
@+ Other modifier letters can be found in the Spacing Modifier Letters, Phonetic Extensions, as well as Superscripts and Subscripts blocks.
1D9B MODIFIER LETTER SMALL TURNED ALPHA
= latin superscript small letter turned alpha
# <super> 0252
1D9C MODIFIER LETTER SMALL C
= latin superscript small letter c
# <super> 0063
1D9D MODIFIER LETTER SMALL C WITH CURL
= latin superscript small letter c with curl
# <super> 0255
1D9E MODIFIER LETTER SMALL ETH
= latin superscript small letter eth
# <super> 00F0
1D9F MODIFIER LETTER SMALL REVERSED OPEN E
= latin superscript small letter reversed open e
# <super> 025C
1DA0 MODIFIER LETTER SMALL F
= latin superscript small letter f
# <super> 0066
1DA1 MODIFIER LETTER SMALL DOTLESS J WITH STROKE
= latin superscript small letter dotless j with stroke
# <super> 025F
1DA2 MODIFIER LETTER SMALL SCRIPT G
= latin superscript small letter script g
# <super> 0261
1DA3 MODIFIER LETTER SMALL TURNED H
= latin superscript small letter turned h
# <super> 0265
1DA4 MODIFIER LETTER SMALL I WITH STROKE
= latin superscript small letter i with stroke
# <super> 0268
1DA5 MODIFIER LETTER SMALL IOTA
= latin superscript small letter iota
# <super> 0269
1DA6 MODIFIER LETTER SMALL CAPITAL I
= latin letter small capital i
* not for use in UPA
x (modifier letter capital i - 1D35)
# <super> 026A
1DA7 MODIFIER LETTER SMALL CAPITAL I WITH STROKE
= latin letter small capital i with stroke
# <super> 1D7B
1DA8 MODIFIER LETTER SMALL J WITH CROSSED-TAIL
= latin superscript small letter j with crossed-tail
# <super> 029D
1DA9 MODIFIER LETTER SMALL L WITH RETROFLEX HOOK
= latin superscript small letter l with retroflex hook
# <super> 026D
1DAA MODIFIER LETTER SMALL L WITH PALATAL HOOK
= latin superscript small letter l with palatal hook
# <super> 1D85
1DAB MODIFIER LETTER SMALL CAPITAL L
= latin letter small capital l
* not for use in UPA
x (modifier letter capital l - 1D38)
# <super> 029F
1DAC MODIFIER LETTER SMALL M WITH HOOK
= latin superscript small letter m with hook
# <super> 0271
1DAD MODIFIER LETTER SMALL TURNED M WITH LONG LEG
= latin superscript small letter turned m with long leg
# <super> 0270
1DAE MODIFIER LETTER SMALL N WITH LEFT HOOK
= latin superscript small letter n with left hook
# <super> 0272
1DAF MODIFIER LETTER SMALL N WITH RETROFLEX HOOK
= latin superscript small letter n with retroflex hook
# <super> 0273
1DB0 MODIFIER LETTER SMALL CAPITAL N
= latin letter small capital n
* not for use in UPA
x (modifier letter capital n - 1D3A)
# <super> 0274
1DB1 MODIFIER LETTER SMALL BARRED O
= latin superscript small letter barred o
# <super> 0275
1DB2 MODIFIER LETTER SMALL PHI
= latin superscript small letter phi
# <super> 0278
1DB3 MODIFIER LETTER SMALL S WITH HOOK
= latin superscript small letter s with hook
# <super> 0282
1DB4 MODIFIER LETTER SMALL ESH
= latin superscript small letter esh
# <super> 0283
1DB5 MODIFIER LETTER SMALL T WITH PALATAL HOOK
= latin superscript small letter t with palatal hook
# <super> 01AB
1DB6 MODIFIER LETTER SMALL U BAR
= latin superscript small letter u bar
# <super> 0289
1DB7 MODIFIER LETTER SMALL UPSILON
= latin superscript small letter upsilon
# <super> 028A
1DB8 MODIFIER LETTER SMALL CAPITAL U
= latin letter small capital u
* not for use in UPA
x (modifier letter capital u - 1D41)
# <super> 1D1C
1DB9 MODIFIER LETTER SMALL V WITH HOOK
= latin superscript small letter v with hook
# <super> 028B
1DBA MODIFIER LETTER SMALL TURNED V
= latin superscript small letter turned v
# <super> 028C
1DBB MODIFIER LETTER SMALL Z
= latin superscript small letter z
# <super> 007A
1DBC MODIFIER LETTER SMALL Z WITH RETROFLEX HOOK
= latin superscript small letter z with retroflex hook
# <super> 0290
1DBD MODIFIER LETTER SMALL Z WITH CURL
= latin superscript small letter z with curl
# <super> 0291
1DBE MODIFIER LETTER SMALL EZH
= latin superscript small letter ezh
# <super> 0292
1DBF MODIFIER LETTER SMALL THETA
= latin superscript small letter theta
# <super> 03B8
[…]
@ Additions for Extended IPA
A7F8 MODIFIER LETTER CAPITAL H WITH STROKE
= latin superscript capital letter h with stroke
* faucalized
# <super> 0126
A7F9 MODIFIER LETTER SMALL LIGATURE OE
= latin superscript small ligature oe
* labialized: open-rounded
# <super> 0153
[…]
@ Modifier letters for German dialectology
AB5B MODIFIER BREVE WITH INVERTED BREVE
x (breve - 02D8)
x (close up - 2050)
x (metrical breve - 23D1)
AB5C MODIFIER LETTER SMALL HENG
= latin superscript small letter heng
# <super> A727
AB5D MODIFIER LETTER SMALL L WITH INVERTED LAZY S
= latin superscript small letter l with inverted lazy s
# <super> AB37
AB5E MODIFIER LETTER SMALL L WITH MIDDLE TILDE
= latin superscript small letter l with middle tilde
# <super> 026B
AB5F MODIFIER LETTER SMALL U WITH LEFT HOOK
= latin superscript small letter u with left hook
# <super> AB52
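Since the list above is long and was edited by hand, a mechanical consistency check may help reviewers. The following Python sketch is only a rough aid under my own assumptions, not an official tool: `parse_nameslist`, `expected_alias` and `alias_mismatches` are hypothetical helper names, the prefix rule is the one stated in the note under "Latin superscript modifier letters" (extended by me to the capital letters), and the sample contains one deliberately wrong alias (for 1D4F) to show what a report looks like.

```python
import re

# A character entry starts with a 4-6 digit hex code point followed by the name.
ENTRY_RE = re.compile(r"^([0-9A-F]{4,6})\s+(.+)$")

# Prefix rule from the note above; the "capital" line is my own extension.
PREFIX_RULES = [
    ("MODIFIER LETTER SMALL CAPITAL ", "latin letter small capital "),
    ("MODIFIER LETTER SMALL ", "latin superscript small letter "),
    ("MODIFIER LETTER CAPITAL ", "latin superscript capital letter "),
]

def parse_nameslist(text):
    """Collect name, '=' aliases and '#' decomposition for each character entry."""
    entries, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        match = ENTRY_RE.match(line)
        if match:
            # Drop the ad-hoc '//' annotations used in this mail.
            name = match.group(2).split("//")[0].strip()
            current = {"code": match.group(1), "name": name,
                       "aliases": [], "decomposition": None}
            entries.append(current)
        elif current is not None and line.startswith("= "):
            current["aliases"].append(line[2:])
        elif current is not None and line.startswith("# "):
            current["decomposition"] = line[2:]
    return entries

def expected_alias(name):
    """Derive the alias the prefix rule predicts, or None if no rule applies."""
    for prefix, replacement in PREFIX_RULES:
        if name.startswith(prefix):
            return replacement + name[len(prefix):].lower()
    return None

def alias_mismatches(entries):
    """List (code, alias found, alias expected) where the rule and the alias disagree."""
    problems = []
    for entry in entries:
        want = expected_alias(entry["name"])
        if want is not None and entry["aliases"] and want not in entry["aliases"]:
            problems.append((entry["code"], entry["aliases"][0], want))
    return problems

SAMPLE = """\
02B0 MODIFIER LETTER SMALL H
= latin superscript small letter h
* aspiration
# <super> 0068
1D49 MODIFIER LETTER SMALL E
= latin superscript small letter e
# <super> 0065
1D4F MODIFIER LETTER SMALL K
= latin superscript small letter q
# <super> 006B
"""

for code, found, wanted in alias_mismatches(parse_nameslist(SAMPLE)):
    print(f"{code}: found '{found}', rule predicts '{wanted}'")
```

The rule is purely a heuristic: entries such as the Greek letters, AIN or REVERSED GLOTTAL STOP intentionally deviate from it, so any report needs a human look before anything is changed.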
Received on Tue Jan 17 2017 - 02:26:22 CST