L2/06-242

Subject: Distinguishing Sk, Lm, Mc
From: Mark Davis
Date: 2007-07-25

We have two general category property values that are called modifiers: Sk (Symbol, Modifier) and Lm (Letter, Modifier). We also have Mc (Mark, Spacing Combining), which covers spacing marks that modify letters. In addition, we have characters with "MODIFIER LETTER" in their names.

Unfortunately, there is no alignment between these, and people get confused. Of the characters with MODIFIER LETTER in their names, 63 are Sk and 134 are Lm, while of those without MODIFIER LETTER in their names, 36 are Sk and 33 are Lm. At least we got one thing right: no Mc characters have "MODIFIER LETTER" in their names! In the discussion of a UTC document (02-267), Ken gave the relevant distinction: the Lm's are used as parts of words and identifiers, while the Sk's are not. Although that reason was not captured in the minutes, it was in my notes and in a modification history note in one of the UAXs.
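As a rough illustration of the Lm versus Sk distinction, the sketch below checks whether a few of the characters listed in this document can continue an identifier. It uses Python's own identifier rules as a stand-in for "identifiers" in general, which is an assumption of this example and not something the UTC discussion specified; the helper name describe is hypothetical, and the results reflect whatever version of the Unicode Character Database the Python runtime ships with.

```python
import unicodedata


def describe(ch):
    """Return (name, general category, can-continue-an-identifier) for one character."""
    # Prefix with a plain letter so that we test identifier continuation,
    # not whether the character can start an identifier.
    candidate = "a" + ch
    return unicodedata.name(ch), unicodedata.category(ch), candidate.isidentifier()


# Two Lm characters and two Sk characters taken from the lists in this document.
for ch in ("\u02B0", "\u02BC", "\u02C2", "\u005E"):
    name, gc, ok = describe(ch)
    print(f"{ord(ch):04X} # {gc} identifier-part={ok} {name}")
```

Under Python's rules the Lm samples are accepted as identifier parts and the Sk samples are rejected, regardless of whether "MODIFIER LETTER" appears in the name.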

We should:

  1. Document the reasons that we distinguish Sk, Lm, Mc in UCD.html for U5+ (the next version after 5.0) so that people understand what we mean by one versus the other, and when they should use one versus the other.
  2. Review the assignments listed at the end of this document to ensure that they follow those descriptions.
  3. Document that many characters named MODIFIER LETTER are not actually modifier letters (in places where we document that names are misleading).

Documents

Breakdown

In [$gc:Sk], but not in [$Name:«.*MODIFIER LETTER.*»] :

005E # Sk (^) CIRCUMFLEX ACCENT
0060 # Sk (`) GRAVE ACCENT
00A8 # Sk (¨) DIAERESIS
00AF # Sk (¯) MACRON
00B4 # Sk (´) ACUTE ACCENT
00B8 # Sk (¸) CEDILLA
02D8 # Sk (˘) BREVE
02D9 # Sk (˙) DOT ABOVE
02DA # Sk (˚) RING ABOVE
02DB # Sk (˛) OGONEK
02DC # Sk (˜) SMALL TILDE
02DD # Sk (˝) DOUBLE ACUTE ACCENT
0374 # Sk (ʹ) GREEK NUMERAL SIGN
0375 # Sk (͵) GREEK LOWER NUMERAL SIGN
0384 # Sk (΄) GREEK TONOS
0385 # Sk (΅) GREEK DIALYTIKA TONOS
1FBD # Sk (᾽) GREEK KORONIS
1FBF # Sk (᾿) GREEK PSILI
1FC0 # Sk (῀) GREEK PERISPOMENI
1FC1 # Sk (῁) GREEK DIALYTIKA AND PERISPOMENI
1FCD # Sk (῍) GREEK PSILI AND VARIA
1FCE # Sk (῎) GREEK PSILI AND OXIA
1FCF # Sk (῏) GREEK PSILI AND PERISPOMENI
1FDD # Sk (῝) GREEK DASIA AND VARIA
1FDE # Sk (῞) GREEK DASIA AND OXIA
1FDF # Sk (῟) GREEK DASIA AND PERISPOMENI
1FED # Sk (῭) GREEK DIALYTIKA AND VARIA
1FEE # Sk (΅) GREEK DIALYTIKA AND OXIA
1FEF # Sk (`) GREEK VARIA
1FFD # Sk (´) GREEK OXIA
1FFE # Sk (῾) GREEK DASIA
309B # Sk (゛) KATAKANA-HIRAGANA VOICED SOUND MARK
309C # Sk (゜) KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
FF3E # Sk (＾) FULLWIDTH CIRCUMFLEX ACCENT
FF40 # Sk (｀) FULLWIDTH GRAVE ACCENT
FFE3 # Sk (￣) FULLWIDTH MACRON

# Total code points: 36

In both [$gc:Sk], and in [$Name:«.*MODIFIER LETTER.*»] :

02C2 # Sk (˂) MODIFIER LETTER LEFT ARROWHEAD
02C3 # Sk (˃) MODIFIER LETTER RIGHT ARROWHEAD
02C4 # Sk (˄) MODIFIER LETTER UP ARROWHEAD
02C5 # Sk (˅) MODIFIER LETTER DOWN ARROWHEAD
02D2 # Sk (˒) MODIFIER LETTER CENTRED RIGHT HALF RING
02D3 # Sk (˓) MODIFIER LETTER CENTRED LEFT HALF RING
02D4 # Sk (˔) MODIFIER LETTER UP TACK
02D5 # Sk (˕) MODIFIER LETTER DOWN TACK
02D6 # Sk (˖) MODIFIER LETTER PLUS SIGN
02D7 # Sk (˗) MODIFIER LETTER MINUS SIGN
02DE # Sk (˞) MODIFIER LETTER RHOTIC HOOK
02DF # Sk (˟) MODIFIER LETTER CROSS ACCENT
02E5 # Sk (˥) MODIFIER LETTER EXTRA-HIGH TONE BAR
02E6 # Sk (˦) MODIFIER LETTER HIGH TONE BAR
02E7 # Sk (˧) MODIFIER LETTER MID TONE BAR
02E8 # Sk (˨) MODIFIER LETTER LOW TONE BAR
02E9 # Sk (˩) MODIFIER LETTER EXTRA-LOW TONE BAR
02EA # Sk (˪) MODIFIER LETTER YIN DEPARTING TONE MARK
02EB # Sk (˫) MODIFIER LETTER YANG DEPARTING TONE MARK
02EC # Sk (ˬ) MODIFIER LETTER VOICING
02ED # Sk (˭) MODIFIER LETTER UNASPIRATED
02EF # Sk (˯) MODIFIER LETTER LOW DOWN ARROWHEAD
02F0 # Sk (˰) MODIFIER LETTER LOW UP ARROWHEAD
02F1 # Sk (˱) MODIFIER LETTER LOW LEFT ARROWHEAD
02F2 # Sk (˲) MODIFIER LETTER LOW RIGHT ARROWHEAD
02F3 # Sk (˳) MODIFIER LETTER LOW RING
02F4 # Sk (˴) MODIFIER LETTER MIDDLE GRAVE ACCENT
02F5 # Sk (˵) MODIFIER LETTER MIDDLE DOUBLE GRAVE ACCENT
02F6 # Sk (˶) MODIFIER LETTER MIDDLE DOUBLE ACUTE ACCENT
02F7 # Sk (˷) MODIFIER LETTER LOW TILDE
02F8 # Sk (˸) MODIFIER LETTER RAISED COLON
02F9 # Sk (˹) MODIFIER LETTER BEGIN HIGH TONE
02FA # Sk (˺) MODIFIER LETTER END HIGH TONE
02FB # Sk (˻) MODIFIER LETTER BEGIN LOW TONE
02FC # Sk (˼) MODIFIER LETTER END LOW TONE
02FD # Sk (˽) MODIFIER LETTER SHELF
02FE # Sk (˾) MODIFIER LETTER OPEN SHELF
02FF # Sk (˿) MODIFIER LETTER LOW LEFT ARROW
A700 # Sk (꜀) MODIFIER LETTER CHINESE TONE YIN PING
A701 # Sk (꜁) MODIFIER LETTER CHINESE TONE YANG PING
A702 # Sk (꜂) MODIFIER LETTER CHINESE TONE YIN SHANG
A703 # Sk (꜃) MODIFIER LETTER CHINESE TONE YANG SHANG
A704 # Sk (꜄) MODIFIER LETTER CHINESE TONE YIN QU
A705 # Sk (꜅) MODIFIER LETTER CHINESE TONE YANG QU
A706 # Sk (꜆) MODIFIER LETTER CHINESE TONE YIN RU
A707 # Sk (꜇) MODIFIER LETTER CHINESE TONE YANG RU
A708 # Sk (꜈) MODIFIER LETTER EXTRA-HIGH DOTTED TONE BAR
A709 # Sk (꜉) MODIFIER LETTER HIGH DOTTED TONE BAR
A70A # Sk (꜊) MODIFIER LETTER MID DOTTED TONE BAR
A70B # Sk (꜋) MODIFIER LETTER LOW DOTTED TONE BAR
A70C # Sk (꜌) MODIFIER LETTER EXTRA-LOW DOTTED TONE BAR
A70D # Sk (꜍) MODIFIER LETTER EXTRA-HIGH DOTTED LEFT-STEM TONE BAR
A70E # Sk (꜎) MODIFIER LETTER HIGH DOTTED LEFT-STEM TONE BAR
A70F # Sk (꜏) MODIFIER LETTER MID DOTTED LEFT-STEM TONE BAR
A710 # Sk (꜐) MODIFIER LETTER LOW DOTTED LEFT-STEM TONE BAR
A711 # Sk (꜑) MODIFIER LETTER EXTRA-LOW DOTTED LEFT-STEM TONE BAR
A712 # Sk (꜒) MODIFIER LETTER EXTRA-HIGH LEFT-STEM TONE BAR
A713 # Sk (꜓) MODIFIER LETTER HIGH LEFT-STEM TONE BAR
A714 # Sk (꜔) MODIFIER LETTER MID LEFT-STEM TONE BAR
A715 # Sk (꜕) MODIFIER LETTER LOW LEFT-STEM TONE BAR
A716 # Sk (꜖) MODIFIER LETTER EXTRA-LOW LEFT-STEM TONE BAR
A720 # Sk (꜠) MODIFIER LETTER STRESS AND HIGH TONE
A721 # Sk (꜡) MODIFIER LETTER STRESS AND LOW TONE

# Total code points: 63

In [$gc:Lm], but not in [$Name:«.*MODIFIER LETTER.*»] :

02C7 # Lm (ˇ) CARON
037A # Lm (ͺ) GREEK YPOGEGRAMMENI
0640 # Lm (ـ) ARABIC TATWEEL
06E5 # Lm (ۥ) ARABIC SMALL WAW
06E6 # Lm (ۦ) ARABIC SMALL YEH
07F4 # Lm (ߴ) NKO HIGH TONE APOSTROPHE
07F5 # Lm (ߵ) NKO LOW TONE APOSTROPHE
07FA # Lm (ߺ) NKO LAJANYALAN
0E46 # Lm (ๆ) THAI CHARACTER MAIYAMOK
0EC6 # Lm (ໆ) LAO KO LA
17D7 # Lm (ៗ) KHMER SIGN LEK TOO
1843 # Lm (ᡃ) MONGOLIAN LETTER TODO LONG VOWEL SIGN
2090 # Lm (ₐ) LATIN SUBSCRIPT SMALL LETTER A
2091 # Lm (ₑ) LATIN SUBSCRIPT SMALL LETTER E
2092 # Lm (ₒ) LATIN SUBSCRIPT SMALL LETTER O
2093 # Lm (ₓ) LATIN SUBSCRIPT SMALL LETTER X
2094 # Lm (ₔ) LATIN SUBSCRIPT SMALL LETTER SCHWA
3005 # Lm (々) IDEOGRAPHIC ITERATION MARK
3031 # Lm (〱) VERTICAL KANA REPEAT MARK
3032 # Lm (〲) VERTICAL KANA REPEAT WITH VOICED SOUND MARK
3033 # Lm (〳) VERTICAL KANA REPEAT MARK UPPER HALF
3034 # Lm (〴) VERTICAL KANA REPEAT WITH VOICED SOUND MARK UPPER HALF
3035 # Lm (〵) VERTICAL KANA REPEAT MARK LOWER HALF
303B # Lm (〻) VERTICAL IDEOGRAPHIC ITERATION MARK
309D # Lm (ゝ) HIRAGANA ITERATION MARK
309E # Lm (ゞ) HIRAGANA VOICED ITERATION MARK
30FC # Lm (ー) KATAKANA-HIRAGANA PROLONGED SOUND MARK
30FD # Lm (ヽ) KATAKANA ITERATION MARK
30FE # Lm (ヾ) KATAKANA VOICED ITERATION MARK
A015 # Lm (ꀕ) YI SYLLABLE WU
FF70 # Lm (ｰ) HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK
FF9E # Lm (ﾞ) HALFWIDTH KATAKANA VOICED SOUND MARK
FF9F # Lm (ﾟ) HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK

# Total code points: 33

In both [$gc:Lm], and in [$Name:«.*MODIFIER LETTER.*»] :

02B0 # Lm (ʰ) MODIFIER LETTER SMALL H
02B1 # Lm (ʱ) MODIFIER LETTER SMALL H WITH HOOK
02B2 # Lm (ʲ) MODIFIER LETTER SMALL J
02B3 # Lm (ʳ) MODIFIER LETTER SMALL R
02B4 # Lm (ʴ) MODIFIER LETTER SMALL TURNED R
02B5 # Lm (ʵ) MODIFIER LETTER SMALL TURNED R WITH HOOK
02B6 # Lm (ʶ) MODIFIER LETTER SMALL CAPITAL INVERTED R
02B7 # Lm (ʷ) MODIFIER LETTER SMALL W
02B8 # Lm (ʸ) MODIFIER LETTER SMALL Y
02B9 # Lm (ʹ) MODIFIER LETTER PRIME
02BA # Lm (ʺ) MODIFIER LETTER DOUBLE PRIME
02BB # Lm (ʻ) MODIFIER LETTER TURNED COMMA
02BC # Lm (ʼ) MODIFIER LETTER APOSTROPHE
02BD # Lm (ʽ) MODIFIER LETTER REVERSED COMMA
02BE # Lm (ʾ) MODIFIER LETTER RIGHT HALF RING
02BF # Lm (ʿ) MODIFIER LETTER LEFT HALF RING
02C0 # Lm (ˀ) MODIFIER LETTER GLOTTAL STOP
02C1 # Lm (ˁ) MODIFIER LETTER REVERSED GLOTTAL STOP
02C6 # Lm (ˆ) MODIFIER LETTER CIRCUMFLEX ACCENT
02C8 # Lm (ˈ) MODIFIER LETTER VERTICAL LINE
02C9 # Lm (ˉ) MODIFIER LETTER MACRON
02CA # Lm (ˊ) MODIFIER LETTER ACUTE ACCENT
02CB # Lm (ˋ) MODIFIER LETTER GRAVE ACCENT
02CC # Lm (ˌ) MODIFIER LETTER LOW VERTICAL LINE
02CD # Lm (ˍ) MODIFIER LETTER LOW MACRON
02CE # Lm (ˎ) MODIFIER LETTER LOW GRAVE ACCENT
02CF # Lm (ˏ) MODIFIER LETTER LOW ACUTE ACCENT
02D0 # Lm (ː) MODIFIER LETTER TRIANGULAR COLON
02D1 # Lm (ˑ) MODIFIER LETTER HALF TRIANGULAR COLON
02E0 # Lm (ˠ) MODIFIER LETTER SMALL GAMMA
02E1 # Lm (ˡ) MODIFIER LETTER SMALL L
02E2 # Lm (ˢ) MODIFIER LETTER SMALL S
02E3 # Lm (ˣ) MODIFIER LETTER SMALL X
02E4 # Lm (ˤ) MODIFIER LETTER SMALL REVERSED GLOTTAL STOP
02EE # Lm (ˮ) MODIFIER LETTER DOUBLE APOSTROPHE
0559 # Lm (ՙ) ARMENIAN MODIFIER LETTER LEFT HALF RING
10FC # Lm (ჼ) MODIFIER LETTER GEORGIAN NAR
1D2C # Lm (ᴬ) MODIFIER LETTER CAPITAL A
1D2D # Lm (ᴭ) MODIFIER LETTER CAPITAL AE
1D2E # Lm (ᴮ) MODIFIER LETTER CAPITAL B
1D2F # Lm (ᴯ) MODIFIER LETTER CAPITAL BARRED B
1D30 # Lm (ᴰ) MODIFIER LETTER CAPITAL D
1D31 # Lm (ᴱ) MODIFIER LETTER CAPITAL E
1D32 # Lm (ᴲ) MODIFIER LETTER CAPITAL REVERSED E
1D33 # Lm (ᴳ) MODIFIER LETTER CAPITAL G
1D34 # Lm (ᴴ) MODIFIER LETTER CAPITAL H
1D35 # Lm (ᴵ) MODIFIER LETTER CAPITAL I
1D36 # Lm (ᴶ) MODIFIER LETTER CAPITAL J
1D37 # Lm (ᴷ) MODIFIER LETTER CAPITAL K
1D38 # Lm (ᴸ) MODIFIER LETTER CAPITAL L
1D39 # Lm (ᴹ) MODIFIER LETTER CAPITAL M
1D3A # Lm (ᴺ) MODIFIER LETTER CAPITAL N
1D3B # Lm (ᴻ) MODIFIER LETTER CAPITAL REVERSED N
1D3C # Lm (ᴼ) MODIFIER LETTER CAPITAL O
1D3D # Lm (ᴽ) MODIFIER LETTER CAPITAL OU
1D3E # Lm (ᴾ) MODIFIER LETTER CAPITAL P
1D3F # Lm (ᴿ) MODIFIER LETTER CAPITAL R
1D40 # Lm (ᵀ) MODIFIER LETTER CAPITAL T
1D41 # Lm (ᵁ) MODIFIER LETTER CAPITAL U
1D42 # Lm (ᵂ) MODIFIER LETTER CAPITAL W
1D43 # Lm (ᵃ) MODIFIER LETTER SMALL A
1D44 # Lm (ᵄ) MODIFIER LETTER SMALL TURNED A
1D45 # Lm (ᵅ) MODIFIER LETTER SMALL ALPHA
1D46 # Lm (ᵆ) MODIFIER LETTER SMALL TURNED AE
1D47 # Lm (ᵇ) MODIFIER LETTER SMALL B
1D48 # Lm (ᵈ) MODIFIER LETTER SMALL D
1D49 # Lm (ᵉ) MODIFIER LETTER SMALL E
1D4A # Lm (ᵊ) MODIFIER LETTER SMALL SCHWA
1D4B # Lm (ᵋ) MODIFIER LETTER SMALL OPEN E
1D4C # Lm (ᵌ) MODIFIER LETTER SMALL TURNED OPEN E
1D4D # Lm (ᵍ) MODIFIER LETTER SMALL G
1D4E # Lm (ᵎ) MODIFIER LETTER SMALL TURNED I
1D4F # Lm (ᵏ) MODIFIER LETTER SMALL K
1D50 # Lm (ᵐ) MODIFIER LETTER SMALL M
1D51 # Lm (ᵑ) MODIFIER LETTER SMALL ENG
1D52 # Lm (ᵒ) MODIFIER LETTER SMALL O
1D53 # Lm (ᵓ) MODIFIER LETTER SMALL OPEN O
1D54 # Lm (ᵔ) MODIFIER LETTER SMALL TOP HALF O
1D55 # Lm (ᵕ) MODIFIER LETTER SMALL BOTTOM HALF O
1D56 # Lm (ᵖ) MODIFIER LETTER SMALL P
1D57 # Lm (ᵗ) MODIFIER LETTER SMALL T
1D58 # Lm (ᵘ) MODIFIER LETTER SMALL U
1D59 # Lm (ᵙ) MODIFIER LETTER SMALL SIDEWAYS U
1D5A # Lm (ᵚ) MODIFIER LETTER SMALL TURNED M
1D5B # Lm (ᵛ) MODIFIER LETTER SMALL V
1D5C # Lm (ᵜ) MODIFIER LETTER SMALL AIN
1D5D # Lm (ᵝ) MODIFIER LETTER SMALL BETA
1D5E # Lm (ᵞ) MODIFIER LETTER SMALL GREEK GAMMA
1D5F # Lm (ᵟ) MODIFIER LETTER SMALL DELTA
1D60 # Lm (ᵠ) MODIFIER LETTER SMALL GREEK PHI
1D61 # Lm (ᵡ) MODIFIER LETTER SMALL CHI
1D78 # Lm (ᵸ) MODIFIER LETTER CYRILLIC EN
1D9B # Lm (ᶛ) MODIFIER LETTER SMALL TURNED ALPHA
1D9C # Lm (ᶜ) MODIFIER LETTER SMALL C
1D9D # Lm (ᶝ) MODIFIER LETTER SMALL C WITH CURL
1D9E # Lm (ᶞ) MODIFIER LETTER SMALL ETH
1D9F # Lm (ᶟ) MODIFIER LETTER SMALL REVERSED OPEN E
1DA0 # Lm (ᶠ) MODIFIER LETTER SMALL F
1DA1 # Lm (ᶡ) MODIFIER LETTER SMALL DOTLESS J WITH STROKE
1DA2 # Lm (ᶢ) MODIFIER LETTER SMALL SCRIPT G
1DA3 # Lm (ᶣ) MODIFIER LETTER SMALL TURNED H
1DA4 # Lm (ᶤ) MODIFIER LETTER SMALL I WITH STROKE
1DA5 # Lm (ᶥ) MODIFIER LETTER SMALL IOTA
1DA6 # Lm (ᶦ) MODIFIER LETTER SMALL CAPITAL I
1DA7 # Lm (ᶧ) MODIFIER LETTER SMALL CAPITAL I WITH STROKE
1DA8 # Lm (ᶨ) MODIFIER LETTER SMALL J WITH CROSSED-TAIL
1DA9 # Lm (ᶩ) MODIFIER LETTER SMALL L WITH RETROFLEX HOOK
1DAA # Lm (ᶪ) MODIFIER LETTER SMALL L WITH PALATAL HOOK
1DAB # Lm (ᶫ) MODIFIER LETTER SMALL CAPITAL L
1DAC # Lm (ᶬ) MODIFIER LETTER SMALL M WITH HOOK
1DAD # Lm (ᶭ) MODIFIER LETTER SMALL TURNED M WITH LONG LEG
1DAE # Lm (ᶮ) MODIFIER LETTER SMALL N WITH LEFT HOOK
1DAF # Lm (ᶯ) MODIFIER LETTER SMALL N WITH RETROFLEX HOOK
1DB0 # Lm (ᶰ) MODIFIER LETTER SMALL CAPITAL N
1DB1 # Lm (ᶱ) MODIFIER LETTER SMALL BARRED O
1DB2 # Lm (ᶲ) MODIFIER LETTER SMALL PHI
1DB3 # Lm (ᶳ) MODIFIER LETTER SMALL S WITH HOOK
1DB4 # Lm (ᶴ) MODIFIER LETTER SMALL ESH
1DB5 # Lm (ᶵ) MODIFIER LETTER SMALL T WITH PALATAL HOOK
1DB6 # Lm (ᶶ) MODIFIER LETTER SMALL U BAR
1DB7 # Lm (ᶷ) MODIFIER LETTER SMALL UPSILON
1DB8 # Lm (ᶸ) MODIFIER LETTER SMALL CAPITAL U
1DB9 # Lm (ᶹ) MODIFIER LETTER SMALL V WITH HOOK
1DBA # Lm (ᶺ) MODIFIER LETTER SMALL TURNED V
1DBB # Lm (ᶻ) MODIFIER LETTER SMALL Z
1DBC # Lm (ᶼ) MODIFIER LETTER SMALL Z WITH RETROFLEX HOOK
1DBD # Lm (ᶽ) MODIFIER LETTER SMALL Z WITH CURL
1DBE # Lm (ᶾ) MODIFIER LETTER SMALL EZH
1DBF # Lm (ᶿ) MODIFIER LETTER SMALL THETA
2D6F # Lm (ⵯ) TIFINAGH MODIFIER LETTER LABIALIZATION MARK
A717 # Lm (ꜗ) MODIFIER LETTER DOT VERTICAL BAR
A718 # Lm (ꜘ) MODIFIER LETTER DOT SLASH
A719 # Lm (ꜙ) MODIFIER LETTER DOT HORIZONTAL BAR
A71A # Lm (ꜚ) MODIFIER LETTER LOWER RIGHT CORNER ANGLE

# Total code points: 134
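
The four lists above can be approximated with a short script, which may help when reviewing the assignments against a later version of the UCD. This is only a sketch: it relies on the Unicode data bundled with the Python runtime (reported by unicodedata.unidata_version), so the totals it prints will generally differ from the 36, 63, 33 and 134 given here, and the function name breakdown is hypothetical.

```python
import sys
import unicodedata
from collections import defaultdict


def breakdown(categories=("Sk", "Lm")):
    """Group code points by (general category, 'MODIFIER LETTER' in name)."""
    groups = defaultdict(list)
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        gc = unicodedata.category(ch)
        if gc not in categories:
            continue
        # Unnamed code points get an empty name, so they never match.
        name = unicodedata.name(ch, "")
        groups[(gc, "MODIFIER LETTER" in name)].append(cp)
    return groups


if __name__ == "__main__":
    print(f"# Unicode data version: {unicodedata.unidata_version}")
    for (gc, in_name), cps in sorted(breakdown().items()):
        relation = "in both" if in_name else "but not in"
        print(f"\nIn [$gc:{gc}], {relation} [$Name:MODIFIER LETTER]:\n")
        for cp in cps:
            print(f"{cp:04X} # {gc} ({chr(cp)}) {unicodedata.name(chr(cp))}")
        print(f"\n# Total code points: {len(cps)}")
```

Passing categories=("Mc",) to breakdown gives a quick way to recheck whether any Mc character carries "MODIFIER LETTER" in its name in the version being examined.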