L2/13-117R2

From:        Mark Davis

To:        UTC

Date:        May 7, 2013

Re:        PRI 251 Additions to Uppercase

Live:        http://goo.gl/MGQ7e 

Based on the feedback and discussion of PRI 251 (http://www.unicode.org/review/pri251/), I propose that we make the characters in Set A below Uppercase=Yes, consistent with characters like U+24B6 CIRCLED LATIN CAPITAL LETTER A, but that we leave Sets B, C, and D below as Uppercase=No. As usual, this would be done by adding the Set A characters to Other_Uppercase.

Set A

1F130-1F149 : SQUARED LATIN CAPITAL LETTER...

1F150-1F169 : NEGATIVE CIRCLED LATIN CAPITAL LETTER...

1F170-1F189 : NEGATIVE SQUARED LATIN CAPITAL LETTER...

Set B

1F110-1F129 : PARENTHESIZED LATIN CAPITAL LETTER...

Set C

1F12A : TORTOISE SHELL BRACKETED LATIN CAPITAL LETTER S

1F12B-1F12C : CIRCLED ITALIC LATIN CAPITAL LETTER...

1F18A : CROSSED NEGATIVE SQUARED LATIN CAPITAL LETTER P

Set D

U+1F12D ( 🄭 ) CIRCLED CD

U+1F14A ( 🅊 ) SQUARED HV

U+1F14B ( 🅋 ) SQUARED MV

U+1F14C ( 🅌 ) SQUARED SD

U+1F14D ( 🅍 ) SQUARED SS

U+1F14E ( 🅎 ) SQUARED PPV

U+1F14F ( 🅏 ) SQUARED WC

U+1F18B ( 🆋 ) NEGATIVE SQUARED IC

U+1F18C ( 🆌 ) NEGATIVE SQUARED PA

U+1F18D ( 🆍 ) NEGATIVE SQUARED SA

U+1F18E ( 🆎 ) NEGATIVE SQUARED AB

U+1F18F ( 🆏 ) NEGATIVE SQUARED WC

U+1F190 ( 🆐 ) SQUARE DJ

U+1F191 ( 🆑 ) SQUARED CL

U+1F192 ( 🆒 ) SQUARED COOL

U+1F193 ( 🆓 ) SQUARED FREE

U+1F194 ( 🆔 ) SQUARED ID

U+1F195 ( 🆕 ) SQUARED NEW

U+1F196 ( 🆖 ) SQUARED NG

U+1F197 ( 🆗 ) SQUARED OK

U+1F198 ( 🆘 ) SQUARED SOS

U+1F199 ( 🆙 ) SQUARED UP WITH EXCLAMATION MARK

U+1F19A ( 🆚 ) SQUARED VS


Background

Unfortunately, we didn’t give these characters NFKD decompositions, as we did with similar previously encoded characters, although we do sort them as if they had such decompositions.

  1. Set A characters are quite clear; anyone would expect them to behave like U+24B6 CIRCLED LATIN CAPITAL LETTER A.
  2. Set B is analogous to existing characters such as U+249C ( ⒜ ) PARENTHESIZED LATIN SMALL LETTER A, and thus should not be Uppercase.
  3. Set C is debatable, but given that they are not complete alphabets, the UTC doesn’t want to change them.
  4. Set D is also debatable. However, for consistency with previously encoded Unicode characters that logically consist of a sequence of decorated characters (like U+3373 ( ㍳ ) SQUARE AU), it is probably best for them to be neither uppercase nor lowercase.
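
For reference, the proposal can be summarized as a small lookup. The following is only a sketch in Python: the ranges are transcribed from Sets A through D above, while the names SETS, PROPOSED_UPPERCASE, code_points, proposed_uppercase, and other_uppercase_lines are hypothetical helpers invented for illustration. The emitted lines only approximate the layout of PropList.txt and are not the actual data file contents.

```python
import unicodedata

# Inclusive code point ranges, transcribed from Sets A-D in this document.
SETS = {
    "A": [(0x1F130, 0x1F149), (0x1F150, 0x1F169), (0x1F170, 0x1F189)],
    "B": [(0x1F110, 0x1F129)],
    "C": [(0x1F12A, 0x1F12A), (0x1F12B, 0x1F12C), (0x1F18A, 0x1F18A)],
    "D": [(0x1F12D, 0x1F12D), (0x1F14A, 0x1F14F), (0x1F18B, 0x1F19A)],
}

# Only Set A is proposed for Uppercase=Yes; Sets B, C, and D stay Uppercase=No.
PROPOSED_UPPERCASE = {"A"}


def code_points(set_name):
    """Return every code point in the named set, in ascending range order."""
    return [cp for lo, hi in SETS[set_name] for cp in range(lo, hi + 1)]


def proposed_uppercase(cp):
    """True if the code point would become Uppercase=Yes under this proposal."""
    return any(
        lo <= cp <= hi
        for name in PROPOSED_UPPERCASE
        for lo, hi in SETS[name]
    )


def other_uppercase_lines():
    """Build approximate PropList.txt-style entries for the Set A ranges."""
    lines = []
    for lo, hi in SETS["A"]:
        # Names come from the local Python build's Unicode database.
        first = unicodedata.name(chr(lo), "<unnamed>")
        last = unicodedata.name(chr(hi), "<unnamed>")
        count = hi - lo + 1
        lines.append(
            f"{lo:04X}..{hi:04X} ; Other_Uppercase # [{count}] {first}..{last}"
        )
    return lines


if __name__ == "__main__":
    for line in other_uppercase_lines():
        print(line)
    for name in sorted(SETS):
        verdict = "Yes" if name in PROPOSED_UPPERCASE else "No"
        print(f"Set {name}: {len(code_points(name))} characters, Uppercase={verdict}")
```

The sketch deliberately does not call str.isupper(), since its result for these characters depends on the Unicode version bundled with the Python build in use.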