Mark Davis, 2005-01-19
As has been clear in successive drafts of UAX #31: Identifier and Pattern Syntax, the Pattern_White_Space and Pattern_Syntax characters will be immutable once they appear in Unicode 4.1.0. The derivation of the two properties is described in those drafts.
I've had a request to see the current contents of the properties contrasted with their starting points and other information, as of the current state of the UCD for 4.1.0. I generated this for information only; there is no proposal for changes.
The lists below can change in future versions of Unicode: the Pattern_White_Space and Pattern_Syntax characters will be immutable, but the other properties they are compared against may change. Both properties were originally formed by starting with a set of characters defined by properties and ranges, and then removing compatibility decomposables, modifiers, and script-specific characters. The ranges for the syntax characters were those that the UTC had originally supplied to the W3C as reserved for future symbols and/or punctuation. A few further changes were made on top of that as a result of subsequent discussions in UTC meetings.
The contents of the two properties are in http://unicode.org/Public/4.1.0/ucd/PropList-4.1.0d9.txt (or a later d version).
Comparisons
1. There is a certain overlap between Pattern_Syntax and ID_Continue, in connector punctuation:
005F # Pc LOW LINE
203F..2040 # Pc [2] UNDERTIE..CHARACTER TIE
2054 # Pc INVERTED UNDERTIE
2. There is a certain overlap between Alphabetic and Pattern_Syntax, in the characters we recently added to Alphabetic:
24B6..24E9 # So [52] CIRCLED LATIN CAPITAL LETTER A..CIRCLED LATIN SMALL LETTER Z
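The General_Category values of the connector-punctuation characters listed in item 1 can be spot-checked with a short sketch. It assumes Python's standard `unicodedata` module, whose data tracks a much later Unicode version than 4.1.0, so it only confirms the category column of that listing. Property memberships such as ID_Continue or Alphabetic are not exposed by that module and would still have to be read from the UCD files. The names `ITEM_1` and `general_categories` are made up for this illustration.

```python
import unicodedata

# Code points listed under item 1 (connector punctuation).
ITEM_1 = [0x005F, 0x203F, 0x2040, 0x2054]

def general_categories(code_points):
    """Map each code point to its General_Category in the running Python's UCD."""
    return {cp: unicodedata.category(chr(cp)) for cp in code_points}

# Print one line per code point in roughly the same shape as the listings.
for cp, gc in general_categories(ITEM_1).items():
    print(f"{cp:04X} # {gc} {unicodedata.name(chr(cp))}")
```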
3. Pattern_White_Space = (White_Space ∪ RLM ∪ LRM) minus the following (as of 4.1.0):
00A0 # Zs NO-BREAK SPACE
1680 # Zs OGHAM SPACE MARK
180E # Zs MONGOLIAN VOWEL SEPARATOR
2000..200A # Zs [11] EN QUAD..HAIR SPACE
202F # Zs NARROW NO-BREAK SPACE
205F # Zs MEDIUM MATHEMATICAL SPACE
3000 # Zs IDEOGRAPHIC SPACE
# Total code points: 17
4. The following are in both Pattern_White_Space and (White_Space ∪ RLM ∪ LRM):
0009..000D # Cc [5] <control-0009>..<control-000D>
0020 # Zs SPACE
0085 # Cc <control-0085>
200E..200F # Cf [2] LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK
2028 # Zl LINE SEPARATOR
2029 # Zp PARAGRAPH SEPARATOR
# Total code points: 11
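The set arithmetic in items 3 and 4 can be reproduced with a minimal sketch. The code point ranges are copied from the two listings above (4.1.0 data); the helper `expand` and the constant names are hypothetical, introduced only for this illustration.

```python
def expand(ranges):
    """Expand inclusive (first, last) pairs into a set of code points."""
    return {cp for first, last in ranges for cp in range(first, last + 1)}

# White_Space as implied by the listings in items 3 and 4 (without LRM/RLM).
WHITE_SPACE = expand([
    (0x0009, 0x000D), (0x0020, 0x0020), (0x0085, 0x0085), (0x00A0, 0x00A0),
    (0x1680, 0x1680), (0x180E, 0x180E), (0x2000, 0x200A), (0x2028, 0x2029),
    (0x202F, 0x202F), (0x205F, 0x205F), (0x3000, 0x3000),
])
LRM_RLM = {0x200E, 0x200F}

# Characters removed in item 3.
REMOVED = expand([
    (0x00A0, 0x00A0), (0x1680, 0x1680), (0x180E, 0x180E), (0x2000, 0x200A),
    (0x202F, 0x202F), (0x205F, 0x205F), (0x3000, 0x3000),
])

# Item 3: start from the union, then subtract the removed characters.
pattern_white_space = (WHITE_SPACE | LRM_RLM) - REMOVED

# Sizes of the removed set and of the result, then the result itself.
print(len(REMOVED), len(pattern_white_space))
print(" ".join(f"{cp:04X}" for cp in sorted(pattern_white_space)))
```

The two printed sizes can be compared directly against the "Total code points" lines of the listings.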
5. Pattern_Syntax = (GC:Symbol ∪ GC:Punctuation ∪ U+2190..U+2BFF ∪ U+2E00..U+2E7F) minus the following (as of 4.1.0)
00A8 # Sk DIAERESIS
00AF # Sk MACRON
00B4 # Sk ACUTE ACCENT
00B8 # Sk CEDILLA
02C2..02C5 # Sk [4] MODIFIER LETTER LEFT ARROWHEAD..MODIFIER LETTER DOWN ARROWHEAD
02D2..02DF # Sk [14] MODIFIER LETTER CENTRED RIGHT HALF RING..MODIFIER LETTER CROSS ACCENT
02E5..02ED # Sk [9] MODIFIER LETTER EXTRA-HIGH TONE BAR..MODIFIER LETTER UNASPIRATED
02EF..02FF # Sk [17] MODIFIER LETTER LOW DOWN ARROWHEAD..MODIFIER LETTER LOW LEFT ARROW
0374..0375 # Sk [2] GREEK NUMERAL SIGN..GREEK LOWER NUMERAL SIGN
037E # Po GREEK QUESTION MARK
0384..0385 # Sk [2] GREEK TONOS..GREEK DIALYTIKA TONOS
0387 # Po GREEK ANO TELEIA
03F6 # Sm GREEK REVERSED LUNATE EPSILON SYMBOL
0482 # So CYRILLIC THOUSANDS SIGN
055A..055F # Po [6] ARMENIAN APOSTROPHE..ARMENIAN ABBREVIATION MARK
0589 # Po ARMENIAN FULL STOP
058A # Pd ARMENIAN HYPHEN
05BE # Po HEBREW PUNCTUATION MAQAF
05C0 # Po HEBREW PUNCTUATION PASEQ
05C3 # Po HEBREW PUNCTUATION SOF PASUQ
05C6 # Po HEBREW PUNCTUATION NUN HAFUKHA
05F3..05F4 # Po [2] HEBREW PUNCTUATION GERESH..HEBREW PUNCTUATION GERSHAYIM
060B # Sc AFGHANI SIGN
060C..060D # Po [2] ARABIC COMMA..ARABIC DATE SEPARATOR
060E..060F # So [2] ARABIC POETIC VERSE SIGN..ARABIC SIGN MISRA
061B # Po ARABIC SEMICOLON
061E..061F # Po [2] ARABIC TRIPLE DOT PUNCTUATION MARK..ARABIC QUESTION MARK
066A..066D # Po [4] ARABIC PERCENT SIGN..ARABIC FIVE POINTED STAR
06D4 # Po ARABIC FULL STOP
06E9 # So ARABIC PLACE OF SAJDAH
06FD..06FE # So [2] ARABIC SIGN SINDHI AMPERSAND..ARABIC SIGN SINDHI POSTPOSITION MEN
0700..070D # Po [14] SYRIAC END OF PARAGRAPH..SYRIAC HARKLEAN ASTERISCUS
0964..0965 # Po [2] DEVANAGARI DANDA..DEVANAGARI DOUBLE DANDA
0970 # Po DEVANAGARI ABBREVIATION SIGN
09F2..09F3 # Sc [2] BENGALI RUPEE MARK..BENGALI RUPEE SIGN
09FA # So BENGALI ISSHAR
0AF1 # Sc GUJARATI RUPEE SIGN
0B70 # So ORIYA ISSHAR
0BF3..0BF8 # So [6] TAMIL DAY SIGN..TAMIL AS ABOVE SIGN
0BF9 # Sc TAMIL RUPEE SIGN
0BFA # So TAMIL NUMBER SIGN
0DF4 # Po SINHALA PUNCTUATION KUNDDALIYA
0E3F # Sc THAI CURRENCY SYMBOL BAHT
0E4F # Po THAI CHARACTER FONGMAN
0E5A..0E5B # Po [2] THAI CHARACTER ANGKHANKHU..THAI CHARACTER KHOMUT
0F01..0F03 # So [3] TIBETAN MARK GTER YIG MGO TRUNCATED A..TIBETAN MARK GTER YIG MGO -UM GTER TSHEG MA
0F04..0F12 # Po [15] TIBETAN MARK INITIAL YIG MGO MDUN MA..TIBETAN MARK RGYA GRAM SHAD
0F13..0F17 # So [5] TIBETAN MARK CARET -DZUD RTAGS ME LONG CAN..TIBETAN ASTROLOGICAL SIGN SGRA GCAN -CHAR RTAGS
0F1A..0F1F # So [6] TIBETAN SIGN RDEL DKAR GCIG..TIBETAN SIGN RDEL DKAR RDEL NAG
0F34 # So TIBETAN MARK BSDUS RTAGS
0F36 # So TIBETAN MARK CARET -DZUD RTAGS BZHI MIG CAN
0F38 # So TIBETAN MARK CHE MGO
0F3A # Ps TIBETAN MARK GUG RTAGS GYON
0F3B # Pe TIBETAN MARK GUG RTAGS GYAS
0F3C # Ps TIBETAN MARK ANG KHANG GYON
0F3D # Pe TIBETAN MARK ANG KHANG GYAS
0F85 # Po TIBETAN MARK PALUTA
0FBE..0FC5 # So [8] TIBETAN KU RU KHA..TIBETAN SYMBOL RDO RJE
0FC7..0FCC # So [6] TIBETAN SYMBOL RDO RJE RGYA GRAM..TIBETAN SYMBOL NOR BU BZHI -KHYIL
0FCF # So TIBETAN SIGN RDEL NAG GSUM
0FD0..0FD1 # Po [2] TIBETAN MARK BSKA- SHOG GI MGO RGYAN..TIBETAN MARK MNYAM YIG GI MGO RGYAN
104A..104F # Po [6] MYANMAR SIGN LITTLE SECTION..MYANMAR SYMBOL GENITIVE
10FB # Po GEORGIAN PARAGRAPH SEPARATOR
1360..1368 # Po [9] ETHIOPIC SECTION MARK..ETHIOPIC PARAGRAPH SEPARATOR
1390..1399 # So [10] ETHIOPIC TONAL MARK YIZET..ETHIOPIC TONAL MARK KURT
166D..166E # Po [2] CANADIAN SYLLABICS CHI SIGN..CANADIAN SYLLABICS FULL STOP
169B # Ps OGHAM FEATHER MARK
169C # Pe OGHAM REVERSED FEATHER MARK
16EB..16ED # Po [3] RUNIC SINGLE PUNCTUATION..RUNIC CROSS PUNCTUATION
1735..1736 # Po [2] PHILIPPINE SINGLE PUNCTUATION..PHILIPPINE DOUBLE PUNCTUATION
17D4..17D6 # Po [3] KHMER SIGN KHAN..KHMER SIGN CAMNUC PII KUUH
17D8..17DA # Po [3] KHMER SIGN BEYYAL..KHMER SIGN KOOMUUT
17DB # Sc KHMER CURRENCY SYMBOL RIEL
1800..1805 # Po [6] MONGOLIAN BIRGA..MONGOLIAN FOUR DOTS
1806 # Pd MONGOLIAN TODO SOFT HYPHEN
1807..180A # Po [4] MONGOLIAN SIBE SYLLABLE BOUNDARY MARKER..MONGOLIAN NIRUGU
1940 # So LIMBU SIGN LOO
1944..1945 # Po [2] LIMBU EXCLAMATION MARK..LIMBU QUESTION MARK
19DE..19FF # So [34] NEW TAI LUE SIGN LE..KHMER SYMBOL DAP-PRAM ROC
1A1E..1A1F # Po [2] BUGINESE PALLAWA..BUGINESE END OF SECTION
1FBD # Sk GREEK KORONIS
1FBF..1FC1 # Sk [3] GREEK PSILI..GREEK DIALYTIKA AND PERISPOMENI
1FCD..1FCF # Sk [3] GREEK PSILI AND VARIA..GREEK PSILI AND PERISPOMENI
1FDD..1FDF # Sk [3] GREEK DASIA AND VARIA..GREEK DASIA AND PERISPOMENI
1FED..1FEF # Sk [3] GREEK DIALYTIKA AND VARIA..GREEK VARIA
1FFD..1FFE # Sk [2] GREEK OXIA..GREEK DASIA
207A..207C # Sm [3] SUPERSCRIPT PLUS SIGN..SUPERSCRIPT EQUALS SIGN
207D # Ps SUPERSCRIPT LEFT PARENTHESIS
207E # Pe SUPERSCRIPT RIGHT PARENTHESIS
208A..208C # Sm [3] SUBSCRIPT PLUS SIGN..SUBSCRIPT EQUALS SIGN
208D # Ps SUBSCRIPT LEFT PARENTHESIS
208E # Pe SUBSCRIPT RIGHT PARENTHESIS
20A0..20B5 # Sc [22] EURO-CURRENCY SIGN..CEDI SIGN
2100..2101 # So [2] ACCOUNT OF..ADDRESSED TO THE SUBJECT
2103..2106 # So [4] DEGREE CELSIUS..CADA UNA
2108..2109 # So [2] SCRUPLE..DEGREE FAHRENHEIT
2114 # So L B BAR SYMBOL
2116..2118 # So [3] NUMERO SIGN..SCRIPT CAPITAL P
211E..2123 # So [6] PRESCRIPTION TAKE..VERSICLE
2125 # So OUNCE SIGN
2127 # So INVERTED OHM SIGN
2129 # So TURNED GREEK SMALL LETTER IOTA
212E # So ESTIMATED SYMBOL
2132 # So TURNED CAPITAL F
213A..213B # So [2] ROTATED CAPITAL Q..FACSIMILE SIGN
2140..2144 # Sm [5] DOUBLE-STRUCK N-ARY SUMMATION..TURNED SANS-SERIF CAPITAL Y
214A # So PROPERTY LINE
214B # Sm TURNED AMPERSAND
214C # So PER SIGN
2CE5..2CEA # So [6] COPTIC SYMBOL MI RO..COPTIC SYMBOL SHIMA SIMA
2CF9..2CFC # Po [4] COPTIC OLD NUBIAN FULL STOP..COPTIC OLD NUBIAN VERSE DIVIDER
2CFE..2CFF # Po [2] COPTIC FULL STOP..COPTIC MORPHOLOGICAL DIVIDER
2E80..2E99 # So [26] CJK RADICAL REPEAT..CJK RADICAL RAP
2E9B..2EF3 # So [89] CJK RADICAL CHOKE..CJK RADICAL C-SIMPLIFIED TURTLE
2F00..2FD5 # So [214] KANGXI RADICAL ONE..KANGXI RADICAL FLUTE
2FF0..2FFB # So [12] IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT..IDEOGRAPHIC DESCRIPTION CHARACTER OVERLAID
3004 # So JAPANESE INDUSTRIAL STANDARD SYMBOL
3036..3037 # So [2] CIRCLED POSTAL MARK..IDEOGRAPHIC TELEGRAPH LINE FEED SEPARATOR SYMBOL
303D # Po PART ALTERNATION MARK
303E..303F # So [2] IDEOGRAPHIC VARIATION INDICATOR..IDEOGRAPHIC HALF FILL SPACE
309B..309C # Sk [2] KATAKANA-HIRAGANA VOICED SOUND MARK..KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
30A0 # Pd KATAKANA-HIRAGANA DOUBLE HYPHEN
30FB # Po KATAKANA MIDDLE DOT
3190..3191 # So [2] IDEOGRAPHIC ANNOTATION LINKING MARK..IDEOGRAPHIC ANNOTATION REVERSE MARK
3196..319F # So [10] IDEOGRAPHIC ANNOTATION TOP MARK..IDEOGRAPHIC ANNOTATION MAN MARK
31C0..31CF # So [16] CJK BASIC STROKE T..CJK BASIC STROKE N
3200..321E # So [31] PARENTHESIZED HANGUL KIYEOK..PARENTHESIZED KOREAN CHARACTER O HU
322A..3243 # So [26] PARENTHESIZED IDEOGRAPH MOON..PARENTHESIZED IDEOGRAPH REACH
3250 # So PARTNERSHIP SIGN
3260..327F # So [32] CIRCLED HANGUL KIYEOK..KOREAN STANDARD SYMBOL
328A..32B0 # So [39] CIRCLED IDEOGRAPH MOON..CIRCLED IDEOGRAPH NIGHT
32C0..32FE # So [63] IDEOGRAPHIC TELEGRAPH SYMBOL FOR JANUARY..CIRCLED KATAKANA WO
3300..33FF # So [256] SQUARE APAATO..SQUARE GAL
4DC0..4DFF # So [64] HEXAGRAM FOR THE CREATIVE HEAVEN..HEXAGRAM FOR BEFORE COMPLETION
A490..A4C6 # So [55] YI RADICAL QOT..YI RADICAL KE
A700..A716 # Sk [23] MODIFIER LETTER CHINESE TONE YIN PING..MODIFIER LETTER EXTRA-LOW LEFT-STEM TONE BAR
A828..A82B # So [4] SYLOTI NAGRI POETRY MARK-1..SYLOTI NAGRI POETRY MARK-4
FB29 # Sm HEBREW LETTER ALTERNATIVE PLUS SIGN
FDFC # Sc RIAL SIGN
FDFD # So ARABIC LIGATURE BISMILLAH AR-RAHMAN AR-RAHEEM
FE10..FE16 # Po [7] PRESENTATION FORM FOR VERTICAL COMMA..PRESENTATION FORM FOR VERTICAL QUESTION MARK
FE17 # Ps PRESENTATION FORM FOR VERTICAL LEFT WHITE LENTICULAR BRACKET
FE18 # Pe PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET
FE19 # Po PRESENTATION FORM FOR VERTICAL HORIZONTAL ELLIPSIS
FE30 # Po PRESENTATION FORM FOR VERTICAL TWO DOT LEADER
FE31..FE32 # Pd [2] PRESENTATION FORM FOR VERTICAL EM DASH..PRESENTATION FORM FOR VERTICAL EN DASH
FE33..FE34 # Pc [2] PRESENTATION FORM FOR VERTICAL LOW LINE..PRESENTATION FORM FOR VERTICAL WAVY LOW LINE
FE35 # Ps PRESENTATION FORM FOR VERTICAL LEFT PARENTHESIS
FE36 # Pe PRESENTATION FORM FOR VERTICAL RIGHT PARENTHESIS
FE37 # Ps PRESENTATION FORM FOR VERTICAL LEFT CURLY BRACKET
FE38 # Pe PRESENTATION FORM FOR VERTICAL RIGHT CURLY BRACKET
FE39 # Ps PRESENTATION FORM FOR VERTICAL LEFT TORTOISE SHELL BRACKET
FE3A # Pe PRESENTATION FORM FOR VERTICAL RIGHT TORTOISE SHELL BRACKET
FE3B # Ps PRESENTATION FORM FOR VERTICAL LEFT BLACK LENTICULAR BRACKET
FE3C # Pe PRESENTATION FORM FOR VERTICAL RIGHT BLACK LENTICULAR BRACKET
FE3D # Ps PRESENTATION FORM FOR VERTICAL LEFT DOUBLE ANGLE BRACKET
FE3E # Pe PRESENTATION FORM FOR VERTICAL RIGHT DOUBLE ANGLE BRACKET
FE3F # Ps PRESENTATION FORM FOR VERTICAL LEFT ANGLE BRACKET
FE40 # Pe PRESENTATION FORM FOR VERTICAL RIGHT ANGLE BRACKET
FE41 # Ps PRESENTATION FORM FOR VERTICAL LEFT CORNER BRACKET
FE42 # Pe PRESENTATION FORM FOR VERTICAL RIGHT CORNER BRACKET
FE43 # Ps PRESENTATION FORM FOR VERTICAL LEFT WHITE CORNER BRACKET
FE44 # Pe PRESENTATION FORM FOR VERTICAL RIGHT WHITE CORNER BRACKET
FE47 # Ps PRESENTATION FORM FOR VERTICAL LEFT SQUARE BRACKET
FE48 # Pe PRESENTATION FORM FOR VERTICAL RIGHT SQUARE BRACKET
FE49..FE4C # Po [4] DASHED OVERLINE..DOUBLE WAVY OVERLINE
FE4D..FE4F # Pc [3] DASHED LOW LINE..WAVY LOW LINE
FE50..FE52 # Po [3] SMALL COMMA..SMALL FULL STOP
FE54..FE57 # Po [4] SMALL SEMICOLON..SMALL EXCLAMATION MARK
FE58 # Pd SMALL EM DASH
FE59 # Ps SMALL LEFT PARENTHESIS
FE5A # Pe SMALL RIGHT PARENTHESIS
FE5B # Ps SMALL LEFT CURLY BRACKET
FE5C # Pe SMALL RIGHT CURLY BRACKET
FE5D # Ps SMALL LEFT TORTOISE SHELL BRACKET
FE5E # Pe SMALL RIGHT TORTOISE SHELL BRACKET
FE5F..FE61 # Po [3] SMALL NUMBER SIGN..SMALL ASTERISK
FE62 # Sm SMALL PLUS SIGN
FE63 # Pd SMALL HYPHEN-MINUS
FE64..FE66 # Sm [3] SMALL LESS-THAN SIGN..SMALL EQUALS SIGN
FE68 # Po SMALL REVERSE SOLIDUS
FE69 # Sc SMALL DOLLAR SIGN
FE6A..FE6B # Po [2] SMALL PERCENT SIGN..SMALL COMMERCIAL AT
FF01..FF03 # Po [3] FULLWIDTH EXCLAMATION MARK..FULLWIDTH NUMBER SIGN
FF04 # Sc FULLWIDTH DOLLAR SIGN
FF05..FF07 # Po [3] FULLWIDTH PERCENT SIGN..FULLWIDTH APOSTROPHE
FF08 # Ps FULLWIDTH LEFT PARENTHESIS
FF09 # Pe FULLWIDTH RIGHT PARENTHESIS
FF0A # Po FULLWIDTH ASTERISK
FF0B # Sm FULLWIDTH PLUS SIGN
FF0C # Po FULLWIDTH COMMA
FF0D # Pd FULLWIDTH HYPHEN-MINUS
FF0E..FF0F # Po [2] FULLWIDTH FULL STOP..FULLWIDTH SOLIDUS
FF1A..FF1B # Po [2] FULLWIDTH COLON..FULLWIDTH SEMICOLON
FF1C..FF1E # Sm [3] FULLWIDTH LESS-THAN SIGN..FULLWIDTH GREATER-THAN SIGN
FF1F..FF20 # Po [2] FULLWIDTH QUESTION MARK..FULLWIDTH COMMERCIAL AT
FF3B # Ps FULLWIDTH LEFT SQUARE BRACKET
FF3C # Po FULLWIDTH REVERSE SOLIDUS
FF3D # Pe FULLWIDTH RIGHT SQUARE BRACKET
FF3E # Sk FULLWIDTH CIRCUMFLEX ACCENT
FF3F # Pc FULLWIDTH LOW LINE
FF40 # Sk FULLWIDTH GRAVE ACCENT
FF5B # Ps FULLWIDTH LEFT CURLY BRACKET
FF5C # Sm FULLWIDTH VERTICAL LINE
FF5D # Pe FULLWIDTH RIGHT CURLY BRACKET
FF5E # Sm FULLWIDTH TILDE
FF5F # Ps FULLWIDTH LEFT WHITE PARENTHESIS
FF60 # Pe FULLWIDTH RIGHT WHITE PARENTHESIS
FF61 # Po HALFWIDTH IDEOGRAPHIC FULL STOP
FF62 # Ps HALFWIDTH LEFT CORNER BRACKET
FF63 # Pe HALFWIDTH RIGHT CORNER BRACKET
FF64..FF65 # Po [2] HALFWIDTH IDEOGRAPHIC COMMA..HALFWIDTH KATAKANA MIDDLE DOT
FFE0..FFE1 # Sc [2] FULLWIDTH CENT SIGN..FULLWIDTH POUND SIGN
FFE2 # Sm FULLWIDTH NOT SIGN
FFE3 # Sk FULLWIDTH MACRON
FFE4 # So FULLWIDTH BROKEN BAR
FFE5..FFE6 # Sc [2] FULLWIDTH YEN SIGN..FULLWIDTH WON SIGN
FFE8 # So HALFWIDTH FORMS LIGHT VERTICAL
FFE9..FFEC # Sm [4] HALFWIDTH LEFTWARDS ARROW..HALFWIDTH DOWNWARDS ARROW
FFED..FFEE # So [2] HALFWIDTH BLACK SQUARE..HALFWIDTH WHITE CIRCLE
FFFC..FFFD # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHARACTER
10100..10101 # Po [2] AEGEAN WORD SEPARATOR LINE..AEGEAN WORD SEPARATOR DOT
10102 # So AEGEAN CHECK MARK
10137..1013F # So [9] AEGEAN WEIGHT BASE UNIT..AEGEAN MEASURE THIRD SUBUNIT
10179..10189 # So [17] GREEK YEAR SIGN..GREEK TRYBLION BASE SIGN
1039F # Po UGARITIC WORD DIVIDER
103D0 # So OLD PERSIAN WORD DIVIDER
10A50..10A58 # Po [9] KHAROSHTHI PUNCTUATION DOT..KHAROSHTHI PUNCTUATION LINES
1D000..1D0F5 # So [246] BYZANTINE MUSICAL SYMBOL PSILI..BYZANTINE MUSICAL SYMBOL GORGON NEO KATO
1D100..1D126 # So [39] MUSICAL SYMBOL SINGLE BARLINE..MUSICAL SYMBOL DRUM CLEF-2
1D12A..1D164 # So [59] MUSICAL SYMBOL DOUBLE SHARP..MUSICAL SYMBOL ONE HUNDRED TWENTY-EIGHTH NOTE
1D16A..1D16C # So [3] MUSICAL SYMBOL FINGERED TREMOLO-1..MUSICAL SYMBOL FINGERED TREMOLO-3
1D183..1D184 # So [2] MUSICAL SYMBOL ARPEGGIATO UP..MUSICAL SYMBOL ARPEGGIATO DOWN
1D18C..1D1A9 # So [30] MUSICAL SYMBOL RINFORZANDO..MUSICAL SYMBOL DEGREE SLASH
1D1AE..1D1DD # So [48] MUSICAL SYMBOL PEDAL MARK..MUSICAL SYMBOL PES SUBPUNCTIS
1D200..1D241 # So [66] GREEK VOCAL NOTATION SYMBOL-1..GREEK INSTRUMENTAL NOTATION SYMBOL-54
1D245 # So GREEK MUSICAL LEIMMA
1D300..1D356 # So [87] MONOGRAM FOR EARTH..TETRAGRAM FOR FOSTERING
1D6C1 # Sm MATHEMATICAL BOLD NABLA
1D6DB # Sm MATHEMATICAL BOLD PARTIAL DIFFERENTIAL
1D6FB # Sm MATHEMATICAL ITALIC NABLA
1D715 # Sm MATHEMATICAL ITALIC PARTIAL DIFFERENTIAL
1D735 # Sm MATHEMATICAL BOLD ITALIC NABLA
1D74F # Sm MATHEMATICAL BOLD ITALIC PARTIAL DIFFERENTIAL
1D76F # Sm MATHEMATICAL SANS-SERIF BOLD NABLA
1D789 # Sm MATHEMATICAL SANS-SERIF BOLD PARTIAL DIFFERENTIAL
1D7A9 # Sm MATHEMATICAL SANS-SERIF BOLD ITALIC NABLA
1D7C3 # Sm MATHEMATICAL SANS-SERIF BOLD ITALIC PARTIAL DIFFERENTIAL
# Total code points: 2087
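The shape of the derivation in item 5 can be sketched as a predicate: a character is a candidate if its General_Category is Symbol or Punctuation, or if it lies in one of the two reserved ranges, and it is then dropped if it appears in the exclusion list. This is only an illustration under stated assumptions. It uses Python's `unicodedata`, whose data is far newer than 4.1.0, and `EXCLUDED_EXCERPT` holds just a handful of entries copied from the listing above rather than the full list. All function and constant names are hypothetical.

```python
import unicodedata

# Extra ranges named in item 5 (reserved for future symbols and punctuation).
RESERVED_RANGES = [(0x2190, 0x2BFF), (0x2E00, 0x2E7F)]

# A small excerpt of the exclusions listed above; the full list is much longer.
EXCLUDED_EXCERPT = {0x00A8, 0x00AF, 0x00B4, 0x00B8, 0x037E, 0x0387, 0xFF0B}

def is_candidate(cp):
    """True if cp is GC:Symbol, GC:Punctuation, or inside a reserved range."""
    if any(first <= cp <= last for first, last in RESERVED_RANGES):
        return True  # unassigned code points in these ranges count too
    return unicodedata.category(chr(cp))[0] in ("S", "P")

def is_pattern_syntax_sketch(cp):
    """Candidate set minus the excerpted exclusions; illustration only."""
    return is_candidate(cp) and cp not in EXCLUDED_EXCERPT

for cp in (0x002B, 0x00A8, 0x0041, 0x2BFF, 0xFF0B):
    print(f"{cp:04X} {is_pattern_syntax_sketch(cp)}")
```

Because the property is meant to be immutable, a real implementation would use the fixed list from PropList rather than recompute it from categories that may change.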
6. The following are in both Pattern_Syntax and (GC:Symbol ∪ GC:Punctuation ∪ U+2190..U+2BFF ∪ U+2E00..U+2E7F):
0021..0023 # Po [3] EXCLAMATION MARK..NUMBER SIGN
0024 # Sc DOLLAR SIGN
0025..0027 # Po [3] PERCENT SIGN..APOSTROPHE
0028 # Ps LEFT PARENTHESIS
0029 # Pe RIGHT PARENTHESIS
002A # Po ASTERISK
002B # Sm PLUS SIGN
002C # Po COMMA
002D # Pd HYPHEN-MINUS
002E..002F # Po [2] FULL STOP..SOLIDUS
003A..003B # Po [2] COLON..SEMICOLON
003C..003E # Sm [3] LESS-THAN SIGN..GREATER-THAN SIGN
003F..0040 # Po [2] QUESTION MARK..COMMERCIAL AT
005B # Ps LEFT SQUARE BRACKET
005C # Po REVERSE SOLIDUS
005D # Pe RIGHT SQUARE BRACKET
005E # Sk CIRCUMFLEX ACCENT
005F # Pc LOW LINE
0060 # Sk GRAVE ACCENT
007B # Ps LEFT CURLY BRACKET
007C # Sm VERTICAL LINE
007D # Pe RIGHT CURLY BRACKET
007E # Sm TILDE
00A1 # Po INVERTED EXCLAMATION MARK
00A2..00A5 # Sc [4] CENT SIGN..YEN SIGN
00A6..00A7 # So [2] BROKEN BAR..SECTION SIGN
00A9 # So COPYRIGHT SIGN
00AB # Pi LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
00AC # Sm NOT SIGN
00AE # So REGISTERED SIGN
00B0 # So DEGREE SIGN
00B1 # Sm PLUS-MINUS SIGN
00B6 # So PILCROW SIGN
00B7 # Po MIDDLE DOT
00BB # Pf RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
00BF # Po INVERTED QUESTION MARK
00D7 # Sm MULTIPLICATION SIGN
00F7 # Sm DIVISION SIGN
2010..2015 # Pd [6] HYPHEN..HORIZONTAL BAR
2016..2017 # Po [2] DOUBLE VERTICAL LINE..DOUBLE LOW LINE
2018 # Pi LEFT SINGLE QUOTATION MARK
2019 # Pf RIGHT SINGLE QUOTATION MARK
201A # Ps SINGLE LOW-9 QUOTATION MARK
201B..201C # Pi [2] SINGLE HIGH-REVERSED-9 QUOTATION MARK..LEFT DOUBLE QUOTATION MARK
201D # Pf RIGHT DOUBLE QUOTATION MARK
201E # Ps DOUBLE LOW-9 QUOTATION MARK
201F # Pi DOUBLE HIGH-REVERSED-9 QUOTATION MARK
2020..2027 # Po [8] DAGGER..HYPHENATION POINT
2030..2038 # Po [9] PER MILLE SIGN..CARET
2039 # Pi SINGLE LEFT-POINTING ANGLE QUOTATION MARK
203A # Pf SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
203B..203E # Po [4] REFERENCE MARK..OVERLINE
203F..2040 # Pc [2] UNDERTIE..CHARACTER TIE
2041..2043 # Po [3] CARET INSERTION POINT..HYPHEN BULLET
2044 # Sm FRACTION SLASH
2045 # Ps LEFT SQUARE BRACKET WITH QUILL
2046 # Pe RIGHT SQUARE BRACKET WITH QUILL
2047..2051 # Po [11] DOUBLE QUESTION MARK..TWO ASTERISKS ALIGNED VERTICALLY
2052 # Sm COMMERCIAL MINUS SIGN
2053 # Po SWUNG DASH
2054 # Pc INVERTED UNDERTIE
2055..205E # Po [10] FLOWER PUNCTUATION MARK..VERTICAL FOUR DOTS
2190..2194 # Sm [5] LEFTWARDS ARROW..LEFT RIGHT ARROW
2195..2199 # So [5] UP DOWN ARROW..SOUTH WEST ARROW
219A..219B # Sm [2] LEFTWARDS ARROW WITH STROKE..RIGHTWARDS ARROW WITH STROKE
219C..219F # So [4] LEFTWARDS WAVE ARROW..UPWARDS TWO HEADED ARROW
21A0 # Sm RIGHTWARDS TWO HEADED ARROW
21A1..21A2 # So [2] DOWNWARDS TWO HEADED ARROW..LEFTWARDS ARROW WITH TAIL
21A3 # Sm RIGHTWARDS ARROW WITH TAIL
21A4..21A5 # So [2] LEFTWARDS ARROW FROM BAR..UPWARDS ARROW FROM BAR
21A6 # Sm RIGHTWARDS ARROW FROM BAR
21A7..21AD # So [7] DOWNWARDS ARROW FROM BAR..LEFT RIGHT WAVE ARROW
21AE # Sm LEFT RIGHT ARROW WITH STROKE
21AF..21CD # So [31] DOWNWARDS ZIGZAG ARROW..LEFTWARDS DOUBLE ARROW WITH STROKE
21CE..21CF # Sm [2] LEFT RIGHT DOUBLE ARROW WITH STROKE..RIGHTWARDS DOUBLE ARROW WITH STROKE
21D0..21D1 # So [2] LEFTWARDS DOUBLE ARROW..UPWARDS DOUBLE ARROW
21D2 # Sm RIGHTWARDS DOUBLE ARROW
21D3 # So DOWNWARDS DOUBLE ARROW
21D4 # Sm LEFT RIGHT DOUBLE ARROW
21D5..21F3 # So [31] UP DOWN DOUBLE ARROW..UP DOWN WHITE ARROW
21F4..22FF # Sm [268] RIGHT ARROW WITH SMALL CIRCLE..Z NOTATION BAG MEMBERSHIP
2300..2307 # So [8] DIAMETER SIGN..WAVY LINE
2308..230B # Sm [4] LEFT CEILING..RIGHT FLOOR
230C..231F # So [20] BOTTOM RIGHT CROP..BOTTOM RIGHT CORNER
2320..2321 # Sm [2] TOP HALF INTEGRAL..BOTTOM HALF INTEGRAL
2322..2328 # So [7] FROWN..KEYBOARD
2329 # Ps LEFT-POINTING ANGLE BRACKET
232A # Pe RIGHT-POINTING ANGLE BRACKET
232B..237B # So [81] ERASE TO THE LEFT..NOT CHECK MARK
237C # Sm RIGHT ANGLE WITH DOWNWARDS ZIGZAG ARROW
237D..239A # So [30] SHOULDERED OPEN BOX..CLEAR SCREEN SYMBOL
239B..23B3 # Sm [25] LEFT PARENTHESIS UPPER HOOK..SUMMATION BOTTOM
23B4 # Ps TOP SQUARE BRACKET
23B5 # Pe BOTTOM SQUARE BRACKET
23B6 # Po BOTTOM SQUARE BRACKET OVER TOP SQUARE BRACKET
23B7..23DB # So [37] RADICAL SYMBOL BOTTOM..FUSE
23DC..23FF # Cn [36] <reserved-23DC>..<reserved-23FF>
2400..2426 # So [39] SYMBOL FOR NULL..SYMBOL FOR SUBSTITUTE FORM TWO
2427..243F # Cn [25] <reserved-2427>..<reserved-243F>
2440..244A # So [11] OCR HOOK..OCR DOUBLE BACKSLASH
244B..245F # Cn [21] <reserved-244B>..<reserved-245F>
2460..249B # No [60] CIRCLED DIGIT ONE..NUMBER TWENTY FULL STOP
249C..24E9 # So [78] PARENTHESIZED LATIN SMALL LETTER A..CIRCLED LATIN SMALL LETTER Z
24EA..24FF # No [22] CIRCLED DIGIT ZERO..NEGATIVE CIRCLED DIGIT ZERO
2500..25B6 # So [183] BOX DRAWINGS LIGHT HORIZONTAL..BLACK RIGHT-POINTING TRIANGLE
25B7 # Sm WHITE RIGHT-POINTING TRIANGLE
25B8..25C0 # So [9] BLACK RIGHT-POINTING SMALL TRIANGLE..BLACK LEFT-POINTING TRIANGLE
25C1 # Sm WHITE LEFT-POINTING TRIANGLE
25C2..25F7 # So [54] BLACK LEFT-POINTING SMALL TRIANGLE..WHITE CIRCLE WITH UPPER RIGHT QUADRANT
25F8..25FF # Sm [8] UPPER LEFT TRIANGLE..LOWER RIGHT TRIANGLE
2600..266E # So [111] BLACK SUN WITH RAYS..MUSIC NATURAL SIGN
266F # Sm MUSIC SHARP SIGN
2670..269C # So [45] WEST SYRIAC CROSS..FLEUR-DE-LIS
269D..269F # Cn [3] <reserved-269D>..<reserved-269F>
26A0..26B1 # So [18] WARNING SIGN..FUNERAL URN
26B2..2700 # Cn [79] <reserved-26B2>..<reserved-2700>
2701..2704 # So [4] UPPER BLADE SCISSORS..WHITE SCISSORS
2705 # Cn <reserved-2705>
2706..2709 # So [4] TELEPHONE LOCATION SIGN..ENVELOPE
270A..270B # Cn [2] <reserved-270A>..<reserved-270B>
270C..2727 # So [28] VICTORY HAND..WHITE FOUR POINTED STAR
2728 # Cn <reserved-2728>
2729..274B # So [35] STRESS OUTLINED WHITE STAR..HEAVY EIGHT TEARDROP-SPOKED PROPELLER ASTERISK
274C # Cn <reserved-274C>
274D # So SHADOWED WHITE CIRCLE
274E # Cn <reserved-274E>
274F..2752 # So [4] LOWER RIGHT DROP-SHADOWED WHITE SQUARE..UPPER RIGHT SHADOWED WHITE SQUARE
2753..2755 # Cn [3] <reserved-2753>..<reserved-2755>
2756 # So BLACK DIAMOND MINUS WHITE X
2757 # Cn <reserved-2757>
2758..275E # So [7] LIGHT VERTICAL BAR..HEAVY DOUBLE COMMA QUOTATION MARK ORNAMENT
275F..2760 # Cn [2] <reserved-275F>..<reserved-2760>
2761..2767 # So [7] CURVED STEM PARAGRAPH SIGN ORNAMENT..ROTATED FLORAL HEART BULLET
2768 # Ps MEDIUM LEFT PARENTHESIS ORNAMENT
2769 # Pe MEDIUM RIGHT PARENTHESIS ORNAMENT
276A # Ps MEDIUM FLATTENED LEFT PARENTHESIS ORNAMENT
276B # Pe MEDIUM FLATTENED RIGHT PARENTHESIS ORNAMENT
276C # Ps MEDIUM LEFT-POINTING ANGLE BRACKET ORNAMENT
276D # Pe MEDIUM RIGHT-POINTING ANGLE BRACKET ORNAMENT
276E # Ps HEAVY LEFT-POINTING ANGLE QUOTATION MARK ORNAMENT
276F # Pe HEAVY RIGHT-POINTING ANGLE QUOTATION MARK ORNAMENT
2770 # Ps HEAVY LEFT-POINTING ANGLE BRACKET ORNAMENT
2771 # Pe HEAVY RIGHT-POINTING ANGLE BRACKET ORNAMENT
2772 # Ps LIGHT LEFT TORTOISE SHELL BRACKET ORNAMENT
2773 # Pe LIGHT RIGHT TORTOISE SHELL BRACKET ORNAMENT
2774 # Ps MEDIUM LEFT CURLY BRACKET ORNAMENT
2775 # Pe MEDIUM RIGHT CURLY BRACKET ORNAMENT
2776..2793 # No [30] DINGBAT NEGATIVE CIRCLED DIGIT ONE..DINGBAT NEGATIVE CIRCLED SANS-SERIF NUMBER TEN
2794 # So HEAVY WIDE-HEADED RIGHTWARDS ARROW
2795..2797 # Cn [3] <reserved-2795>..<reserved-2797>
2798..27AF # So [24] HEAVY SOUTH EAST ARROW..NOTCHED LOWER RIGHT-SHADOWED WHITE RIGHTWARDS ARROW
27B0 # Cn <reserved-27B0>
27B1..27BE # So [14] NOTCHED UPPER RIGHT-SHADOWED WHITE RIGHTWARDS ARROW..OPEN-OUTLINED RIGHTWARDS ARROW
27BF # Cn <reserved-27BF>
27C0..27C4 # Sm [5] THREE DIMENSIONAL ANGLE..OPEN SUPERSET
27C5 # Ps LEFT S-SHAPED BAG DELIMITER
27C6 # Pe RIGHT S-SHAPED BAG DELIMITER
27C7..27CF # Cn [9] <reserved-27C7>..<reserved-27CF>
27D0..27E5 # Sm [22] WHITE DIAMOND WITH CENTRED DOT..WHITE SQUARE WITH RIGHTWARDS TICK
27E6 # Ps MATHEMATICAL LEFT WHITE SQUARE BRACKET
27E7 # Pe MATHEMATICAL RIGHT WHITE SQUARE BRACKET
27E8 # Ps MATHEMATICAL LEFT ANGLE BRACKET
27E9 # Pe MATHEMATICAL RIGHT ANGLE BRACKET
27EA # Ps MATHEMATICAL LEFT DOUBLE ANGLE BRACKET
27EB # Pe MATHEMATICAL RIGHT DOUBLE ANGLE BRACKET
27EC..27EF # Cn [4] <reserved-27EC>..<reserved-27EF>
27F0..27FF # Sm [16] UPWARDS QUADRUPLE ARROW..LONG RIGHTWARDS SQUIGGLE ARROW
2800..28FF # So [256] BRAILLE PATTERN BLANK..BRAILLE PATTERN DOTS-12345678
2900..2982 # Sm [131] RIGHTWARDS TWO-HEADED ARROW WITH VERTICAL STROKE..Z NOTATION TYPE COLON
2983 # Ps LEFT WHITE CURLY BRACKET
2984 # Pe RIGHT WHITE CURLY BRACKET
2985 # Ps LEFT WHITE PARENTHESIS
2986 # Pe RIGHT WHITE PARENTHESIS
2987 # Ps Z NOTATION LEFT IMAGE BRACKET
2988 # Pe Z NOTATION RIGHT IMAGE BRACKET
2989 # Ps Z NOTATION LEFT BINDING BRACKET
298A # Pe Z NOTATION RIGHT BINDING BRACKET
298B # Ps LEFT SQUARE BRACKET WITH UNDERBAR
298C # Pe RIGHT SQUARE BRACKET WITH UNDERBAR
298D # Ps LEFT SQUARE BRACKET WITH TICK IN TOP CORNER
298E # Pe RIGHT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
298F # Ps LEFT SQUARE BRACKET WITH TICK IN BOTTOM CORNER
2990 # Pe RIGHT SQUARE BRACKET WITH TICK IN TOP CORNER
2991 # Ps LEFT ANGLE BRACKET WITH DOT
2992 # Pe RIGHT ANGLE BRACKET WITH DOT
2993 # Ps LEFT ARC LESS-THAN BRACKET
2994 # Pe RIGHT ARC GREATER-THAN BRACKET
2995 # Ps DOUBLE LEFT ARC GREATER-THAN BRACKET
2996 # Pe DOUBLE RIGHT ARC LESS-THAN BRACKET
2997 # Ps LEFT BLACK TORTOISE SHELL BRACKET
2998 # Pe RIGHT BLACK TORTOISE SHELL BRACKET
2999..29D7 # Sm [63] DOTTED FENCE..BLACK HOURGLASS
29D8 # Ps LEFT WIGGLY FENCE
29D9 # Pe RIGHT WIGGLY FENCE
29DA # Ps LEFT DOUBLE WIGGLY FENCE
29DB # Pe RIGHT DOUBLE WIGGLY FENCE
29DC..29FB # Sm [32] INCOMPLETE INFINITY..TRIPLE PLUS
29FC # Ps LEFT-POINTING CURVED ANGLE BRACKET
29FD # Pe RIGHT-POINTING CURVED ANGLE BRACKET
29FE..2AFF # Sm [258] TINY..N-ARY WHITE VERTICAL BAR
2B00..2B13 # So [20] NORTH EAST WHITE ARROW..SQUARE WITH BOTTOM HALF BLACK
2B14..2BFF # Cn [236] <reserved-2B14>..<reserved-2BFF>
2E00..2E01 # Po [2] RIGHT ANGLE SUBSTITUTION MARKER..RIGHT ANGLE DOTTED SUBSTITUTION MARKER
2E02 # Ps LEFT SUBSTITUTION BRACKET
2E03 # Pe RIGHT SUBSTITUTION BRACKET
2E04 # Ps LEFT DOTTED SUBSTITUTION BRACKET
2E05 # Pe RIGHT DOTTED SUBSTITUTION BRACKET
2E06..2E08 # Po [3] RAISED INTERPOLATION MARKER..DOTTED TRANSPOSITION MARKER
2E09 # Ps LEFT TRANSPOSITION BRACKET
2E0A # Pe RIGHT TRANSPOSITION BRACKET
2E0B # Po RAISED SQUARE
2E0C # Pi LEFT RAISED OMISSION BRACKET
2E0D # Pf RIGHT RAISED OMISSION BRACKET
2E0E..2E16 # Po [9] EDITORIAL CORONIS..DOTTED RIGHT-POINTING ANGLE
2E17 # Pd DOUBLE OBLIQUE HYPHEN
2E18..2E1B # Cn [4] <reserved-2E18>..<reserved-2E1B>
2E1C # Ps LEFT LOW PARAPHRASE BRACKET
2E1D # Pe RIGHT LOW PARAPHRASE BRACKET
2E1E..2E7F # Cn [98] <reserved-2E1E>..<reserved-2E7F>
3001..3003 # Po [3] IDEOGRAPHIC COMMA..DITTO MARK
3008 # Ps LEFT ANGLE BRACKET
3009 # Pe RIGHT ANGLE BRACKET
300A # Ps LEFT DOUBLE ANGLE BRACKET
300B # Pe RIGHT DOUBLE ANGLE BRACKET
300C # Ps LEFT CORNER BRACKET
300D # Pe RIGHT CORNER BRACKET
300E # Ps LEFT WHITE CORNER BRACKET
300F # Pe RIGHT WHITE CORNER BRACKET
3010 # Ps LEFT BLACK LENTICULAR BRACKET
3011 # Pe RIGHT BLACK LENTICULAR BRACKET
3012..3013 # So [2] POSTAL MARK..GETA MARK
3014 # Ps LEFT TORTOISE SHELL BRACKET
3015 # Pe RIGHT TORTOISE SHELL BRACKET
3016 # Ps LEFT WHITE LENTICULAR BRACKET
3017 # Pe RIGHT WHITE LENTICULAR BRACKET
3018 # Ps LEFT WHITE TORTOISE SHELL BRACKET
3019 # Pe RIGHT WHITE TORTOISE SHELL BRACKET
301A # Ps LEFT WHITE SQUARE BRACKET
301B # Pe RIGHT WHITE SQUARE BRACKET
301C # Pd WAVE DASH
301D # Ps REVERSED DOUBLE PRIME QUOTATION MARK
301E..301F # Pe [2] DOUBLE PRIME QUOTATION MARK..LOW DOUBLE PRIME QUOTATION MARK
3020 # So POSTAL MARK FACE
3030 # Pd WAVY DASH
FD3E # Ps ORNATE LEFT PARENTHESIS
FD3F # Pe ORNATE RIGHT PARENTHESIS
FE45..FE46 # Po [2] SESAME DOT..WHITE SESAME DOT
# Total code points: 2955
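The "Total code points" figures in these listings can be re-derived mechanically. The following is a minimal sketch, assuming each entry sits on its own line in the form used above (a code point or `first..last` range, `#`, a two-letter General_Category, an optional bracketed count, and a name). The regular expression `ENTRY` and the function `count_code_points` are hypothetical helpers, not part of any Unicode tooling.

```python
import re

# One listing entry: "XXXX" or "XXXX..YYYY", "# Gc", optional "[n]", then a name.
ENTRY = re.compile(
    r"^(?P<first>[0-9A-F]{4,6})(?:\.\.(?P<last>[0-9A-F]{4,6}))?"
    r"\s+#\s+(?P<gc>[A-Z][a-z])\s+(?:\[(?P<count>\d+)\]\s+)?(?P<name>.+)$"
)

def count_code_points(listing):
    """Sum code points over entry lines, checking each bracketed count."""
    total = 0
    for line in listing.splitlines():
        m = ENTRY.match(line.strip())
        if not m:
            continue  # skip blank lines and '# Total code points' lines
        first = int(m["first"], 16)
        last = int(m["last"], 16) if m["last"] else first
        size = last - first + 1
        if m["count"] and int(m["count"]) != size:
            raise ValueError(f"bad count on line: {line!r}")
        total += size
    return total

# The listing from item 3, used here as sample input.
SAMPLE = """\
00A0 # Zs NO-BREAK SPACE
1680 # Zs OGHAM SPACE MARK
180E # Zs MONGOLIAN VOWEL SEPARATOR
2000..200A # Zs [11] EN QUAD..HAIR SPACE
202F # Zs NARROW NO-BREAK SPACE
205F # Zs MEDIUM MATHEMATICAL SPACE
3000 # Zs IDEOGRAPHIC SPACE
# Total code points: 17
"""
print(count_code_points(SAMPLE))
```

Run against the item 3 listing, the computed sum agrees with the stated total of 17; the same function can be pointed at the longer listings to cross-check their totals.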