Unicode Utilities: Character Property Index


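Category | Datatype | Source | Property | Values
Bidirectional | Binary | UCD | Bidi_Control | No (N), Yes (Y)
Bidirectional | Binary | UCD | Bidi_Mirrored | No (N), Yes (Y)
Bidirectional | Enumerated | UCD | Bidi_Class | Show Values
Bidirectional | Enumerated | UCD | Bidi_Paired_Bracket_Type | Close, None, Open
Bidirectional | String | UCD | Bidi_Mirroring_Glyph | Show Values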
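Case | Binary | UCD | Case_Ignorable | No (N), Yes (Y)
Case | Binary | UCD | Cased | No (N), Yes (Y)
Case | Binary | UCD | Changes_When_Casefolded | No (N), Yes (Y)
Case | Binary | UCD | Changes_When_Casemapped | No (N), Yes (Y)
Case | Binary | UCD | Changes_When_Lowercased | No (N), Yes (Y)
Case | Binary | UCD | Changes_When_Titlecased | No (N), Yes (Y)
Case | Binary | UCD | Changes_When_Uppercased | No (N), Yes (Y)
Case | Binary | UCD | Lowercase | No (N), Yes (Y)
Case | Binary | UCD | Soft_Dotted | No (N), Yes (Y)
Case | Binary | UCD | Uppercase | No (N), Yes (Y)
Case | Binary | Unicode | isCased | No (N), Yes (Y)
Case | Binary | Unicode | isCasefolded | No (N), Yes (Y)
Case | Binary | Unicode | isLowercase | No (N), Yes (Y)
Case | Binary | Unicode | isTitlecase | No (N), Yes (Y)
Case | Binary | Unicode | isUppercase | No (N), Yes (Y)
Case | Binary | X-ICU | Case_Sensitive | No (N), Yes (Y)
Case | String | UCD | Case_Folding | Show Values
Case | String | UCD | Lowercase_Mapping | Show Values
Case | String | UCD | Simple_Case_Folding | Show Values
Case | String | UCD | Simple_Lowercase_Mapping | Show Values
Case | String | UCD | Simple_Titlecase_Mapping | Show Values
Case | String | UCD | Simple_Uppercase_Mapping | Show Values
Case | String | UCD | Titlecase_Mapping | Show Values
Case | String | UCD | Uppercase_Mapping | Show Values
Case | String | Unicode | toCasefold | Show Values
Case | String | Unicode | toLowercase | Show Values
Case | String | Unicode | toTitlecase | Show Values
Case | String | Unicode | toUppercase | Show Values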
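CJK | Binary | UCD | IDS_Binary_Operator | No (N), Yes (Y)
CJK | Binary | UCD | IDS_Trinary_Operator | No (N), Yes (Y)
CJK | Binary | UCD | Ideographic | No (N), Yes (Y)
CJK | Binary | UCD | Radical | No (N), Yes (Y)
CJK | Binary | UCD | Unified_Ideograph | No (N), Yes (Y)
CJK | Enumerated | X-Demo | HanType | Han, Hans, Hant, na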
General | Binary | UCD | Alphabetic | No (N), Yes (Y)
General | Binary | UCD | Default_Ignorable_Code_Point | No (N), Yes (Y)
General | Binary | UCD | Deprecated | No (N), Yes (Y)
General | Binary | UCD | Logical_Order_Exception | No (N), Yes (Y)
General | Binary | UCD | Noncharacter_Code_Point | No (N), Yes (Y)
General | Binary | UCD | Variation_Selector | No (N), Yes (Y)
General | Binary | UCD | White_Space | No (N), Yes (Y)
General | Catalog | UCD | Age | Show Values
General | Catalog | UCD | Block | Aegean_Numbers (Aegean_Numbers), Ahom (Ahom), Alchemical_Symbols (Alchemical), Alphabetic_Presentation_Forms (Alphabetic_PF), Anatolian_Hieroglyphs (Anatolian_Hieroglyphs), Ancient_Greek_Musical_Notation (Ancient_Greek_Music), Ancient_Greek_Numbers (Ancient_Greek_Numbers), Ancient_Symbols (Ancient_Symbols), Arabic (Arabic), Arabic_Extended_A (Arabic_Ext_A), Arabic_Mathematical_Alphabetic_Symbols (Arabic_Math), Arabic_Presentation_Forms_A (Arabic_PF_A), Arabic_Presentation_Forms_B (Arabic_PF_B), Arabic_Supplement (Arabic_Sup), Armenian (Armenian), Arrows (Arrows), Avestan (Avestan),
Balinese (Balinese), Bamum (Bamum), Bamum_Supplement (Bamum_Sup), Basic_Latin (ASCII), Bassa_Vah (Bassa_Vah), Batak (Batak), Bengali (Bengali), Block_Elements (Block_Elements), Bopomofo (Bopomofo), Bopomofo_Extended (Bopomofo_Ext), Box_Drawing (Box_Drawing), Brahmi (Brahmi), Braille_Patterns (Braille), Buginese (Buginese), Buhid (Buhid), Byzantine_Musical_Symbols (Byzantine_Music),
Carian (Carian), Caucasian_Albanian (Caucasian_Albanian), Chakma (Chakma), Cham (Cham), Cherokee (Cherokee), Cherokee_Supplement (Cherokee_Sup), CJK_Compatibility (CJK_Compat), CJK_Compatibility_Forms (CJK_Compat_Forms), CJK_Compatibility_Ideographs (CJK_Compat_Ideographs), CJK_Compatibility_Ideographs_Supplement (CJK_Compat_Ideographs_Sup), CJK_Radicals_Supplement (CJK_Radicals_Sup), CJK_Strokes (CJK_Strokes), CJK_Symbols_And_Punctuation (CJK_Symbols), CJK_Unified_Ideographs (CJK), CJK_Unified_Ideographs_Extension_A (CJK_Ext_A), CJK_Unified_Ideographs_Extension_B (CJK_Ext_B), CJK_Unified_Ideographs_Extension_C (CJK_Ext_C), CJK_Unified_Ideographs_Extension_D (CJK_Ext_D), CJK_Unified_Ideographs_Extension_E (CJK_Ext_E), Combining_Diacritical_Marks (Diacriticals), Combining_Diacritical_Marks_Extended (Diacriticals_Ext), Combining_Diacritical_Marks_For_Symbols (Diacriticals_For_Symbols), Combining_Diacritical_Marks_Supplement (Diacriticals_Sup), Combining_Half_Marks (Half_Marks), Common_Indic_Number_Forms (Indic_Number_Forms), Control_Pictures (Control_Pictures), Coptic (Coptic), Coptic_Epact_Numbers (Coptic_Epact_Numbers), Counting_Rod_Numerals (Counting_Rod), Cuneiform (Cuneiform), Cuneiform_Numbers_And_Punctuation (Cuneiform_Numbers), Currency_Symbols (Currency_Symbols), Cypriot_Syllabary (Cypriot_Syllabary), Cyrillic (Cyrillic), Cyrillic_Extended_A (Cyrillic_Ext_A), Cyrillic_Extended_B (Cyrillic_Ext_B), Cyrillic_Supplement (Cyrillic_Sup),
Deseret (Deseret), Devanagari (Devanagari), Devanagari_Extended (Devanagari_Ext), Dingbats (Dingbats), Domino_Tiles (Domino), Duployan (Duployan),
Early_Dynastic_Cuneiform (Early_Dynastic_Cuneiform), Egyptian_Hieroglyphs (Egyptian_Hieroglyphs), Elbasan (Elbasan), Emoticons (Emoticons), Enclosed_Alphanumeric_Supplement (Enclosed_Alphanum_Sup), Enclosed_Alphanumerics (Enclosed_Alphanum), Enclosed_CJK_Letters_And_Months (Enclosed_CJK), Enclosed_Ideographic_Supplement (Enclosed_Ideographic_Sup), Ethiopic (Ethiopic), Ethiopic_Extended (Ethiopic_Ext), Ethiopic_Extended_A (Ethiopic_Ext_A), Ethiopic_Supplement (Ethiopic_Sup),
General_Punctuation (Punctuation), Geometric_Shapes (Geometric_Shapes), Geometric_Shapes_Extended (Geometric_Shapes_Ext), Georgian (Georgian), Georgian_Supplement (Georgian_Sup), Glagolitic (Glagolitic), Gothic (Gothic), Grantha (Grantha), Greek_And_Coptic (Greek), Greek_Extended (Greek_Ext), Gujarati (Gujarati), Gurmukhi (Gurmukhi),
Halfwidth_And_Fullwidth_Forms (Half_And_Full_Forms), Hangul_Compatibility_Jamo (Compat_Jamo), Hangul_Jamo (Jamo), Hangul_Jamo_Extended_A (Jamo_Ext_A), Hangul_Jamo_Extended_B (Jamo_Ext_B), Hangul_Syllables (Hangul), Hanunoo (Hanunoo), Hatran (Hatran), Hebrew (Hebrew), High_Private_Use_Surrogates (High_PU_Surrogates), High_Surrogates (High_Surrogates), Hiragana (Hiragana),
Ideographic_Description_Characters (IDC), Imperial_Aramaic (Imperial_Aramaic), Inscriptional_Pahlavi (Inscriptional_Pahlavi), Inscriptional_Parthian (Inscriptional_Parthian), IPA_Extensions (IPA_Ext),
Javanese (Javanese),
Kaithi (Kaithi), Kana_Supplement (Kana_Sup), Kanbun (Kanbun), Kangxi_Radicals (Kangxi), Kannada (Kannada), Katakana (Katakana), Katakana_Phonetic_Extensions (Katakana_Ext), Kayah_Li (Kayah_Li), Kharoshthi (Kharoshthi), Khmer (Khmer), Khmer_Symbols (Khmer_Symbols), Khojki (Khojki), Khudawadi (Khudawadi),
Lao (Lao), Latin_1_Supplement (Latin_1_Sup), Latin_Extended_A (Latin_Ext_A), Latin_Extended_Additional (Latin_Ext_Additional), Latin_Extended_B (Latin_Ext_B), Latin_Extended_C (Latin_Ext_C), Latin_Extended_D (Latin_Ext_D), Latin_Extended_E (Latin_Ext_E), Lepcha (Lepcha), Letterlike_Symbols (Letterlike_Symbols), Limbu (Limbu), Linear_A (Linear_A), Linear_B_Ideograms (Linear_B_Ideograms), Linear_B_Syllabary (Linear_B_Syllabary), Lisu (Lisu), Low_Surrogates (Low_Surrogates), Lycian (Lycian), Lydian (Lydian),
Mahajani (Mahajani), Mahjong_Tiles (Mahjong), Malayalam (Malayalam), Mandaic (Mandaic), Manichaean (Manichaean), Mathematical_Alphanumeric_Symbols (Math_Alphanum), Mathematical_Operators (Math_Operators), Meetei_Mayek (Meetei_Mayek), Meetei_Mayek_Extensions (Meetei_Mayek_Ext), Mende_Kikakui (Mende_Kikakui), Meroitic_Cursive (Meroitic_Cursive), Meroitic_Hieroglyphs (Meroitic_Hieroglyphs), Miao (Miao), Miscellaneous_Mathematical_Symbols_A (Misc_Math_Symbols_A), Miscellaneous_Mathematical_Symbols_B (Misc_Math_Symbols_B), Miscellaneous_Symbols (Misc_Symbols), Miscellaneous_Symbols_And_Arrows (Misc_Arrows), Miscellaneous_Symbols_And_Pictographs (Misc_Pictographs), Miscellaneous_Technical (Misc_Technical), Modi (Modi), Modifier_Tone_Letters (Modifier_Tone_Letters), Mongolian (Mongolian), Mro (Mro), Multani (Multani), Musical_Symbols (Music), Myanmar (Myanmar), Myanmar_Extended_A (Myanmar_Ext_A), Myanmar_Extended_B (Myanmar_Ext_B),
Nabataean (Nabataean), New_Tai_Lue (New_Tai_Lue), NKo (NKo), No_Block (NB), Number_Forms (Number_Forms),
Ogham (Ogham), Ol_Chiki (Ol_Chiki), Old_Hungarian (Old_Hungarian), Old_Italic (Old_Italic), Old_North_Arabian (Old_North_Arabian), Old_Permic (Old_Permic), Old_Persian (Old_Persian), Old_South_Arabian (Old_South_Arabian), Old_Turkic (Old_Turkic), Optical_Character_Recognition (OCR), Oriya (Oriya), Ornamental_Dingbats (Ornamental_Dingbats), Osmanya (Osmanya),
Pahawh_Hmong (Pahawh_Hmong), Palmyrene (Palmyrene), Pau_Cin_Hau (Pau_Cin_Hau), Phags_Pa (Phags_Pa), Phaistos_Disc (Phaistos), Phoenician (Phoenician), Phonetic_Extensions (Phonetic_Ext), Phonetic_Extensions_Supplement (Phonetic_Ext_Sup), Playing_Cards (Playing_Cards), Private_Use_Area (PUA), Psalter_Pahlavi (Psalter_Pahlavi),
Rejang (Rejang), Rumi_Numeral_Symbols (Rumi), Runic (Runic),
Samaritan (Samaritan), Saurashtra (Saurashtra), Sharada (Sharada), Shavian (Shavian), Shorthand_Format_Controls (Shorthand_Format_Controls), Siddham (Siddham), Sinhala (Sinhala), Sinhala_Archaic_Numbers (Sinhala_Archaic_Numbers), Small_Form_Variants (Small_Forms), Sora_Sompeng (Sora_Sompeng), Spacing_Modifier_Letters (Modifier_Letters), Specials (Specials), Sundanese (Sundanese), Sundanese_Supplement (Sundanese_Sup), Superscripts_And_Subscripts (Super_And_Sub), Supplemental_Arrows_A (Sup_Arrows_A), Supplemental_Arrows_B (Sup_Arrows_B), Supplemental_Arrows_C (Sup_Arrows_C), Supplemental_Mathematical_Operators (Sup_Math_Operators), Supplemental_Punctuation (Sup_Punctuation), Supplemental_Symbols_And_Pictographs (Sup_Symbols_And_Pictographs), Supplementary_Private_Use_Area_A (Sup_PUA_A), Supplementary_Private_Use_Area_B (Sup_PUA_B), Sutton_Sign_Writing (Sutton_Sign_Writing), Syloti_Nagri (Syloti_Nagri), Syriac (Syriac),
Tagalog (Tagalog), Tagbanwa (Tagbanwa), Tags (Tags), Tai_Le (Tai_Le), Tai_Tham (Tai_Tham), Tai_Viet (Tai_Viet), Tai_Xuan_Jing_Symbols (Tai_Xuan_Jing), Takri (Takri), Tamil (Tamil), Telugu (Telugu), Thaana (Thaana), Thai (Thai), Tibetan (Tibetan), Tifinagh (Tifinagh), Tirhuta (Tirhuta), Transport_And_Map_Symbols (Transport_And_Map),
Ugaritic (Ugaritic), Unified_Canadian_Aboriginal_Syllabics (UCAS), Unified_Canadian_Aboriginal_Syllabics_Extended (UCAS_Ext),
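Vai (Vai), ... (too many values to show)
General | Catalog | UCD | Script | Show Values
General | Enumerated | UCD | General_Category | Show Values
General | Enumerated | UCD | Hangul_Syllable_Type | Leading_Jamo (L), LV_Syllable (LV), LVT_Syllable (LVT), Not_Applicable (NA), Trailing_Jamo (T), Vowel_Jamo (V)
General | String | UCD | Name | Show Values
General | String | UCD | Script_Extensions | Show Values
General | String | UCD | subhead | Show Values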
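Identifiers | Binary | UCD | ID_Continue | No (N), Yes (Y)
Identifiers | Binary | UCD | ID_Start | No (N), Yes (Y)
Identifiers | Binary | UCD | Pattern_Syntax | No (N), Yes (Y)
Identifiers | Binary | UCD | Pattern_White_Space | No (N), Yes (Y)
Identifiers | Binary | UCD | XID_Continue | No (N), Yes (Y)
Identifiers | Binary | UCD | XID_Start | No (N), Yes (Y)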
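Miscellaneous | Binary | UCD | Dash | No (N), Yes (Y)
Miscellaneous | Binary | UCD | Diacritic | No (N), Yes (Y)
Miscellaneous | Binary | UCD | Extender | No (N), Yes (Y)
Miscellaneous | Binary | UCD | Grapheme_Base | No (N), Yes (Y)
Miscellaneous | Binary | UCD | Grapheme_Extend | No (N), Yes (Y)
Miscellaneous | Binary | UCD | Grapheme_Link | No (N), Yes (Y)
Miscellaneous | Binary | UCD | Hyphen | No (N), Yes (Y)
Miscellaneous | Binary | UCD | Math | No (N), Yes (Y)
Miscellaneous | Binary | UCD | Quotation_Mark | No (N), Yes (Y)
Miscellaneous | Binary | UCD | STerm | No (N), Yes (Y)
Miscellaneous | Binary | UCD | Terminal_Punctuation | No (N), Yes (Y)
Miscellaneous | Miscellaneous | UCD | ISO_Comment | Show Values
Miscellaneous | Miscellaneous | UCD | Unicode_1_Name | Show Values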
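Normalization | Binary | UCD | Changes_When_NFKC_Casefolded | No (N), Yes (Y)
Normalization | Binary | UCD | Full_Composition_Exclusion | No (N), Yes (Y)
Normalization | Binary | Unicode | isNFC | No, Yes
Normalization | Binary | Unicode | isNFD | No, Yes
Normalization | Binary | Unicode | isNFKC | No, Yes
Normalization | Binary | Unicode | isNFKD | No, Yes
Normalization | Binary | X-ICU | NFC_Inert | No (N), Yes (Y)
Normalization | Binary | X-ICU | NFD_Inert | No (N), Yes (Y)
Normalization | Binary | X-ICU | NFKC_Inert | No (N), Yes (Y)
Normalization | Binary | X-ICU | NFKD_Inert | No (N), Yes (Y)
Normalization | Enumerated | UCD | Canonical_Combining_Class | Show Values
Normalization | Enumerated | UCD | Decomposition_Type | Show Values
Normalization | Enumerated | UCD | NFC_Quick_Check | Maybe (M), No (N), Yes (Y)
Normalization | Enumerated | UCD | NFD_Quick_Check | No (N), Yes (Y)
Normalization | Enumerated | UCD | NFKC_Quick_Check | Maybe (M), No (N), Yes (Y)
Normalization | Enumerated | UCD | NFKD_Quick_Check | No (N), Yes (Y)
Normalization | Enumerated | X-ICU | Lead_Canonical_Combining_Class | Show Values
Normalization | Enumerated | X-ICU | Trail_Canonical_Combining_Class | Show Values
Normalization | String | UCD | NFKC_Casefold | Show Values
Normalization | String | Unicode | toNfc | Show Values
Normalization | String | Unicode | toNfd | Show Values
Normalization | String | Unicode | toNfkc | Show Values
Normalization | String | Unicode | toNfkd | Show Values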
Numeric | Binary | UCD | ASCII_Hex_Digit | No (N), Yes (Y)
Numeric | Binary | UCD | Hex_Digit | No (N), Yes (Y)
Numeric | Enumerated | UCD | Numeric_Type | Decimal (De), Digit (Di), None (None), Numeric (Nu)
Numeric | Numeric | UCD | Numeric_Value | Show Values
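Shaping and Rendering | Binary | UCD | Join_Control | No (N), Yes (Y)
Shaping and Rendering | Binary | X-ICU | Segment_Starter | No (N), Yes (Y)
Shaping and Rendering | Enumerated | UCD | East_Asian_Width | Ambiguous (A), Fullwidth (F), Halfwidth (H), Narrow (Na), Neutral (N), Wide (W)
Shaping and Rendering | Enumerated | UCD | Grapheme_Cluster_Break | Show Values
Shaping and Rendering | Enumerated | UCD | Joining_Group | Show Values
Shaping and Rendering | Enumerated | UCD | Joining_Type | Dual_Joining (D), Join_Causing (C), Left_Joining (L), Non_Joining (U), Right_Joining (R), Transparent (T)
Shaping and Rendering | Enumerated | UCD | Line_Break | Show Values
Shaping and Rendering | Enumerated | UCD | Sentence_Break | Show Values
Shaping and Rendering | Enumerated | UCD | Word_Break | Show Values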
X-Emoji | Enumerated | UTR | emoji | face, flag, group, keycap, modifier, no, other, primary, secondary
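X-IDNA | Binary | UTS | idna2008 | CONTEXTJ, CONTEXTO, DISALLOWED, PVALID, UNASSIGNED
X-IDNA | Enumerated | UTS | idna2003 | deviation, disallowed, ignored, mapped, valid
X-IDNA | Enumerated | UTS | idna2008c | deviation, disallowed, ignored, mapped, valid
X-IDNA | Enumerated | UTS | uts46 | deviation, disallowed, ignored, mapped, valid
X-IDNA | String | UTS | toIdna2003 | Show Values
X-IDNA | String | UTS | toUts46n | Show Values
X-IDNA | String | UTS | toUts46t | Show Values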
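X-Regex | Binary | UTS | ANY | No, Yes
X-Regex | Binary | UTS | ASCII | No, Yes
X-Regex | Binary | UTS | alnum | No (N), Yes (Y)
X-Regex | Binary | UTS | blank | No (N), Yes (Y)
X-Regex | Binary | UTS | bmp | No, Yes
X-Regex | Binary | UTS | graph | No (N), Yes (Y)
X-Regex | Binary | UTS | print | No (N), Yes (Y)
X-Regex | Binary | UTS | xdigit | No (N), Yes (Y)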
X-Security | Enumerated | UTS | Identifier_Status | Allowed (Allowed), Restricted (Restricted)
X-Security | Enumerated | UTS | Identifier_Type | Show Values
X-Security | Enumerated | UTS | confusable | Show Values
X-UCA | Binary | UTS | uca | Show Values
X-UCA | Binary | UTS | uca2 | Show Values
X-UCA | Binary | UTS | uca2.5 | Show Values
X-UCA | Binary | UTS | uca3 | Show Values
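
As a rough illustration of the Normalization rows whose source is Unicode (isNFC through isNFKD, and toNfc through toNfkd), the Python sketch below approximates them with the standard `unicodedata` module. The helper names `to_forms` and `is_forms` are hypothetical and are not part of this utility; results depend on the Unicode version bundled with the Python build, which may differ from the version reported by this page.

```python
import unicodedata

# The four normalization forms named in the table.
FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def to_forms(text: str) -> dict:
    """Rough analogue of toNfc, toNfd, toNfkc, toNfkd (hypothetical helper)."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}

def is_forms(text: str) -> dict:
    """Rough analogue of isNFC, isNFD, isNFKC, isNFKD (hypothetical helper).

    A string counts as normalized in a form when normalizing it
    leaves it unchanged.
    """
    return {form: unicodedata.normalize(form, text) == text for form in FORMS}
```

For example, `to_forms("\u00e9")["NFD"]` is the two-code-point sequence `"e\u0301"`, and `is_forms("\ufb01")` reports the ligature U+FB01 as already in NFC and NFD but not in NFKC or NFKD.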

Key

The Categories are from UCD Table 8. Property Summary Table, with some extended categories: X-Encoding, X-IDNA, X-Regex, and X-Security.

The Datatypes are from UCD Table 5. Property Type Key.
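
To make the datatypes concrete, here is a minimal Python sketch that reads one or more properties of each kind (Binary, Enumerated, Numeric, String) for a single character. It assumes only the standard `unicodedata` module, which covers a small subset of the properties listed above; the function name `describe` is hypothetical, and values come from the Unicode version bundled with Python rather than from this utility.

```python
import unicodedata

def describe(ch: str) -> dict:
    """Collect a few properties of a single character, grouped by datatype."""
    return {
        # Binary: Bidi_Mirrored (unicodedata returns 1 or 0)
        "Bidi_Mirrored": bool(unicodedata.mirrored(ch)),
        # Enumerated: reported as short aliases or small integers
        "General_Category": unicodedata.category(ch),
        "Bidi_Class": unicodedata.bidirectional(ch),
        "East_Asian_Width": unicodedata.east_asian_width(ch),
        "Canonical_Combining_Class": unicodedata.combining(ch),
        # Numeric: Numeric_Value, or None when the character has none
        "Numeric_Value": unicodedata.numeric(ch, None),
        # String: Name, or an empty string when the character is unnamed
        "Name": unicodedata.name(ch, ""),
    }
```

Calling `describe("(")` reports a mirrored character with General_Category `Ps` and the name `LEFT PARENTHESIS`, while `describe("\u00bd")["Numeric_Value"]` is `0.5`.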

The Sources that appear in the table are: UCD, Unicode, X-ICU, X-Demo, UTR, and UTS.

Fonts and Display. If you don't have a good set of Unicode fonts (and a modern browser), you may not be able to read some of the characters. Some suggested fonts that you can add for coverage are: Unicode Fonts for Ancient Scripts; Noto Fonts site; Large, multi-script Unicode fonts. See also: Unicode Display Problems.

Version 3.7; ICU version: 56.0.1.0; Unicode version: 8.0.0.0