This set of charts shows the Unicode Collation Algorithm values for Unicode characters. The characters are arranged in the following groups:
Group | Description |
---|---|
Null | Completely ignorable (at the primary, secondary, and tertiary levels). These include control codes and various formatting codes. |
Ignorable | Ignorable at the primary level, but not at the secondary or tertiary level. These include most accents and diacritics. |
Variable | Characters that may be set to ignorable by a programmatic switch. These include spaces, punctuation marks, and most symbols. |
Common | Characters that are none of the above, but are not considered letters. These include numbers, currency symbols, etc. |
Letters | Arranged according to script |
Unsupported | Not explicitly supported in this version of UCA; uses code-point order |
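
The first two groups can be told apart purely from which weight levels are zero. The following is a minimal sketch of that test, assuming a collation element is modeled as a `(primary, secondary, tertiary)` tuple; the function name and the weights are hypothetical and are not taken from the UCA data table.

```python
from typing import Tuple

# A collation element modeled as (primary, secondary, tertiary) weights.
# This tuple layout and all weights used here are illustrative assumptions.
CollationElement = Tuple[int, int, int]


def ignorable_group(ce: CollationElement) -> str:
    """Classify a collation element by which of its weights are zero."""
    primary, secondary, tertiary = ce
    if primary == 0 and secondary == 0 and tertiary == 0:
        return "null"  # completely ignorable at all three levels
    if primary == 0:
        return "ignorable"  # ignorable at the primary level only
    # Variable, Common, and Letters all have a non-zero primary weight;
    # telling them apart needs the data table, not just the weights.
    return "non-ignorable"


if __name__ == "__main__":
    print(ignorable_group((0, 0, 0)))        # a control code, for example
    print(ignorable_group((0, 0x24, 0x02)))  # an accent, for example
    print(ignorable_group((0x10, 0x20, 0x02)))
```

Distinguishing the Variable, Common, and Letters groups is deliberately left out, since that depends on the data table rather than on the shape of the weights.
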
The characters* within each group are arranged in cells. The characters added in the last release are in yellow. Otherwise, the color of the cell indicates the strength of the difference between that character and the previous character in the chart, as follows.
No Expansion | | Expansion | |
---|---|---|---|
a 0061 | Primary difference | ǳ 01F3 | Primary difference |
á 00E1 | Secondary difference | Ǳ 01F1 | Secondary difference |
A 0041 | Tertiary difference | ǲ 01F2 | Tertiary difference |
Å 212B | Quaternary difference or no difference | | Quaternary difference or no difference |
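
The strength of a difference is the first weight level at which two sort keys disagree. The following is a minimal sketch of that comparison, assuming a sort key is modeled as one list of weights per level; the function name and all weights are hypothetical, not real UCA values.

```python
from typing import List, Sequence

LEVEL_NAMES = ["primary", "secondary", "tertiary"]

# A sort key modeled as one list of weights per level:
# [primary weights, secondary weights, tertiary weights].
SortKey = Sequence[List[int]]


def difference_strength(key_a: SortKey, key_b: SortKey) -> str:
    """Return the first level at which the two sort keys differ."""
    for name, weights_a, weights_b in zip(LEVEL_NAMES, key_a, key_b):
        if weights_a != weights_b:
            return name
    # Equal through the tertiary level: the remaining chart category.
    return "quaternary or none"


# Made-up weights, chosen only to mirror the rows of the legend.
KEY_A_SMALL = [[0x10], [0x20], [0x02]]              # a
KEY_A_ACUTE = [[0x10], [0x20, 0x24], [0x02, 0x02]]  # á
KEY_A_CAPITAL = [[0x10], [0x20], [0x08]]            # A
KEY_B_SMALL = [[0x11], [0x20], [0x02]]              # b

if __name__ == "__main__":
    print(difference_strength(KEY_A_SMALL, KEY_B_SMALL))
    print(difference_strength(KEY_A_SMALL, KEY_A_ACUTE))
    print(difference_strength(KEY_A_SMALL, KEY_A_CAPITAL))
    print(difference_strength(KEY_A_SMALL, KEY_A_SMALL))
```
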
Note: If tooltips are enabled in your browser, pausing the mouse over any cell shows the name of the character and a representation of its sort key. In this representation, the weight levels are separated by "|".
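
A tooltip of this kind could be assembled roughly as follows. This is only a sketch: the exact text the charts display is not specified here, so the four-digit hexadecimal layout, the brackets, the function names, and the weights are all assumptions. Only `unicodedata.name` is a real standard-library call.

```python
import unicodedata
from typing import List, Sequence


def format_sort_key(levels: Sequence[List[int]]) -> str:
    """Join the weights of each level, separating the levels with '|'."""
    return " | ".join(
        " ".join(f"{weight:04X}" for weight in level) for level in levels
    )


def tooltip(char: str, levels: Sequence[List[int]]) -> str:
    """Combine the character name with its formatted sort key."""
    name = unicodedata.name(char, "<unnamed>")
    return f"{name}: [{format_sort_key(levels)}]"


if __name__ == "__main__":
    # Made-up weights for illustration; not real UCA values.
    print(tooltip("a", [[0x10], [0x20], [0x02]]))
    # prints: LATIN SMALL LETTER A: [0010 | 0020 | 0002]
```
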
\* In some cases, the UCA data table also includes contractions. They can be recognized by their multiple code point numbers, as in the following cell:

ஔ 0B92 0BD7 |
---|
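
A contraction means that a sequence of code points is looked up as a single unit instead of one code point at a time. The following is a simplified sketch using longest contiguous match; the table contents and names are hypothetical placeholders, and the full algorithm has additional matching rules that are not modeled here.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical lookup table: keys are code point sequences, values are
# placeholder labels standing in for the real collation elements.
TABLE: Dict[Tuple[int, ...], str] = {
    (0x0B92,): "single: 0B92",
    (0x0BD7,): "single: 0BD7",
    (0x0B92, 0x0BD7): "contraction: 0B92 0BD7",
}


def longest_match(
    code_points: Sequence[int],
    start: int,
    table: Dict[Tuple[int, ...], str],
) -> Optional[Tuple[int, ...]]:
    """Return the longest table key that matches at position `start`."""
    longest_key = max((len(key) for key in table), default=0)
    remaining = len(code_points) - start
    # Try the longest candidate first so contractions win over singles.
    for length in range(min(longest_key, remaining), 0, -1):
        candidate = tuple(code_points[start:start + length])
        if candidate in table:
            return candidate
    return None


if __name__ == "__main__":
    text = [ord(c) for c in "\u0B92\u0BD7"]
    key = longest_match(text, 0, TABLE)
    print(TABLE[key])
```

Trying the longest candidate first is what lets the two-code-point entry take precedence over the single-code-point entry for 0B92.
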
To view these charts properly, you need an up-to-date browser and Unicode fonts (such as the Noto Fonts) that cover the characters you are interested in.
© 2003–2024 Unicode, Inc. Unicode and the Unicode Logo are registered trademarks of Unicode, Inc., in the U.S. and other countries. See Terms of Use.