This set of charts shows the Unicode Collation Algorithm values for Unicode characters. The characters are arranged in the following groups:
| Group | Description |
|---|---|
| Null | Completely ignorable (primary, secondary, and tertiary levels). These include control codes and various formatting codes. |
| Ignorable | Ignorable at the primary level, but not at the secondary or tertiary level. These include most accents and diacritics. |
| Variable | Characters that may be set to ignorable by a programmatic switch. These include spaces, punctuation marks, and most symbols. |
| Common | Characters that are none of the above, but are not considered letters. These include numbers, currency symbols, etc. |
| Letters | According to script |
| Unsupported | Not explicitly supported in this version of UCA; these use code point order |
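The first three groups can be told apart from the weights alone. The following is a minimal sketch of that distinction, not the charts' actual generation logic: the `CollationElement` type, its `variable` flag, and the `weight_group` function are hypothetical names introduced for illustration, and the sketch assumes that a weight of zero means "ignorable at that level".

```python
from typing import NamedTuple


class CollationElement(NamedTuple):
    """Hypothetical container for one collation element's weights."""
    primary: int
    secondary: int
    tertiary: int
    variable: bool = False  # assumed flag: element can be switched to ignorable


def weight_group(ce: CollationElement) -> str:
    """Return the chart group suggested by the weights alone."""
    if ce.primary == 0 and ce.secondary == 0 and ce.tertiary == 0:
        return "Null"        # ignorable at all three levels
    if ce.primary == 0:
        return "Ignorable"   # zero primary, but some lower-level weight is set
    if ce.variable:
        return "Variable"    # may be set to ignorable by a switch
    # Weights alone do not separate Common from Letters; that depends on
    # what kind of character it is (digit, currency symbol, letter, ...).
    return "Common or Letters"
```

Telling Common from Letters, or detecting Unsupported characters, needs information beyond the weights, so the sketch deliberately stops there.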
The characters* within each group are arranged in cells. The color of the cell indicates the strength of the difference between that character and the previous character in the chart, as follows.
| No Expansion | Strength | Expansion | Strength |
|---|---|---|---|
| a 0061 | Primary difference | dz 01F3 | Primary difference |
| á 00E1 | Secondary difference | DZ 01F1 | Secondary difference |
| A 0041 | Tertiary difference | Dz 01F2 | Tertiary difference |
| Å 212B | Quaternary difference or no difference | | Quaternary difference or no difference |
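The strength of the difference between two adjacent characters can be found by comparing their sort keys level by level and reporting the first level at which they differ. The sketch below assumes a sort key is held as three tuples of weights (primary, secondary, tertiary); the `difference_strength` function and all weight values are made up for illustration and are not taken from the UCA data table.

```python
from typing import Tuple

# Assumed representation: (primary weights, secondary weights, tertiary weights)
SortKey = Tuple[Tuple[int, ...], Tuple[int, ...], Tuple[int, ...]]

LEVELS = ("Primary", "Secondary", "Tertiary")


def difference_strength(prev_key: SortKey, cur_key: SortKey) -> str:
    """Name the first level at which two sort keys differ."""
    for name, prev_level, cur_level in zip(LEVELS, prev_key, cur_key):
        if prev_level != cur_level:
            return f"{name} difference"
    return "Quaternary difference or no difference"


# Illustrative weights only; real values come from the UCA data table.
a = ((0x1000,), (0x20,), (0x02,))
a_acute = ((0x1000,), (0x20, 0x24), (0x02, 0x02))
capital_a = ((0x1000,), (0x20,), (0x08,))
b = ((0x1001,), (0x20,), (0x02,))

for label, key in (("b", b), ("a-acute", a_acute), ("A", capital_a), ("a", a)):
    print(label, "vs a:", difference_strength(a, key))
```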
Note: If tooltips are enabled in your browser, pausing the mouse over any cell shows the name of the character and a representation of its sort key. In this representation, the weight levels are separated by "|".
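As a rough illustration of such a representation, the sketch below builds a sort key string from a list of collation elements, writing the non-zero weights of each level in turn and putting "|" between levels. The `format_sort_key` function, the four-digit hexadecimal formatting, and the sample weights are assumptions; the exact layout of the tooltip text may differ.

```python
from typing import Iterable, Tuple


def format_sort_key(elements: Iterable[Tuple[int, int, int]]) -> str:
    """Render (primary, secondary, tertiary) elements as 'P | S | T'."""
    elements = list(elements)
    levels = []
    for level in range(3):
        # Zero weights are ignorable at this level, so they are left out.
        weights = [ce[level] for ce in elements if ce[level] != 0]
        levels.append(" ".join(f"{w:04X}" for w in weights))
    return " | ".join(levels)


# A base letter followed by an accent that is ignorable at the primary level
# (illustrative weights only).
print(format_sort_key([(0x1000, 0x20, 0x02), (0, 0x24, 0x02)]))
# prints: 1000 | 0020 0024 | 0002 0002
```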
\* In some cases, the UCA data table also includes contractions. They can be recognized by their multiple code point numbers, as in the following:

| ஔ 0B92 0BD7 |
|---|
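Recognizing a contraction from a cell label can be sketched as follows. The sketch assumes each label has the form shown in the examples above, that is, the displayed character followed by one or more hexadecimal code point numbers separated by spaces; `cell_code_points` and `is_contraction` are hypothetical helper names.

```python
from typing import List


def cell_code_points(label: str) -> List[int]:
    """Extract the code point numbers from a label such as 'a 0061'."""
    tokens = label.split()
    # The first token is the displayed character; the rest are hex numbers.
    return [int(token, 16) for token in tokens[1:]]


def is_contraction(label: str) -> bool:
    """A cell with more than one code point number is a contraction."""
    return len(cell_code_points(label)) > 1


print(is_contraction("\u0B94 0B92 0BD7"))  # the Tamil example above
print(is_contraction("a 0061"))
```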