Unicode Utilities: Confusables

Properties use ICU for Unicode V11.0; the beta properties support Unicode V12.0β. For more information, see Unicode Utilities Beta.


With this demo, you can supply an input string and see the combinations that are confusable with it, using data collected by the Unicode Consortium. You can also try different restrictions, using characters valid in different approaches to internationalized domain names. For more information, see Data below.
  

Confusable Characters

o ο σ о օ ס ه ٥ ھ ہ ە ۵ ० ੦ ૦ ం ౦ ಂ ೦ ം ഠ ൦ ං ๐ ໐ ဝ ၀ 𐐬
006F 03BF 03C3 043E 0585 05E1 0647 0665 06BE 06C1 06D5 06F5 0966 0A66 0AE6 0C02 0C66 0C82 0CE6 0D02 0D20 0D66 0D82 0E50 0ED0 101D 1040 1042C
LATIN SMALL LETTER O; GREEK SMALL LETTER OMICRON; GREEK SMALL LETTER SIGMA; CYRILLIC SMALL LETTER O; ARMENIAN SMALL LETTER OH; HEBREW LETTER SAMEKH; ARABIC LETTER HEH; ARABIC-INDIC DIGIT FIVE; ARABIC LETTER HEH DOACHASHMEE; ARABIC LETTER HEH GOAL; ARABIC LETTER AE; EXTENDED ARABIC-INDIC DIGIT FIVE; DEVANAGARI DIGIT ZERO; GURMUKHI DIGIT ZERO; GUJARATI DIGIT ZERO; TELUGU SIGN ANUSVARA; TELUGU DIGIT ZERO; KANNADA SIGN ANUSVARA; KANNADA DIGIT ZERO; MALAYALAM SIGN ANUSVARA; MALAYALAM LETTER TTHA; MALAYALAM DIGIT ZERO; SINHALA SIGN ANUSVARAYA; THAI DIGIT ZERO; LAO DIGIT ZERO; MYANMAR LETTER WA; MYANMAR DIGIT ZERO; DESERET SMALL LETTER LONG O
o ο σ о օ ס ه ٥ ھ ہ ە ۵ ० ੦ ૦ ం ౦ ಂ ೦ ം ഠ ൦ ං ๐ ໐ ဝ ၀ 𐐬
006F 03BF 03C3 043E 0585 05E1 0647 0665 06BE 06C1 06D5 06F5 0966 0A66 0AE6 0C02 0C66 0C82 0CE6 0D02 0D20 0D66 0D82 0E50 0ED0 101D 1040 1042C
LATIN SMALL LETTER O; GREEK SMALL LETTER OMICRON; GREEK SMALL LETTER SIGMA; CYRILLIC SMALL LETTER O; ARMENIAN SMALL LETTER OH; HEBREW LETTER SAMEKH; ARABIC LETTER HEH; ARABIC-INDIC DIGIT FIVE; ARABIC LETTER HEH DOACHASHMEE; ARABIC LETTER HEH GOAL; ARABIC LETTER AE; EXTENDED ARABIC-INDIC DIGIT FIVE; DEVANAGARI DIGIT ZERO; GURMUKHI DIGIT ZERO; GUJARATI DIGIT ZERO; TELUGU SIGN ANUSVARA; TELUGU DIGIT ZERO; KANNADA SIGN ANUSVARA; KANNADA DIGIT ZERO; MALAYALAM SIGN ANUSVARA; MALAYALAM LETTER TTHA; MALAYALAM DIGIT ZERO; SINHALA SIGN ANUSVARAYA; THAI DIGIT ZERO; LAO DIGIT ZERO; MYANMAR LETTER WA; MYANMAR DIGIT ZERO; DESERET SMALL LETTER LONG O
 ́   ֜   ֝   َ   ݇   ॔                        
0301 059C 059D 064E 0747 0954
COMBINING ACUTE ACCENT; HEBREW ACCENT GERESH; HEBREW ACCENT GERESH MUQDAM; ARABIC FATHA; SYRIAC OBLIQUE LINE ABOVE; DEVANAGARI ACUTE ACCENT
λ                            
03BB
GREEK SMALL LETTER LAMDA
o ο σ о օ ס ه ٥ ھ ہ ە ۵ ० ੦ ૦ ం ౦ ಂ ೦ ം ഠ ൦ ං ๐ ໐ ဝ ၀ 𐐬
006F 03BF 03C3 043E 0585 05E1 0647 0665 06BE 06C1 06D5 06F5 0966 0A66 0AE6 0C02 0C66 0C82 0CE6 0D02 0D20 0D66 0D82 0E50 0ED0 101D 1040 1042C
LATIN SMALL LETTER O; GREEK SMALL LETTER OMICRON; GREEK SMALL LETTER SIGMA; CYRILLIC SMALL LETTER O; ARMENIAN SMALL LETTER OH; HEBREW LETTER SAMEKH; ARABIC LETTER HEH; ARABIC-INDIC DIGIT FIVE; ARABIC LETTER HEH DOACHASHMEE; ARABIC LETTER HEH GOAL; ARABIC LETTER AE; EXTENDED ARABIC-INDIC DIGIT FIVE; DEVANAGARI DIGIT ZERO; GURMUKHI DIGIT ZERO; GUJARATI DIGIT ZERO; TELUGU SIGN ANUSVARA; TELUGU DIGIT ZERO; KANNADA SIGN ANUSVARA; KANNADA DIGIT ZERO; MALAYALAM SIGN ANUSVARA; MALAYALAM LETTER TTHA; MALAYALAM DIGIT ZERO; SINHALA SIGN ANUSVARAYA; THAI DIGIT ZERO; LAO DIGIT ZERO; MYANMAR LETTER WA; MYANMAR DIGIT ZERO; DESERET SMALL LETTER LONG O
o ο σ о օ ס ه ٥ ھ ہ ە ۵ ० ੦ ૦ ం ౦ ಂ ೦ ം ഠ ൦ ං ๐ ໐ ဝ ၀ 𐐬
006F 03BF 03C3 043E 0585 05E1 0647 0665 06BE 06C1 06D5 06F5 0966 0A66 0AE6 0C02 0C66 0C82 0CE6 0D02 0D20 0D66 0D82 0E50 0ED0 101D 1040 1042C
LATIN SMALL LETTER O; GREEK SMALL LETTER OMICRON; GREEK SMALL LETTER SIGMA; CYRILLIC SMALL LETTER O; ARMENIAN SMALL LETTER OH; HEBREW LETTER SAMEKH; ARABIC LETTER HEH; ARABIC-INDIC DIGIT FIVE; ARABIC LETTER HEH DOACHASHMEE; ARABIC LETTER HEH GOAL; ARABIC LETTER AE; EXTENDED ARABIC-INDIC DIGIT FIVE; DEVANAGARI DIGIT ZERO; GURMUKHI DIGIT ZERO; GUJARATI DIGIT ZERO; TELUGU SIGN ANUSVARA; TELUGU DIGIT ZERO; KANNADA SIGN ANUSVARA; KANNADA DIGIT ZERO; MALAYALAM SIGN ANUSVARA; MALAYALAM LETTER TTHA; MALAYALAM DIGIT ZERO; SINHALA SIGN ANUSVARAYA; THAI DIGIT ZERO; LAO DIGIT ZERO; MYANMAR LETTER WA; MYANMAR DIGIT ZERO; DESERET SMALL LETTER LONG O
. ٠ ۰ ܁ ܂ 𝅭                       
002E 0660 06F0 0701 0702 1D16D
FULL STOP; ARABIC-INDIC DIGIT ZERO; EXTENDED ARABIC-INDIC DIGIT ZERO; SYRIAC SUPRALINEAR FULL STOP; SYRIAC SUBLINEAR FULL STOP; MUSICAL SYMBOL COMBINING AUGMENTATION DOT
g ƍ ɡ ց                         
0067 018D 0261 0581
LATIN SMALL LETTER G; LATIN SMALL LETTER TURNED DELTA; LATIN SMALL LETTER SCRIPT G; ARMENIAN SMALL LETTER CO
r г                           
0072 0433
LATIN SMALL LETTER R; CYRILLIC SMALL LETTER GHE
ع                            
0639
ARABIC LETTER AIN
ر                            
0631
ARABIC LETTER REH
ب                            
0628
ARABIC LETTER BEH
ى ي ٮ ں ی ے                       
0649 064A 066E 06BA 06CC 06D2
ARABIC LETTER ALEF MAKSURA; ARABIC LETTER YEH; ARABIC LETTER DOTLESS BEH; ARABIC LETTER NOON GHUNNA; ARABIC LETTER FARSI YEH; ARABIC LETTER YEH BARREE
. ٠ ۰ ܁ ܂ 𝅭                       
002E 0660 06F0 0701 0702 1D16D
FULL STOP; ARABIC-INDIC DIGIT ZERO; EXTENDED ARABIC-INDIC DIGIT ZERO; SYRIAC SUPRALINEAR FULL STOP; SYRIAC SUBLINEAR FULL STOP; MUSICAL SYMBOL COMBINING AUGMENTATION DOT
d ԁ Ꮷ ᑯ
0064 0501 13E7 146F
LATIN SMALL LETTER D; CYRILLIC SMALL LETTER KOMI DE; CHEROKEE LETTER TSU; CANADIAN SYLLABICS KO
e е ҽ ℮
0065 0435 04BD 212E
LATIN SMALL LETTER E; CYRILLIC SMALL LETTER IE; CYRILLIC SMALL LETTER ABKHASIAN CHE; ESTIMATED SYMBOL
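
The listing gives one set of look-alikes per position of the input, so a "combination" can be read as one pick from each set. The following is a minimal Python sketch of that idea, not the demo's actual implementation. The `confusables` table and the `combinations` helper are illustrative names, and the table holds only the "r" and "e" rows copied from the listing.

```python
from itertools import product

# Per-position confusable sets, copied from the "r" and "e" rows above.
# This tiny table is illustrative; the real data comes from UTS #39.
confusables = {
    "r": ["\u0072", "\u0433"],                      # r, Cyrillic ghe
    "e": ["\u0065", "\u0435", "\u04BD", "\u212E"],  # e, Cyrillic ie, Abkhasian che, estimated symbol
}

def combinations(text):
    """Yield every string formed by picking one alternative per position."""
    # Characters without an entry in the table stand only for themselves.
    pools = [confusables.get(ch, [ch]) for ch in text]
    for choice in product(*pools):
        yield "".join(choice)

variants = list(combinations("re"))
print(len(variants))  # 2 * 4 = 8
```

The first variant produced is the input itself, because each set lists the original character first.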

Total raw values: 101,964,054,528

Too many raw items to process.
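
The total appears to be the product of the sizes of the per-position sets in the listing, which is why even a short input quickly becomes too large to enumerate. A small Python check under that assumption, with the sizes counted from the code point lines (the name `set_sizes` is illustrative):

```python
from math import prod

# Number of alternatives at each of the 16 input positions, in listing order:
# o, o, acute, lamda, o, o, full stop, g, r, ain, reh, beh, alef maksura,
# full stop, d, e.
set_sizes = [28, 28, 6, 1, 28, 28, 6, 4, 2, 1, 1, 1, 6, 6, 4, 4]

total = prod(set_sizes)
print(f"{total:,}")  # 101,964,054,528
```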


Data

Confusable characters are those that may be confused with others (in some common UI fonts), such as the Latin letter "o" and the Greek letter omicron "ο". Fonts make a difference: for example, the Hebrew character "ס" looks confusingly similar to "o" in some fonts (such as Arial Hebrew), but not in others. See also unaccented Latin Characters.
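
Characters that look alike are still distinct code points, and ordinary normalization does not merge them. A short Python illustration using the standard `unicodedata` module (the variable names are illustrative):

```python
import unicodedata

latin_o = "\u006F"
greek_omicron = "\u03BF"
hebrew_samekh = "\u05E1"

# Print the code point and formal name of each look-alike.
for ch in (latin_o, greek_omicron, hebrew_samekh):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# The strings compare unequal even though they may render identically,
# and NFKC normalization leaves the Greek omicron unchanged.
same_string = latin_o == greek_omicron
nfkc_omicron = unicodedata.normalize("NFKC", greek_omicron)
print(same_string, nfkc_omicron == greek_omicron)
```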

The data for confusables and restrictions is from UTS #39. You can suggest additions or changes to the Unicode data for future versions of that standard.

For more information on the use of the data, see the proposed updates to Unicode Security Mechanisms and Unicode Security Considerations.

The restrictions are purely on a character level. For a more detailed view, see idna.

Caveats

The Unicode data is designed for testing, not enumerating, so not all combinations are generated in this demo. In particular, where a character is confusable with a sequence, not all combinations are generated.
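
"Testing" here typically means comparing two strings through a skeleton, as described in UTS #39: decompose to NFD, replace each character with its prototype, and decompose again. The Python sketch below follows that outline under a loud assumption: `PROTOTYPES` is a tiny hand-picked stand-in for the real confusables data, and `skeleton` and `is_confusable` are illustrative names rather than an official API.

```python
import unicodedata

# Illustrative stand-in for the UTS #39 confusables data: a hand-picked
# map from a character to a representative "prototype".
PROTOTYPES = {
    "\u03BF": "o",  # GREEK SMALL LETTER OMICRON
    "\u043E": "o",  # CYRILLIC SMALL LETTER O
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u0433": "r",  # CYRILLIC SMALL LETTER GHE
}

def skeleton(text: str) -> str:
    """Map a string to a form shared by all of its known look-alikes."""
    decomposed = unicodedata.normalize("NFD", text)
    mapped = "".join(PROTOTYPES.get(ch, ch) for ch in decomposed)
    return unicodedata.normalize("NFD", mapped)

def is_confusable(a: str, b: str) -> bool:
    """Two strings are treated as confusable when their skeletons match."""
    return skeleton(a) == skeleton(b)

print(is_confusable("ore", "\u043E\u0433\u0435"))  # Latin vs. Cyrillic look-alikes
print(is_confusable("ore", "are"))
```

Comparing skeletons costs one pass over each string, whereas enumerating every look-alike grows multiplicatively with the length of the input.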



Fonts and Display. If you don't have a good set of Unicode fonts (and a modern browser), you may not be able to read some of the characters. Suggested fonts that you can add for coverage include the Noto Fonts site, Unicode Fonts for Ancient Scripts, and Large, multi-script Unicode fonts. See also: Unicode Display Problems.

Version 3.9; ICU version: 63.1; Unicode version: 11.0; Unicodeβ version: 12.0;