Unihan Database Lookup

 

 

About the Unihan Database Lookup Tool

For reference, the Unicode Consortium provides this search interface to the Unicode Hàn (漢) Database (Unihan).

The Unihan Database organizes information relating to the properties of CJK Unified Ideographs. Documentation for the Unihan Database is available in UAX #38.

For production reasons, the version of the Unihan database available via this lookup tool may not yet be in sync with the latest version of the Unicode Standard. For access to the most recent version of the raw data files (Unihan.zip), see http://www.unicode.org/Public/UCD/latest/.
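For readers working from the raw data files rather than this tool, the following is a minimal sketch of reading them. It assumes the layout described in UAX #38: one record per line with three tab-separated fields (code point, field name, value) and comment lines starting with "#". The function name and the two sample records are illustrative only and should be checked against the actual files.

```python
from collections import defaultdict

# Illustrative sample in the tab-separated layout of the Unihan data files.
# Verify values against the real files before relying on them.
SAMPLE = (
    "# comment lines start with '#'\n"
    "U+6F22\tkMandarin\thàn\n"
    "U+6F22\tkRSUnicode\t85.11\n"
)


def parse_unihan_lines(lines):
    """Collect {code point: {field name: value}} from Unihan-style lines."""
    records = defaultdict(dict)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split into at most three parts so values keep any inner tabs.
        code, field, value = line.split("\t", 2)
        records[int(code[2:], 16)][field] = value  # drop the "U+" prefix
    return dict(records)


records = parse_unihan_lines(SAMPLE.splitlines())
print(records[0x6F22]["kRSUnicode"])
```

To read a real file, the same function could be passed an open file object, for example one of the text files extracted from Unihan.zip opened with encoding="utf-8".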

The lookup interface on this page provides access to Unihan information on individual characters through the “Lookup” button and text field above. Enter the four- or five-digit hexadecimal identifier for the character (if you know it), or copy and paste a character (if you have one), and then click the “Lookup” button.
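The two accepted input forms can be related to each other with a short sketch. This is not the lookup tool's own code: the function names are hypothetical, and accepting an optional "U+" prefix is an assumption made here for convenience.

```python
import string


def to_code_point(query):
    """Return the code point for a hex identifier or one pasted character."""
    text = query.strip()
    if len(text) == 1:
        return ord(text)  # a single pasted character
    if text.upper().startswith("U+"):
        text = text[2:]  # assumption: tolerate a "U+" prefix
    if len(text) in (4, 5) and all(c in string.hexdigits for c in text):
        return int(text, 16)  # four- or five-digit hexadecimal identifier
    raise ValueError(f"not a character or a 4-5 digit hex value: {query!r}")


def format_code_point(code_point):
    """Format a code point in the usual U+XXXX notation."""
    return f"U+{code_point:04X}"


print(format_code_point(to_code_point("漢")))
print(chr(to_code_point("6F22")))
```

Both calls refer to the same character, so either form of input identifies the same entry.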

The resulting data set will contain various types of information available in the Unihan database, for example, mappings to legacy encoding standards, references to major dictionaries, and meaning and pronunciation information according to various authorities.

The “Use images, not text” check box (in Radical-stroke index pages) controls whether CJK ideographs in that index are displayed as text (reliant upon your CJK system fonts, if any) or as embedded images (if available). Using images is relatively system-independent, but the images are fixed-size and loading time may be longer. If you choose to use images, query results will display each CJK ideograph as an embedded image (if one is available) rather than as a text character such as 漢. For production reasons, some images may not yet be available.

If you do not know the hexadecimal value of the code point and have no example of the character in text to copy and paste, another Unihan Search Page supports queries on a few select fields (for example, keyword and pronunciation fields).

There are also two indices for the database: a grid index, which groups the characters in blocks of 256, and a radical-stroke index.
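As a rough illustration of grouping in blocks of 256, the sketch below finds the block containing a given code point. It assumes the blocks are aligned to multiples of 256 (that is, they share all but the last two hexadecimal digits); the function name is hypothetical and this is not taken from the index pages themselves.

```python
def grid_block(code_point):
    """Return (first, last) code points of the 256-wide block holding code_point."""
    start = code_point & ~0xFF  # clear the low eight bits
    return start, start + 0xFF


first, last = grid_block(0x6F22)
print(f"U+{first:04X}..U+{last:04X}")
```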

Unihan Code Charts and Indices

See The Unicode Standard, Chapter 12 (PDF) for discussion of Han (CJKV) unification principles. The Unihan Radical-Stroke indices are documented in a short PDF file. The indices are available online in two PDF files, the Full RS Index and the IICore RS Index. Code Charts covering all of Unihan are available in PDF format, linked from the main chart index page along with other code charts.

Disclaimers

The Unihan database is provided as a public service by Unicode, Inc. These data are provided as-is by Unicode, Inc. (The Unicode Consortium). No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied. The recipient agrees to determine applicability of information provided.

The data in the Unihan database derive from various sources, as documented in UAX #38. Some data available via the Unihan Database Lookup tool may be accessed via links to other web sites.