Frequently Asked Questions
The Unicode Frequently Asked Questions (FAQ) are
organized into different topic pages. The list of topic areas is
shown below, along with brief explanations of what kinds of
questions are answered in each topic area.
Many FAQ pages contain links to other pages where you will
find further information about specific topics. Check in particular
the Basic Questions.
You may also find it useful to use the
search page and type in
“solidus Frequently Asked Questions”, “BOM Frequently Asked Questions”, or “NFC Frequently Asked Questions”, for example, to locate FAQ entries.
The FAQs are collected from many sources. For more information, see
Click on a topic area to go to that page.
- Discusses the features of Unicode, how it differs from other
encodings, and answers basic support questions such as where to
find additional information on this site.
- Definitions and usage of Unicode blocks and ranges, and questions about blocks versus script values for characters.
Character Properties, Case Mappings and Names
- Answers questions about case conversions and case mappings;
also about character names.
Characters and Combining Marks
- Discusses a variety of details about text elements, combining
characters, compatibility mappings, canonical equivalence...
- Chinese and
- Questions specific to Han ideographs, Chinese and Japanese language handling,
and East Asian fonts.
- CLDR and Locales
- Answers questions about Unicode Locales, CLDR, and LDML.
- Answers to questions of sorting and ordering, Unicode and
- The Unicode compression algorithm (SCSU), LZW, Huffman encoding, and
Conversions / Mappings
- Conversion and mapping to/from other character sets.
- Adapting to changes in the Unicode Standard.
- Display of Unsupported Characters
- Discusses what to do when attempting to display unsupported Unicode characters.
- Emoji and Dingbats
- Discusses sets of pictorial symbols, including Emoji, Dingbats, Webdings, and Wingdings: how and why they have been encoded, and how to display or implement them.
Entities and Named Sequences
- Discusses named entities and Unicode named character sequences.
- FAQ on FAQs
- Describes how and when new FAQs are created, how FAQs relate
to specifications, and what to do if you think there is an error in a Unicode specification.
- Where to find more information about fonts. Displaying
characters in Java. Glyph variations. Inputting Chinese and other
- Questions specific to the Greek language, script, and fonts.
- Guide to Abbreviations in Standards
- Lists abbreviations and acronyms used by other standards developing organizations.
- Indic Scripts
and Languages except
- Questions specific to Indic scripts, languages, fonts, and
Internationalization and the Case for Unicode
- Explains the role of Unicode in internationalization of software and answers questions about upgrading software to support Unicode.
Internationalized Domain Names (IDN)
- Provides a series of background explanations about
Internationalized Domain Names and the different specifications for
them. (Some content is preliminary.)
- Questions about Hangul and Jamo characters for Korean, and Korean normalization issues.
- Language Tagging
- Plane 14 language tags and language tagging in general.
- Ligatures, Digraphs, Presentation Forms vs. Plain Text
- Can't find a certain digraph or ligature your language needs?
Can you use a particular presentation form?
- Line Breaking
- Questions about how to break text into separate lines for display.
Eastern Scripts and Languages
- Questions about Arabic, Hebrew, and other Middle Eastern
- Questions regarding the various normalization forms, their
use, and where to go for further information.
- Private-Use Characters, Noncharacters, and Sentinels
- Questions about private-use characters and how they are distinguished from noncharacters and sentinels.
- Questions regarding conversion of string handling in old programs, as well as other issues regarding support of
Unicode strings in programs.
Proposed New Characters
- What are the latest proposals? What about my
script? When will the next version of the Unicode Standard be available?
Punctuation and Symbols
- Discusses issues related to punctuation and symbols, including the differences between them.
- Does Unicode pose security problems? What can be done about
such problems as character spoofing?
- Information on where to find specifications or guidelines for
dealing with different programming tasks in the Unicode Standard
and related standards.
- Standards Developing Organizations
- Describes what SDOs are and how the Unicode Consortium works with them. Answers questions about ISO, IETF, W3C, and the terminology they use.
Submitting Successful Character and Script Proposals
- Guidelines on how to write a successful proposal to add new
characters or a new script, or to fix a problem in the standard.
- Tamil Script
- Issues related to the Tamil language and script.
Technical Reports Development Process
- Discusses the development and maintenance process for technical reports, including how they are created and archived.
- Unicode Character Database
- Questions about the Unicode Character Database (UCD).
and ISO 10646
- Relationships between Unicode, ISO working groups, and ISO
standards. How Unicode differs from 10646.
and the Web
- Unicode in other standards (W3C, IETF, etc.). How to deal
with numeric character references, Unicode in HTML, etc.
- UTF-8, UTF-16, UTF-32 & BOM
- Questions about encoding forms (UTF-8, UTF-16, and UTF-32), definitions of a UTF (Unicode Transformation Format), and use of the byte order mark.
- Variation Sequences
- Answers questions about the meaning, use, and display of variation sequences and selectors.
Direction and BIDI Ordering
- Questions about writing direction, particularly bidirectional
(“bidi”) left-to-right and right-to-left text.