What is Unicode?

Unicode provides a unique number for every character,
no matter what the platform,
no matter what the program,
no matter what the language.

Characters Before Unicode

Fundamentally, computers just deal with numbers. They store letters and other characters by assigning a number for each one. Before Unicode was invented, there were hundreds of different systems, called character encodings, for assigning these numbers. These early character encodings were limited and could not contain enough characters to cover all the world's languages. Even for a single language like English no single encoding was adequate for all the letters, punctuation, and technical symbols in common use.
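As a rough illustration of that limitation, the following Python sketch uses 7-bit ASCII as a stand-in for an early encoding. The helper name `fits_in` and the sample strings are assumptions chosen for this example, not part of any standard.

```python
def fits_in(text: str, encoding: str) -> bool:
    """Return True if every character in text has a number in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# ASCII defines only 128 values, enough for unaccented English letters.
print(fits_in("resume", "ascii"))   # True

# Accented letters and the euro sign, both common in English text, do not fit.
print(fits_in("r\u00e9sum\u00e9", "ascii"))  # False
print(fits_in("\u20ac5", "ascii"))           # False
```

The same helper can be pointed at any codec name that Python ships with, which makes it easy to see how each legacy encoding covers only a slice of the characters in everyday use.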

Early character encodings also conflicted with one another. That is, two encodings could use the same number for two different characters, or use different numbers for the same character. Any given computer, especially a server, needed to support many different encodings. Whenever data is passed between computers that use different encodings, or converted from one encoding to another, it runs the risk of corruption.
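Both kinds of conflict can be reproduced with the legacy codecs bundled with Python. This is a minimal sketch; the choice of Latin-1, Windows-1251, and code page 437 is an assumption made for illustration, and any pair of incompatible encodings would behave similarly.

```python
raw = b"\xe9"  # a single stored byte: the number 233

# Same number, two different characters.
as_western = raw.decode("latin-1")   # Latin small letter e with acute
as_cyrillic = raw.decode("cp1251")   # Cyrillic small letter short i

# Same character, two different numbers.
e_acute = "\u00e9"
in_latin1 = e_acute.encode("latin-1")  # b'\xe9'
in_cp437 = e_acute.encode("cp437")     # b'\x82'

# Corruption: text written under one encoding and read under another.
garbled = "caf\u00e9".encode("latin-1").decode("cp1251")

print(as_western, as_cyrillic)
print(in_latin1, in_cp437)
print(garbled)
```

Note that the mismatched decode raises no error. The bytes are valid in both encodings, so the corruption passes silently unless something else records which encoding was intended.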

Unicode Characters

Unicode has changed all that!

The Unicode Standard provides a unique number for every character, no matter what platform, device, application or language. It has been adopted by all modern software providers and now allows data to be transported through many different platforms, devices and applications without corruption. Support of Unicode forms the foundation for the representation of languages and symbols in all major operating systems, search engines, browsers, laptops, and smartphones, as well as the Internet and World Wide Web (URLs, HTML, XML, CSS, JSON, etc.). Supporting Unicode is the best way to implement ISO/IEC 10646.
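The "unique number" is called a code point. The sketch below, using only the Python standard library, looks up the code point and official name of a few characters and checks that text survives a round trip through bytes. The helper name `code_point`, the sample characters, and the use of UTF-8 and UTF-16 as the byte formats are assumptions made for this example.

```python
import unicodedata


def code_point(ch: str) -> str:
    """Format the Unicode code point of one character as U+XXXX."""
    return f"U+{ord(ch):04X}"


# Letters, symbols, and emoji each have exactly one code point and one name.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(code_point(ch), unicodedata.name(ch))

# Text encoded to bytes and decoded again comes back unchanged.
text = "Unicode: \u00e9 \u20ac \U0001F600"
round_trip_utf8 = text.encode("utf-8").decode("utf-8")
round_trip_utf16 = text.encode("utf-16").decode("utf-16")
print(round_trip_utf8 == text, round_trip_utf16 == text)
```

The code point identifies the character. The bytes used to store it depend on the encoding form, but every Unicode encoding form maps back to the same code points.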

The emergence of the Unicode Standard and the availability of tools supporting it are among the most significant recent global software technology trends.

About the Unicode Consortium

The Unicode Consortium is a non-profit, 501(c)(3) organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards that specify the representation of text in modern software products and other standards. The Consortium is supported financially through membership dues and donations. Membership in the Unicode Consortium is open to organizations and individuals anywhere in the world who support the Unicode Standard and wish to assist in its extension and implementation. All are invited to contribute to the support of the Consortium's important work by making a donation.

For more information, see the Unicode Standard, our FAQ, and the list of Unicode members.

Looking for Translations?

This page used to feature a series of translations in many different languages and scripts, in part to highlight the scope and use of the Unicode Standard. However, the original text content of the page needed updating, and managing the update of all of the separate translations to match was not feasible. For archival purposes, the old text of the What is Unicode? page and the numerous original translations linked from that page are still available. However, please use that text with caution, because it is outdated.

These days, Unicode implementations are so widespread that it is easy to find examples online in many languages and scripts. In particular, any Wikipedia page lets you click through to corresponding pages in other languages and writing systems, actively maintained by Wikipedia's large community of editors. Wikipedia contains millions of articles, all using the Unicode Standard for the representation of text.