UCD Release

Unicode Character Database 5.0 Released

Mountain View, CA, July 18, 2006 -- The Unicode® Consortium announces the release of a significant update to its widely used Unicode Character Database (UCD). The new version, Version 5.0, defines more than 99,000 characters for the languages of the world and provides the detailed properties needed for computer software implementations. This latest version of the UCD contains all the information needed to update software to support the characters and algorithms that are the foundation for all modern computer programs, including the latest data for Unicode security mechanisms, collation, and locales.

For the first time, the Unicode Collation Algorithm (UCA) is released in parallel with the UCD: UCA Version 5.0 and UCD Version 5.0 are available simultaneously, enabling default collation for all of the more than 99,000 characters. For more information on UCA 5.0, see http://www.unicode.org/reports/tr10/.

Implementers can now update their software more quickly to fully support minority languages, improved Indic processing, and IICore, the newly published subset of the most useful Chinese characters for mobile and small applications.

Version 5.0 of the UCD makes it possible to take full advantage of the Common Locale Data Repository (CLDR) Version 1.4, which now supports 360 locales (121 languages and 142 territories). The systematization and extension of character properties will enable improved text processing for all CLDR locales.

In this latest version, the UCD data provides dependable caseless matching through stable case folding operations. Version 5.0 data is also the basis for better interoperability for bidirectional scripts (such as Arabic and Hebrew), line breaking, and text segmentation. With the release of this data, implementers can begin to move their software to Version 5.0, in anticipation of the Unicode Version 5.0 support that will ship with many products and libraries, including Windows Vista, ICU, and offerings from Google, Yahoo!, and many other companies.
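As an illustration of caseless matching through case folding, the following is a minimal sketch in Python, not part of the UCD release itself. It assumes the standard library's `str.casefold()` and `unicodedata.normalize()`, which use the Unicode data bundled with the Python interpreter (typically a later version than 5.0). The helper names `fold` and `caseless_equal` are hypothetical and chosen only for this example.

```python
import unicodedata


def fold(text: str) -> str:
    """Return a case-folded, canonically decomposed form of text.

    Normalizing before and after folding lets canonically equivalent
    inputs (precomposed vs. combining sequences) compare as equal.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFD", decomposed.casefold())


def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings without regard to case (hypothetical helper)."""
    return fold(a) == fold(b)


if __name__ == "__main__":
    # Full case folding maps the German sharp s to "ss".
    print(caseless_equal("Straße", "STRASSE"))
    # A precomposed capital E-acute matches "e" plus a combining acute accent.
    print(caseless_equal("\u00c9", "e\u0301"))
    # Strings that differ in more than case do not match.
    print(caseless_equal("abc", "abd"))
```

Folding both strings to a common form, rather than lowercasing one and uppercasing the other, is what keeps the comparison symmetric for characters whose case mappings are not one-to-one.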

The release of Version 5.0 of the UCD is the first step in the release of The Unicode Standard, Version 5.0 — the book (ISBN 0-321-48091-0) will be published by Addison-Wesley in the fourth quarter of 2006. For more information about Unicode 5.0 and the Unicode Character Database, see http://www.unicode.org/versions/Unicode5.0.0/.

The latest features of Unicode Version 5.0 will also be showcased at the 30th Internationalization and Unicode Conference (IUC) on November 17-19, 2006 in Washington, D.C. -- see http://www.unicodeconference.org/.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards. The membership of the consortium represents a broad spectrum of corporations and organizations in the computer and information processing industry: Adobe Systems, L'Agence intergouvernementale de la Francophonie, Apple Computer, Basis Technology, Denic e.G., Google, Government of India - Ministry of Information Technology, Government of Pakistan - National Language Authority, HP, IBM, Justsystem, Microsoft, Monotype Imaging, Oracle, SAP, Sun Microsystems, Sybase, The University of California at Berkeley, Yahoo, plus well over a hundred Associate, Liaison, and Individual members.

For more information, please contact the Unicode Consortium at http://www.unicode.org/.