Unicode Releases Common Locale Data Repository, Version 1.4

Mountain View, CA, July 17, 2006 - The Unicode® Consortium announced today the release of the new version of the Unicode Common Locale Data Repository (CLDR 1.4), providing key building blocks for software to support the world's languages. CLDR is by far the largest and most extensive standard repository of locale data. This data is used by a wide spectrum of companies for their software internationalization and localization: adapting software to the conventions of different languages for such common software tasks as formatting of dates, times, time zones, numbers, and currency values; sorting text; choosing languages or countries by name; and many others.

This release of CLDR contains data for 121 languages and 142 territories -- 360 locales in all. Version 1.4 of the repository contains over 25% more locale data than the previous release, with over 17,000 new or modified data items entered by over 100 different contributors. Major contributors to CLDR 1.4 include Apple, Google, IBM, and Sun, plus official representatives from a number of countries. Many other organizations and individuals around the globe have also made important contributions.

CLDR 1.4 uses the XML format provided by the newest version of the Locale Data Markup Language (LDML 1.4). LDML is a format used not only for CLDR, but also for general interchange of locale data, such as in Microsoft's .NET. Some of the major features of LDML 1.4 used in the repository include new XML structures supporting customizable detection of words, lines, and sentences (segmentation), transliteration between different alphabets, and full compatibility with the recently approved internet standards for language tags. It also supports enhanced formats for dates and times, and adds new guidelines for date, time, and number parsing.

For more information about the CLDR project, see http://www.unicode.org/cldr/. The latest features of CLDR will also be showcased at the 30th Internationalization and Unicode Conference (IUC) on November 17-19, 2006 in Washington, D.C. -- see http://www.unicodeconference.org/.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard and related globalization standards. The membership of the consortium represents a broad spectrum of corporations and organizations in the computer and information processing industry: Adobe Systems, L'Agence intergouvernementale de la Francophonie, Apple Computer, Basis Technology, Denic e.G., Google, Government of India - Ministry of Information Technology, Government of Pakistan - National Language Authority, HP, IBM, Justsystem, Microsoft, Monotype Imaging, Oracle, SAP, Sun Microsystems, Sybase, The University of California at Berkeley, Yahoo, plus well over a hundred Associate, Liaison, and Individual members.