Unicode Releases Common Locale Data Repository, Version 1.5
Mountain View, CA, July 31, 2007 - The Unicode® Consortium announced today the release of
the new version of the Unicode
Common Locale Data Repository (Unicode CLDR 1.5), providing key
building blocks for software to support the world's languages. Unicode CLDR
is by far the largest and most extensive standard repository of
locale data. This data is used by a wide spectrum of companies for
their software internationalization and localization: adapting
software to the conventions of different languages for such common
software tasks as formatting of dates, times, time zones, numbers,
and currency values; sorting text; choosing languages or countries
by name; transliterating different alphabets; and many others.
CLDR 1.5 contains data for
135 languages and 149 territories: 394 locales in all. Version 1.5
of the repository contains over 42% more locale data than the
previous release, with over 27,000 new or modified data items
entered by over 160 different contributors. Also new in this release are
BGN transliterations. Major contributors to
CLDR 1.5 include Adobe, Apple, Google, IBM, Sun, and official
representatives from a number of countries. Many other organizations
and volunteers around the globe have also made important
contributions.
Unicode CLDR 1.5 is part of the Unicode locale data project,
together with Unicode Locale Data Markup Language (Unicode LDML 1.5).
LDML is an XML format used for general interchange of locale data,
such as in Microsoft's .NET. Major features of Unicode LDML 1.5 include
new conformance clauses, commonly used time zone translations, revisions
for handling bidirectional text (Arabic and Hebrew), language fallbacks,
revisions for character fallbacks (used for legacy character encodings),
mappings to related language and country codes, and substantial data on
language and script usage in different countries.
Organizations and volunteers contribute locale data through the
CLDR survey tool. Major improvements to the tool include enhancements to its
appearance, layout, and operation; substantial new documentation; improved testing; new
levels of approval and corresponding changes to the voting process for lesser-known
languages; and translator forums.
For more information about the Unicode CLDR project (including
charts), see http://unicode.org/cldr/.
The latest features of CLDR will also be showcased at the 31st
Internationalization and Unicode Conference (IUC) on October 15-17, 2007, in San
Jose, CA; see http://unicodeconference.org/.
About the Unicode Consortium
The Unicode Consortium is a non-profit organization founded to
develop, extend and promote use of the Unicode Standard and related
globalization standards. The membership of the consortium represents
a broad spectrum of corporations and organizations in the computer
and information processing industry: Adobe Systems, Apple, Basis Technology, DENIC eG, Google, Government of India, Government of Pakistan, Government of Tamil Nadu, HP,
IBM, Justsystem, Microsoft, Monotype Imaging, Oracle, SAP, Sun Microsystems, Sybase, UC Berkeley, Yahoo!, plus well over a hundred Associate, Liaison, and
Individual members.