Press Release 4.0

Unicode Extends Chinese, Japanese, Korean (CJK) Character Database

Version 4.0.1 of the Unicode® Standard Released

Mountain View, CA, March 31, 2004 — The Unicode® Consortium announced today a new update of the Unicode Standard, Version 4.0.1, a significant revision of its Unicode Character Database, widely used in software products. No new characters are added to the standard at this time—the total number of characters still stands at 96,382 for the world's scripts and collections of symbols. However, the information in the Character Database has been refined to improve the quality of text processing in all languages of the world.

This version of the Unicode Character Database includes the first major update of the CJK database (Unihan) in two years. The Unihan Database provides character properties, definitions, pronunciations, mappings, and other information for the CJK characters in the standard—the characters used in particular for Chinese, Japanese, and Korean. This update includes thousands of additions and corrections, including major new correlations with traditional Chinese and Japanese dictionary sources.

This version of the Unicode Standard significantly improves the ability to interchange text in languages such as Arabic, Hebrew, Urdu, and Pashto. It also clarifies the implementation of languages such as Bengali, as well as the relationship between base letters and accent marks.
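For readers unfamiliar with the relationship between base letters and accent marks, the following is a minimal illustrative sketch in Python. It is not taken from the standard. The helper name `split_base_and_marks` is hypothetical, and Python's built-in `unicodedata` module reflects whichever Unicode version ships with the interpreter, not necessarily Version 4.0.1.

```python
import unicodedata

def split_base_and_marks(text):
    """Hypothetical helper: decompose text (NFD) and pair each base
    character with the combining marks that follow it."""
    clusters = []
    for ch in unicodedata.normalize("NFD", text):
        # combining() returns a non-zero class for combining marks
        if unicodedata.combining(ch) and clusters:
            clusters[-1][1].append(ch)
        else:
            clusters.append((ch, []))
    return clusters

composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE (one code point)
decomposed = "e\u0301"  # base letter e + COMBINING ACUTE ACCENT (two code points)

# The two spellings differ as code point sequences but are canonically equivalent.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed

for base, marks in split_base_and_marks(composed):
    print(base, [unicodedata.name(m) for m in marks])
```

Under these assumptions, the accented letter can be stored either as one precomposed character or as a base letter followed by a combining mark, and normalization lets software treat both forms as the same text.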

Full technical details regarding the Unicode Standard, Version 4.0.1 are published online at http://www.unicode.org/versions/Unicode4.0.1/ .

The book version of the Unicode Standard, Version 4.0, which Version 4.0.1 amends, was published by Addison-Wesley in September of 2003 (ISBN 0-321-18578-1). For full information, including the online edition and the book order form, see http://www.unicode.org/versions/Unicode4.0.0/ .

About the Unicode Standard

The Unicode Standard provides a uniform architecture and encoding for all languages of the world, with over 95,000 characters currently encoded. Unicode is a fundamental component for providing seamless data interchange around the world, and has been adopted by such industry leaders as Adobe, Apple, HP, IBM, Justsystem, Microsoft, Oracle, SAP, Sun, Sybase, Unisys and many others. Unicode is required by modern standards such as XML, Java, C#, ECMAScript (JavaScript), LDAP, CORBA 3.0, and WML, and is the official way to implement ISO/IEC 10646. It is supported in many operating systems, all modern browsers, and many other products. For additional information on Unicode or the Unicode Consortium, please visit http://www.unicode.org.

About the Unicode Consortium

The Unicode Consortium is a non-profit organization founded to develop, extend and promote use of the Unicode Standard, which specifies the representation of text in modern software products and standards. The consortium works very closely with the INCITS L2 committee (http://incits.org/incits/tc_home/l2.htm) and with ISO/IEC JTC 1/SC 2.

The Unicode Standard is a major component in the globalization of e-business, as the marketplace continues to demand technologies that enhance seamless data interchange throughout companies' extended, and often international, networks of suppliers, customers and partners. Unicode is the default text representation in XML, an important open standard being rapidly adopted throughout e-business technology.

The membership of the consortium represents a broad spectrum of corporations and organizations in the computer and information processing industry. Full members (the highest level) are: Adobe Systems, Apple Computer, Basis Technology, Government of India Ministry of Information Technology, Government of Pakistan National Language Authority, HP, IBM, Justsystem, Microsoft, Oracle, PeopleSoft, RLG, SAP, Sun Microsystems, Sybase.

Membership in the Unicode Consortium is open to organizations and individuals anywhere in the world who support the Unicode Standard and wish to assist in its extension and implementation. For additional information on Unicode, please contact the Unicode Consortium (http://www.unicode.org/).