BETA Unicode 4.0.1
The next version of the Unicode Standard will be Version 4.0.1,
due for release in March 2004. A BETA version of
the updated Unicode Character Database files is available for public comment.
We strongly encourage implementers
to download these files and test them with their programs, well
before the end of the beta period. These files are located in
http://www.unicode.org/Public/Unicode4.0-Update1/
Any comments on the beta Unicode Character Database should be
reported using the Unicode
reporting form. The comment period ends January 27, 2004.
All substantive comments must be received by that date for
consideration at the next UTC meeting. Editorial comments (typos,
etc.) may be submitted after that date for consideration in the final
editorial work.
Note: All beta files may be updated, replaced, or
superseded by other files at any time. The beta files will be
discarded once Unicode 4.0.1 is final. It is inappropriate to cite
these files as other than a work in progress.
New Unihan Data
The main focus of the release of the Unicode 4.0.1 update is to
make Unihan.txt available with a large number of fixes and additions
since Unicode 3.2.0 -- fixes that were not available in time to be
released with the Unicode Character Database for Unicode 4.0.0.
Unihan.txt is available in the beta directory as a plain text file,
and also as gzipped and WinZipped files. For beta evaluation,
please download one of the compressed versions, if you can handle
it, to reduce the bandwidth burden of downloading the very
large uncompressed Unihan.txt text file.
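For implementers who want to test the gzipped file without unpacking it first, the following Python sketch may help. It assumes the usual Unihan.txt layout of one record per line with three tab-separated columns (code point as U+XXXX, field name, value) and comment lines starting with "#"; please confirm this against the header of the beta file itself. The function names and the path passed to load_unihan_gz are hypothetical, and UTF-8 encoding is also an assumption.

```python
import gzip
from collections import defaultdict


def parse_unihan(lines):
    """Group Unihan records by code point.

    Assumes each data line is 'U+XXXX<TAB>field<TAB>value' and that
    lines starting with '#' are comments (check the file header).
    """
    records = defaultdict(dict)
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t", 2)
        if len(parts) != 3 or not parts[0].startswith("U+"):
            continue  # skip anything that is not a three-column record
        code_point = int(parts[0][2:], 16)
        records[code_point][parts[1]] = parts[2]
    return dict(records)


def load_unihan_gz(path):
    """Read a gzipped Unihan file directly (path is supplied by the caller)."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return parse_unihan(handle)
```

A test program could then compare a field such as kMandarin across the old and new data by looking up records[code_point].get("kMandarin") in each parsed file.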
Other Updates
Other updates for Unicode 4.0.1 include:
Known Issues
In Unihan.txt, decompositions for some CJK compatibility
characters have not yet been updated to match
Technical Corrigenda #3 and
#4. Some of the compatibility mappings in Unihan.txt need to be
updated, and some Mandarin readings need to be renormalized (in the
kMandarin field). Unihan.txt still needs to have its IRG sources
synchronized with 10646:2003.
The "RELEASE NOTES" and "KNOWN ERRORS" sections of Unihan.txt list corrections and known errors.