L2/01-169
Title: Normative Changes in Unicode 3.1
Distribution: Unicode Liaisons
Date: April 17, 2001
Source: Lisa Moore, Unicode Technical Committee
The Unicode Standard 3.1 was published on March 30, 2001, and can be found on
the Unicode web site at: http://www.unicode.org/unicode/reports/tr27
Listed here for your convenience is a summary of the normative changes in this
revision of the Standard. However, we would still urge you to review the entire
document for any changes relevant to your organization.
Normative changes in Unicode 3.1:
New character allocations: Unicode 3.1 adds 44,946 encoded characters:
42,711 new ideographs, many additional symbols, historical scripts, and tag
characters.
Supplementary characters: Characters are now encoded beyond the original
16-bit codespace (BMP).
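As an illustration of what encoding beyond the 16-bit codespace means for UTF-16 implementations, the following minimal sketch splits a supplementary code point into a surrogate pair. The function name `to_utf16_code_units` is hypothetical and chosen only for this example; the arithmetic is the standard UTF-16 surrogate calculation.

```python
def to_utf16_code_units(code_point: int) -> list:
    """Return the UTF-16 code units (as integers) for one code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode codespace")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded")
    if code_point < 0x10000:
        # BMP characters still fit in a single 16-bit code unit.
        return [code_point]
    # Supplementary characters need a high and a low surrogate.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]


# U+20000 is the first code point of plane 2; U+1D11E lies in plane 1.
print([hex(u) for u in to_utf16_code_units(0x20000)])
print([hex(u) for u in to_utf16_code_units(0x1D11E)])
```

Software that assumed one 16-bit unit per character would need to handle the two-unit case to process these characters.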
Amended data files: Data files have been updated to account for the new
repertoire of characters.
UTF-32: Unicode now has three sanctioned
encoding forms: UTF-8, UTF-16, and UTF-32.
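To make the three encoding forms concrete, this short sketch counts how many code units each form needs for a given string. It relies on Python's built-in codecs (strictly speaking, byte-serialized encoding schemes, used here only to count units); the helper name `code_unit_counts` is an assumption made for the example.

```python
def code_unit_counts(text: str) -> dict:
    """Count code units per encoding form (8-, 16-, and 32-bit units)."""
    return {
        "UTF-8": len(text.encode("utf-8")),            # 1 byte per unit
        "UTF-16": len(text.encode("utf-16-be")) // 2,  # 2 bytes per unit
        "UTF-32": len(text.encode("utf-32-be")) // 4,  # 4 bytes per unit
    }


print(code_unit_counts("A"))           # a BMP character
print(code_unit_counts("\U00020000"))  # a supplementary character
```

A supplementary character takes four units in UTF-8, two in UTF-16, and one in UTF-32, which is the practical trade-off among the three forms.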
Noncharacters: Thirty-two more noncharacters have
been added and the status of all sixty-six noncharacters has been clarified.
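The sixty-six noncharacters can be identified with a simple range test. The sketch below assumes the usual breakdown: the thirty-two code points U+FDD0..U+FDEF, plus the last two code points of each of the seventeen planes. The function name `is_noncharacter` is hypothetical.

```python
def is_noncharacter(code_point: int) -> bool:
    """Return True if the code point is one of the 66 noncharacters."""
    if not 0 <= code_point <= 0x10FFFF:
        return False
    # The contiguous block of thirty-two noncharacters in the BMP.
    if 0xFDD0 <= code_point <= 0xFDEF:
        return True
    # U+nFFFE and U+nFFFF at the end of every plane (n = 0..16).
    return (code_point & 0xFFFE) == 0xFFFE


total = sum(is_noncharacter(cp) for cp in range(0x110000))
print(total)
```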
UTF-8 corrigendum: The definition of UTF-8 has been corrected for a security
issue: conformant implementations now cannot interpret non-shortest forms for
BMP characters.
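A non-shortest form is a byte sequence that encodes a character in more bytes than necessary, for example C0 AF as a two-byte encoding of U+002F SOLIDUS. The sketch below uses Python's built-in strict UTF-8 decoder, which is assumed here to stand in for a conformant implementation, to show such sequences being rejected. The helper name `accepts_as_utf8` is hypothetical.

```python
def accepts_as_utf8(data: bytes) -> bool:
    """Return True if the strict UTF-8 decoder accepts the byte sequence."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


print(accepts_as_utf8(b"/"))             # shortest form of U+002F
print(accepts_as_utf8(b"\xc0\xaf"))      # two-byte non-shortest form
print(accepts_as_utf8(b"\xe0\x80\xaf"))  # three-byte non-shortest form
```

Rejecting these sequences matters for security because a filter that checks for "/" in the decoded text could otherwise be bypassed by an alternate byte sequence.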
Special character properties: Music format control characters were added to
the list of special character properties.
UAX #15 Unicode Normalization Forms: U+FB1D HEBREW LETTER YOD WITH HIRIQ has
been added to the Composition Exclusion List.
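The practical effect of a composition exclusion is that the character is decomposed under normalization and never recomposed. The following sketch shows this with Python's `unicodedata` module, which ships with a later version of the Unicode Character Database than 3.1; it is offered as an illustration, not as part of the Standard.

```python
import unicodedata

yod_with_hiriq = "\uFB1D"    # the precomposed character
decomposed = "\u05D9\u05B4"  # HEBREW LETTER YOD + HEBREW POINT HIRIQ

# The canonical decomposition listed in the character database.
print(unicodedata.decomposition(yod_with_hiriq))

# NFC decomposes the excluded character and does not recompose it.
nfc_of_precomposed = unicodedata.normalize("NFC", yod_with_hiriq)
nfc_of_decomposed = unicodedata.normalize("NFC", decomposed)
print(nfc_of_precomposed == decomposed)
print(nfc_of_decomposed == decomposed)
```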
New normative properties: All of the General Category values plus the case
mappings in UnicodeData.txt and SpecialCasing.txt are now normative; Cn is the
default value for the general category for all unassigned code points and
noncharacters.
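These properties can be inspected from most programming environments. As a minimal sketch, Python's `unicodedata.category` reports Cn for noncharacters, and `str.upper` applies a full case mapping of the kind recorded in SpecialCasing.txt. Python's data reflects a later Unicode version than 3.1, so this is illustrative only.

```python
import unicodedata

# Noncharacters carry the default General Category value Cn.
for code_point in (0xFDD0, 0xFFFF):
    print(f"U+{code_point:04X}", unicodedata.category(chr(code_point)))

# An assigned character has its own General Category value.
print("U+0041", unicodedata.category("A"))

# A one-to-many uppercase mapping: LATIN SMALL LETTER SHARP S becomes "SS".
sharp_s_upper = "\u00DF".upper()
print(sharp_s_upper)
```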
Normative references: The use of normative references to Unicode properties by
other specifications was clarified.