This page lists Public Review Issues numbered 100-176 which have been resolved, in
reverse order by issue number. The
link on the title points to a background document if any is available. For open issues,
please see the
Public Review Issues
page. Older resolved issues are found on
the page: Resolved Issues 1-99.
|
176 |
Properties of Two Khmer Characters
|
2010.10.25 |
The UTC is considering potential changes to the General_Category property values and default
collation weighting of two Khmer characters, U+17B4 KHMER VOWEL INHERENT AQ and U+17B5 KHMER VOWEL INHERENT AA. The UTC is seeking feedback on this topic. In particular, the UTC would be interested in learning of any current implementations which might be adversely affected by any of the proposed modifications to the General_Category and/or default collation weighting of these two characters. Please see the
background document for details on the proposal.
|
Resolution:
Closed 2010-11-08. The two characters will be changed from
format characters to ignorable non-spacing marks for
Unicode 6.1, so that their properties match more closely
the desired collation behavior. UCA 6.1 will also be
updated to make the characters ignorable for collation.
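
One way to see how a given runtime classifies these two characters is to query its bundled Unicode data. The following is a minimal sketch in Python, assuming only the standard `unicodedata` module; the helper name `khmer_inherent_vowel_categories` is made up for illustration. The result depends on the Unicode version shipped with the interpreter: Cf (format) before this change, Mn (non-spacing mark) once the change is included.

```python
import unicodedata


def khmer_inherent_vowel_categories():
    """Return {label: General_Category} for U+17B4 and U+17B5.

    The values come from the Unicode data bundled with this Python build,
    so they reflect whichever Unicode version that build carries.
    """
    return {
        f"U+{cp:04X}": unicodedata.category(chr(cp))
        for cp in (0x17B4, 0x17B5)
    }


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    for label, gc in khmer_inherent_vowel_categories().items():
        print(label, gc)
```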
|
|
175 |
CLDR 1.9 Collation Changes
|
2010.10.01 |
The Unicode CLDR committee is making Unicode locale-sensitive collation a major focus for the next release, CLDR 1.9. There are specific changes for a large number of languages, plus a change in the default ordering of punctuation vs symbols for all languages.
See the background document
for more information. If you have any feedback on any of the actions, please file a ticket with CLDR as described there.
|
Resolution: Closed 2010-11-08. The items in PRI #175 were accepted by the CLDR committee, with the following changes:
- Backwards-secondaries were removed from all but fr-CA.
- The code point U+FFFE is also tailored to have a weight lower than all other characters, and further tailoring of U+FFFE is disallowed for other collation variants. This allows reliable interleaving of fields in a database, such as "Smith\uFFFEJohn".
- Characters below 'a' are in 5 contiguous groups: space, punct, symbol, currency, digit.
For more information, see
https://www.unicode.org/Public/UCA/6.0.0/CollationAuxiliary.html
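
The field-interleaving idea in the U+FFFE item above can be illustrated with a toy model. This is only a sketch: `toy_sort_key` is a hypothetical stand-in for a real collator and simply gives U+FFFE the lowest weight and every other character its code point plus one. It is not the UCA or CLDR weighting. Under that assumption, sorting the merged strings gives the same order as sorting field by field.

```python
SEP = "\uFFFE"  # separator assumed to sort below every other character


def merged_key(*fields):
    """Join fields into one string, e.g. 'Smith\\uFFFEJohn'."""
    return SEP.join(fields)


def toy_sort_key(s):
    # Toy weighting (an assumption, not real collation data):
    # U+FFFE gets weight 0, everything else gets code point + 1.
    return [0 if ch == SEP else ord(ch) + 1 for ch in s]


names = [("Smith-Jones", "Al"), ("Smith", "Zoe"), ("Smith", "John")]

# Sort the merged strings with the toy key.
merged_sorted = sorted((merged_key(*n) for n in names), key=toy_sort_key)

# Sort field by field (tuple comparison) for reference.
fieldwise_sorted = [merged_key(*n) for n in sorted(names)]
```

Because the separator is lower than any character that can continue a field, "Smith" followed by the separator always sorts before "Smith-Jones", which is what makes the merged string a reliable substitute for a multi-field comparison.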
|
|
174 |
Proposed Draft UTR #49: Unicode Character Categories
|
2011.01.31 |
This document presents an approach to the categorization of Unicode characters,
and documents a data file that implementers can use for defining Unicode character categories.
Draft updated 2010.11.11.
|
Resolution: Closed 2011-02-28. Proposed Draft UTR
#49 will
be advanced to Draft UTR #49. |
|
173 |
Invariant Tests
|
2010.08.02 |
An internal file of machine-readable data is used to test Unicode invariants
for each release of Unicode. This PRI proposes to add that file to the
Unicode Character Database (UCD), making it available for public use. The
data documents what is tested prior to the release of a version of the UCD,
and can also be used for testing implementations, where desired. UAX #44
would be augmented with a short section documenting the structure and usage
based on the header of that file.
We would appreciate any feedback as to whether this file
should be part of the UCD.
The file UnicodeInvariantTest.txt would be included in the UCD.
The file UnicodeTestResults.html would not be included in the UCD,
but is given here for reference. It shows an annotated version of
the UnicodeInvariantTest.txt file, where tables are added showing
the results of assignment statements and test failures,
in this case based on beta data for Unicode 6.0.
Many of the invariants are stability constraints from the Unicode
Stability Policies. Each of those is marked with "Stability" in the preceding
comment. Other invariants are property constraints established by other
standards, such as the Regex properties alpha, alphanum, etc.
Others are "red flag" invariants, which are simply used to detect
when a change in property value might be problematic. Typically those have
a set of exceptions (inclusions or exclusions) that are modified
for each release.
|
Resolution: Closed 2010-08-13. A new draft UTR will
be produced on this topic. |
|
172 |
Proposed Update UTS #46: Unicode IDNA Compatibility Processing
|
2010.09.15 |
The data and text for UTS #46 are being updated to synchronize with Unicode 6.0. In addition, conformance tests are being made available. The files are available in: https://www.unicode.org/Public/idna/6.0.0/
We would like to get feedback on the data tables, and the format and contents of the conformance text. Any suggestions for additional test cases for the conformance tests would be appreciated.
The proposed update text for UTS #46, Version 6.0
is available. The two changes are:
- The addition of a new section describing the conformance tests.
- Addition of two new status values in support of implementations that need to turn the STD3 rules off.
|
Resolution: Closed 2010-11-08. The document will be updated with final content and published for Unicode 6.0. |
|
171 |
Proposal to change properties of U+06DE ARABIC START OF RUB EL HIZB
|
2010.08.02 |
The UTC is considering a proposal to change the properties of
U+06DE ARABIC START OF RUB EL HIZB from a combining mark to a
spacing symbol, to better match its actual usage. The
background
document contains a full explanation and discussion of the issue.
Public feedback is being sought about this proposed change.
|
Resolution: Closed 2010-08-13. The character
properties will be updated and published as part of Unicode 6.0. |
|
170 |
Unicode 6.0.0 Beta |
2010.08.02 |
The next version of the Unicode Standard will be Version 6.0.0. The beta
information page for Unicode 6.0.0 is located at:
https://www.unicode.org/versions/beta-6.0.0.html
This version is planned for release in September 2010. A beta version of the
6.0.0 Unicode Character Database files is also available for public comment. We
strongly encourage implementers to download these files and test them with their
programs, well before the end of the beta period, August 2, 2010. These files
are located in:
https://www.unicode.org/Public/6.0.0/
For detailed information and guidance on how to focus your review, see the
section Notable Issues for Beta Testers on the beta page.
The Unicode Collation
Algorithm (UCA) will be released in parallel with Unicode 6.0.0, and
a beta version of the UCA is available at
https://www.unicode.org/Public/UCA/6.0.0/.
See also
PRI #166.
The beta information page tells how to report comments and initiate
discussions.
|
Resolution: Closed 2010-08-13. The release was
approved and will be finalized and published. |
|
169 |
Glyph Variation of Double Oblique Hyphen |
2010.08.02 |
Recently, the UTC was presented with evidence that indicates that the DOUBLE OBLIQUE HYPHEN is used in both oblique and horizontal forms. Therefore the committee is considering adding an annotation to U+2E17 DOUBLE OBLIQUE HYPHEN, indicating that it may appear in either an oblique or horizontal form. Public input on the suitability of this is being sought.
|
Resolution: Closed 2010-08-13. An annotation will
not be added. |
|
168 |
Two New Provisional Properties for Characters in Indic Scripts |
2010.05.03 |
The UTC is considering the addition of two new, enumerated
provisional character properties for Indic scripts:
Indic_Syllabic_Category and Matra_Placement. These are to
assist in the analysis and processing of syllables for
various Brahmi-derived scripts, providing classificatory
information that is not easy to extract or derive for all
of the Indic scripts in the standard. Feedback is welcome
on the construction of the proposed properties, the details
of the proposed assignment of values for characters, and
on the question of the usefulness of defining such properties. (Data
updated 2010-04-30)
|
Resolution: Closed 2010-05-21. These properties will be added to Unicode 6.0
as provisional properties, with two data files and adjustments of the
data. |
|
167 |
Ideographic Variation Database Submission
|
2010.06.25 |
The Ideographic Variation Database provides a registry for collections of unique variation sequences containing unified ideographs, allowing for standardized interchange according to
UTS #37, Ideographic Variation Database. A submission to the Ideographic Variation Database has been received for: "Combined registration of the Hanyo-Densi collection and of sequences in that collection". Details are in the background document.
|
Resolution: Closed 2010-08-13. The submission will
be registered and is pending final update. |
|
166 |
Proposed Update UTS #10: Unicode Collation Algorithm
|
2010.08.02 |
This UTS will be updated to synchronize with Unicode 6.0, and the proposed update
is now open for general public review and comment. The text has been reorganized for
better text flow, and there are significant editorial corrections throughout.
There has also been a major rewrite of the discussion
of "illegal" and "legal" code points. See Sections 7.1.1
and 7.1.2 for details. (Draft updated 2010-07-09)
|
Resolution: Closed 2010-08-13. The document will be
updated with final content and published for Unicode 6.0. |
|
165 |
Proposed Update UAX #42: Unicode Character Database in XML
|
2010.08.02 |
This UAX will be updated for Unicode 6.0, and the proposed update
is now open for general public review and comment. (Draft updated
2010-05-20)
|
Resolution: Closed 2010-08-13. The UAX will be
updated with final content and published as part of Unicode 6.0. |
|
164 |
Proposed Update UAX #41: Common References for Unicode Standard Annexes
|
2010.08.02 |
This UAX will be updated for Unicode 6.0, and the proposed update
is now open for general public review and comment.
|
Resolution: Closed 2010-08-13. The UAX will be
updated with final content and published as part of Unicode 6.0. |
|
163 |
Proposed Update UAX #38: Unicode Han Database (Unihan) |
2010.08.02 |
This UAX will be updated for Unicode 6.0, and the proposed update
is now open for general public review and comment.
|
Resolution: Closed 2010-08-13. The UAX will be
updated with final content and published as part of Unicode 6.0. |
|
162 |
Proposed Update UAX #34: Unicode Named Character Sequences |
2010.08.02 |
This UAX will be updated for Unicode 6.0, and the proposed update
is now open for general public review and comment. (Draft updated
2010-05-21)
|
Resolution: Closed 2010-08-13. The UAX will be
updated with final content and published as part of Unicode 6.0. |
|
161 |
Proposed Update UAX #31: Unicode Identifier and Pattern Syntax |
2010.08.02 |
This UAX will be updated for Unicode 6.0, and the proposed update
is now open for general public review and comment. (Draft updated
2010-05-21)
|
Resolution: Closed 2010-08-13. The UAX will be
updated with final content and published as part of Unicode 6.0. |
|
160 |
Proposed Update UAX #29: Unicode Text Segmentation |
2010.08.02 |
This UAX will be updated for Unicode 6.0, and the proposed update
is now open for general public review and comment.
|
Resolution: Closed 2010-08-13. The UAX will be
updated with final content and published as part of Unicode 6.0. |
|
159 |
Proposed Update UAX #24: Unicode Script Property |
2010.08.02 |
This UAX will be updated for Unicode 6.0, and the proposed update
is now open for general public review and comment.
Added discussion of multiple script values; added documentation
regarding the new provisional data file ScriptExtensions.txt.
(Draft updated 2010-06-02)
|
Resolution: Closed 2010-08-13. The UAX will be
updated with final content and published as part of Unicode 6.0. |
|
158 |
Proposed Update UAX #14: Unicode Line Breaking Algorithm |
2010.08.02 |
This UAX will be updated for Unicode 6.0, and the proposed update
is now open for general public review and comment. (Draft updated
2010-06-08)
|
Resolution: Closed 2010-08-13. The UAX will be
updated with final content and published as part of Unicode 6.0. |
|
157 |
Proposed Update UAX #11:
East Asian Width |
2010.08.02 |
This UAX will be updated for Unicode 6.0, and the proposed update
is now open for general public review and comment.
|
Resolution: Closed 2010-08-13. The UAX will be
updated with final content and published as part of Unicode 6.0. |
|
156 |
Proposed Update UAX #9: Unicode Bidirectional Algorithm |
2010.08.02 |
This UAX will be updated for Unicode 6.0, and the proposed update
is now open for general public review and comment. This revision
contains clarifications around the use of higher-level protocols in
section 4.3. (Draft updated 2010-05-21)
|
Resolution: Closed 2010-08-13. The UAX will be
updated with final content and published as part of Unicode 6.0. |
|
155 |
Proposed Update UTS #39: Unicode Security Mechanisms |
2010.01.26 |
- The confusable data was revised to add data extracted from a comparison
of font data from Windows and Mac
- Additional mappings were also added, such as "rn" ~ "m"
- The characters recommended for identifiers were updated based on UAX 31
For review of the data and suggesting changes:
- The most useful view of the confusables data is the
confusablesSummary file. This file groups all the confusables
together. Note that the results may vary depending on the font used. Also,
some "unnatural" confusables are added by transitivity (between characters,
or between NFKC_Casefold equivalents).
- The most useful view of the identifier restrictions is the
xidmodifications file.
You can suggest changes with the form at
security-mechanisms.
Draft updated 2010-02-04.
|
Resolution: Closed 2010-02-10. The draft will be
modified according to feedback and advanced to approved UTS. |
|
154 |
Proposed Update UTR #36: Unicode Security Considerations |
2010.01.26 |
This revision adds two new sections: 3.6 Secure Encoding Conversion and 3.7 Enabling Lossless Conversion to Unicode. Draft updated 2010-02-04.
|
Resolution: Closed 2010-02-10. The draft will be
modified according to feedback and advanced to approved UTR. |
|
153 |
Proposal to Deprecate Five Character Properties Defined in UAX #44 |
2010.01.26 |
The Unicode Technical Committee is considering the deprecation of the
property
FC_NFKC_Closure. The purpose for which this property was originally
created has been superseded by the NFKC_Casefold property. The UTC is also
considering the deprecation of the four encoding properties
Expands_On_NFC,
Expands_On_NFD, Expands_On_NFKC, and Expands_On_NFKD. Those properties
are easily computed, and do not cover the two most common encoding forms,
UTF-8 and UTF-16. Information on all five of these properties can be found
in the proposed update of
UAX #44:
Unicode Character Database, by following the links above.
Public feedback on this issue is invited.
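
To illustrate why the four Expands_On_* properties are described as easily computed, here is a minimal sketch in Python using the standard `unicodedata` module. The function name `expands_on` is hypothetical, and the check counts code points in the normalized result, which is an assumption about what "expands" means here rather than a quotation of the property definitions.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def expands_on(ch: str, form: str) -> bool:
    """True if the single character `ch` normalizes to more than one code point."""
    if len(ch) != 1:
        raise ValueError("expected exactly one code point")
    if form not in FORMS:
        raise ValueError(f"unknown normalization form: {form}")
    # len() on a Python str counts code points, not bytes or UTF-16 units.
    return len(unicodedata.normalize(form, ch)) > 1


if __name__ == "__main__":
    for ch in ("\u00E9", "\uFB01"):  # e with acute, fi ligature
        flags = {form: expands_on(ch, form) for form in FORMS}
        print(f"U+{ord(ch):04X}", flags)
```

For example, U+00E9 expands under NFD and NFKD but not under NFC, while the ligature U+FB01 expands only under the compatibility forms.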
|
Resolution: Closed 2010-02-10. The 5 properties (FC_NFKC_Closure,
Expands_On_NFC, Expands_On_NFD, Expands_On_NFKC, and Expands_On_NFKD) will
be deprecated in Unicode 6.0. |
|
152 |
Proposed Update UAX #15: Unicode Normalization Forms |
2010.08.02 |
This revision corrects the definitions of classes of
characters in the Composition Exclusion Table and
rewrites Section 11.3, "Guaranteeing Process Stability"
for clarity and correctness. Removed obsolete empty sections, consolidated
small sections, and reordered and renumbered the remaining
sections for better clarity and document flow. (Draft updated 2010-06-25)
|
Resolution: Closed 2010-08-13. The UAX will be
updated with final content and published as part of Unicode 6.0. |
|
151 |
Proposed Update UAX #44: Unicode Character Database |
2010.08.02 |
This revision indicates the changed status of several properties
as Deprecated, adds tables listing Deprecated and Stabilized
properties, and extends the discussion of the significance of the
Bidi_Mirroring_Glyph property. The Property Summary table has been
moved to the front of Section 5 and renamed the Property Index. A
clarification of the loose matching rule for character names has
been added.
The loose matching rule for symbolic values (UAX44-LM3) has been
extended to account for ignoring any initial prefix strings "is". (Draft updated 2010-07-07)
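
The extended loose matching rule can be sketched as a small normalization function. This is an illustrative Python sketch, not the normative text of UAX44-LM3: it assumes the rule ignores case, whitespace, underscores, and hyphens, and then drops an initial "is" prefix. The names `loose_key` and `loose_match` are made up for this example.

```python
import re


def loose_key(value: str) -> str:
    """Reduce a symbolic property value to a comparison key."""
    # Drop whitespace, underscores, and hyphens, then fold case.
    key = re.sub(r"[\s_\-]+", "", value).lower()
    # Ignore an initial "is" prefix, as described for the extended rule.
    if key.startswith("is"):
        key = key[2:]
    return key


def loose_match(a: str, b: str) -> bool:
    """True if two symbolic values are equal under the sketched rule."""
    return loose_key(a) == loose_key(b)
```

Under this sketch, `loose_match("isGreek", "greek")` and `loose_match("Line-break", "linebreak")` both hold, while `loose_match("Greek", "Latin")` does not.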
|
Resolution: Closed 2010-08-13. The UAX will be
updated with final content and published as part of Unicode 6.0. |
|
150 |
Draft UTS #46: Unicode IDNA Compatibility Processing |
2010.01.26 |
This document provides a specification for processing that provides for compatibility between older and newer versions of internationalized domain names (IDN) for lookup in client software. It allows applications such as browsers and emailers to be able to handle both the original version of internationalized domain names (IDNA2003) and the newer version (IDNA2008) compatibly, avoiding possible interoperability and security problems.
(Draft updated 2010-02-04)
|
Resolution: Closed 2010-02-10. The draft UTS will
be modified according to feedback and published as an approved UTS. |
|
149 |
Proposed Update UTS #22: Unicode Character Mapping Markup Language (CharMapML) |
2009.08.03 |
This proposed update includes editorial fixes and clarifications based on
community feedback. There is a small change in the DTD from
version three
to this proposed
version five (a new default attribute value). See the Modification History and the highlighted changes for details. |
Resolution: Closed 2009-08-21. The document will be
updated with changes based on feedback and published. |
|
148 |
Unicode 5.2.0 Beta |
2009.08.03 |
The next version of the Unicode Standard will be Version 5.2.0. The beta
information page for Unicode 5.2.0 is located at:
https://www.unicode.org/versions/beta-5.2.0.html
This version is planned for release in October 2009. A beta version of the
5.2.0 Unicode Character Database files is also available for public comment. We
strongly encourage implementers to download these files and test them with their
programs, well before the end of the beta period, August 3, 2009. These files
are located in:
https://www.unicode.org/Public/5.2.0/
For detailed information and guidance on how to focus your review, see the
section Notable Issues for Beta Testers on the beta page.
The Unicode Collation
Algorithm (UCA) will be released in parallel with Unicode 5.2.0, and
a beta version of the UCA is available at
https://www.unicode.org/Public/UCA/5.2.0/.
See also
PRI #143.
The beta information page tells how to report comments and initiate
discussions.
|
Resolution: Closed 2009-09-24. The UTC has taken account
of all feedback received during beta review and will release Unicode 5.2 early
in October, 2009. |
|
147 |
Proposed Deprecation of U+0673 ARABIC LETTER ALEF WITH WAVY HAMZA BELOW |
2009.10.26 |
The UTC has recently approved a proposal to encode an
ARABIC WAVY HAMZA BELOW for a future version of the Unicode
Standard. That character is used productively in Kashmiri
and other languages, and is applied to letters other than
ALEF. The intent is to deprecate the existing character
U+0673 ARABIC LETTER ALEF WITH WAVY HAMZA BELOW, in favor
of the sequence of an ALEF plus the new ARABIC WAVY HAMZA
BELOW. (Because of normalization stability constraints,
a canonical equivalence relation cannot be established.)
The UTC is seeking feedback on whether U+0673 should be
deprecated when ARABIC WAVY HAMZA BELOW is encoded. Pertinent
information would include data on how widespread usage of this
character is. Note that deprecation of a character does not
mean removal of that character from the standard; it merely
constitutes a strong recommendation not to use the character.
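
The absence of a canonical equivalence can be checked directly. The sketch below uses Python's standard `unicodedata` module and assumes that the new combining character is U+065F; that code point is not stated in the text above and is taken from the later published standard. The helper `canonically_equivalent` is a hypothetical name.

```python
import unicodedata

DEPRECATED = "\u0673"        # ARABIC LETTER ALEF WITH WAVY HAMZA BELOW
ALEF = "\u0627"              # ARABIC LETTER ALEF
WAVY_HAMZA_BELOW = "\u065F"  # assumed code point of ARABIC WAVY HAMZA BELOW


def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    # An empty decomposition mapping means U+0673 has no decomposition.
    print(repr(unicodedata.decomposition(DEPRECATED)))
    print(canonically_equivalent(DEPRECATED, ALEF + WAVY_HAMZA_BELOW))
```

Since the two spellings are not canonically equivalent, normalization will not unify them; software that wants to treat them alike has to do so explicitly.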
|
Resolution: Closed 2009-11-13. The character U+0673
will be deprecated in Unicode 6.0, when the new combining character ARABIC
WAVY HAMZA BELOW is encoded. |
|
146 |
Suggested Restructuring of Text in Chapter 3 for
Clarification of Unicode Normalization |
2009.08.03 |
In order to consolidate the formal specification of Unicode
normalization into a single location, text derived from UAX #15
will be incorporated into a rewritten Section 3.11 of the
book text. The details are provided in the background document. This text change does not result in any substantive
change to the definition of Unicode normalization. All Unicode
strings currently in a normalization form will continue to
be in that normalization form. All conformant implementations
of the Unicode Normalization Algorithm will continue to be
conformant. Feedback for this PRI should carefully consider
the closely related PRI #145 which addresses the correlated changes
of text required for UAX #15.
|
Resolution: Closed 2009-09-01. These text changes
will be published as part of Unicode 5.2. |
|
145 |
Proposed
Update UAX #15: Unicode Normalization Forms |
2009.08.03 |
This UAX will be updated for Unicode 5.2, and the proposed update is
now open for general public review and comment. The main
change for this update of the UAX is the proposed consolidation
of the text for the formal specification of normalization forms
into Chapter 3 of the book text. This means that the entire specification
will be located in one place, instead of being split between two
locations. This text change does not result in any substantive
change to the definition of Unicode normalization. All Unicode
strings currently in a normalization form will continue to
be in that normalization form. All conformant implementations
of the Unicode Normalization Algorithm will continue to be
conformant. Feedback for this PRI should carefully consider
the closely related PRI #146 which addresses the correlated changes
of text for Chapter 3. The draft will be periodically updated
during the development cycle for the release.
Draft updated 2009-06-19.
|
Resolution: Closed 2009-08-21. The document will be
published as part of Unicode 5.2. |
|
144 |
Proposed
Update UAX #42: Unicode Character Database in XML |
2009.08.03 |
This UAX will be updated for Unicode 5.2, and the proposed update
is now open for general public review and comment. The draft will
be periodically updated during the development cycle for the release.
Draft posted 2009-03-13: Added "two-code-points"
as a datatype for code points in the schema and adjusted
several definitions accordingly. |
Resolution: Closed 2009-08-21. The document will be
published as part of Unicode 5.2. |
|
143 |
Proposed
Update UTS #10: Unicode Collation Algorithm |
2009.09.23 |
This UTS will be updated in parallel with Unicode 5.2, and the proposed update
is now open for general public review and comment. The draft will
be periodically updated during the development cycle for the release. Draft updated
2009-09-24:
-
The text of UTS #10 has been updated. See the modifications
section for details:
https://www.unicode.org/reports/tr10/tr10-19.html#Modifications. Among other changes, the revised text for UTS #10 makes it clear that the BASE for implicit generation of weights for Han characters does not include unassigned code points.
-
The default table contains weights for all newly assigned
characters. See:
https://www.unicode.org/Public/UCA/5.2.0/allkeys-5.2.0.txt.
That directory also contains collation test information that was
current as of a slightly earlier version of allkeys.txt.
Please note the following changes and issues for implementation.
-
There are small changes in Gujarati, Telugu, Malayalam (including
weighting for chillus), Tamil, and Sinhala. While these changes move in
the direction of expected behavior, good results will only come from
tailoring for particular languages, such as with CLDR.
-
There have been significant changes to the ordering of many combining
marks. Many combining marks that are not in customary use in
modern languages now have the same secondary weight, and will
only be distinguished on a fourth level, by code point ordering.
This can be seen by comparing
https://www.unicode.org/charts/collation/chart_Ignorable.html (UCA5.1) with
https://www.unicode.org/charts5.2/collation/chart_Ignorable.html (UCA5.2,
temporary location). Note that in 5.2, many characters have a white
background, indicating that they sort exactly the same as the previous
character, unless a 4th (codepoint) level is used.
-
Implementations of UCA should take note that the increased number of
characters may cause overflows if the implementing code makes
certain assumptions or optimizations. This can result either from
the new character additions (which increase the number of distinct
weights in the table) or because of changes in the way the weights,
particularly for secondary weight values, are assigned in the table.
The latter change may result in unexpected numbers of characters
having the same weight.
|
Resolution: Closed 2009-10-19. The document and associated data file will be published
as Version 5.2 of the Unicode Collation Algorithm. |
|
142 |
Proposed
Update UAX #41: Common References for Unicode Standard Annexes |
2009.08.03 |
This UAX will be updated for Unicode 5.2, and the proposed update
is now open for general public review and comment. The draft will
be periodically updated during the development cycle for the release.
|
Resolution: Closed 2009-08-21. The document will be
updated with feedback and published. |
|
141 |
Proposed
Update UAX #38: Unicode Han Database (Unihan) |
2009.08.03 |
This UAX will be updated for Unicode 5.2, and the proposed update
is now open for general public review and comment. The draft will
be periodically updated during the development cycle for the release.
This latest draft of UAX #38 for the Proposed Update now includes expanded
information about the new "Unihan.zip" archive format, and a revised table
structure for the Unihan Property descriptions.
Draft updated 2009-07-20. |
Resolution: Closed 2009-08-21. The document will be
published as part of Unicode 5.2. |
|
140 |
Proposed
Update UAX #34: Unicode Named Character Sequences |
2009.08.03 |
This UAX will be updated for Unicode 5.2, and the proposed update
is now open for general public review and comment. The draft will
be periodically updated during the development cycle for the release.
|
Resolution: Closed 2009-08-21. The document will be
published as part of Unicode 5.2. |
|
139 |
Proposed
Update UAX #31: Unicode Identifier and Pattern Syntax
|
2009.08.03 |
This UAX will be updated for Unicode 5.2, and the proposed update
is now open for general public review and comment. The draft will
be periodically updated during the development cycle for the release.
Draft updated 2009-06-22.
|
Resolution: Closed 2009-09-01. The document will be
published as part of Unicode 5.2. |
|
138 |
Proposed
Update UAX #29: Unicode Text Segmentation
|
2009.08.03 |
This UAX will be updated for Unicode 5.2, and the proposed update
is now open for general public review and comment. The draft will
be periodically updated during the development cycle for the release.
This update changes ZWSP to have the XX (Any) property for word boundary
determination, fixing a problem which was causing words not to break
at ZWSP. It also revises the section relating text boundaries to regular
expressions.
Draft updated 2009-03-31. |
Resolution: Closed 2009-08-21. The document will be
published as part of Unicode 5.2. |
|
137 |
Proposed
Update UAX #24: Unicode Script Property |
2009.08.03 |
This UAX will be updated for Unicode 5.2, and the proposed update
is now open for general public review and comment. The draft will
be periodically updated during the development cycle for the release.
Draft updated 2009-03-02: Section 3 has been substantially rewritten, in particular
to distinguish clearly between script designators and script property value aliases.
|
Resolution: Closed 2009-08-21. The document will be
published as part of Unicode 5.2. |
|
136 |
Proposed
Update UAX #14: Unicode Line Breaking Algorithm
|
2009.08.03 |
This UAX will be updated for Unicode 5.2, and the proposed update
is now open for general public review and comment. The draft will
be periodically updated during the development cycle for the release.
A new Line_Break class CP has been added, and the rule LB30
has been reintroduced, to address an edge case involving
parentheses. There are numerous other small changes to
the text, both substantive and editorial. Draft updated 2009-07-08.
|
Resolution: Closed 2009-08-21. The document will be
published as part of Unicode 5.2. |
|
135 |
Proposed
Update UAX #11: East Asian Width
|
2009.08.03 |
This UAX will be updated for Unicode 5.2, and the proposed update
is now open for general public review and comment. The draft will
be periodically updated during the development cycle for the release.
Draft updated 2009-03-13: Updated the description of the property value for unassigned codepoints.
|
Resolution: Closed 2009-08-21. The document will be
published as part of Unicode 5.2. |
|
134 |
Proposed
Update UAX #9: Unicode Bidirectional Algorithm |
2009.08.03 |
This UAX will be updated for Unicode 5.2, and the proposed update
is now open for general public review and comment. The draft will
be periodically updated during the development cycle for the release. There are explicit
notes requesting feedback on three open issues whose resolution could have
an effect on how bidi text is displayed. Feedback is welcome on these
issues. For a listing of the changes with links to the affected text, see
https://www.unicode.org/reports/tr9/tr9-20.html#Modifications.
This revision also includes a new conformance test file,
which implementers should carefully review. See BidiTest.txt
in the data files directory:
https://www.unicode.org/Public/5.2.0/ucd/
Draft updated 2009-07-08. |
Resolution: Closed 2009-08-21. The document will be
published as part of Unicode 5.2. |
|
133 |
Proposed Draft UTS #46: Unicode IDNA Compatible Preprocessing |
2009.08.03 |
This Proposed Draft UTS provides a specification for an internationalized domain name
preprocessing step that is intended for use with IDNAbis, the projected update for
Internationalized Domain Names. The proposed specification maintains compatibility with
IDNA2003 (the current version of Internationalized Domain Names), and consistently extends
that mechanism for characters introduced in any later Unicode version.
|
Resolution: Closed 2009-08-21. The draft was
superseded by a new proposed draft. |
|
132 |
Code Point Name/Label Options |
2009.01.26 |
After considering the feedback on
Public Review Issue #129 on Code Point Labels, the UTC discussed several options, which are now being presented for public review and comment. Details are in the background document. |
Resolution: Closed 2009-02-13. The UTC decided to
adopt option D of PRI #132. Changes will be made in the text as documented
in "Changes if we do option C" of the background document,
except for the fourth bullet of section 4.8. That bullet will become an
informative note about API and chart conventions. |
|
131 |
Han Exemplar characters |
2009.01.26 |
The Unicode Common Locale Data Repository (CLDR) contains exemplar characters for each locale/language. These are
the characters customarily needed for the language in question. For Han characters, the Unicode CLDR has
been using a fairly small set, but there is a request to include more of the commonly used
characters. There are a number of possible ways to derive this set, and the CLDR technical committee
would like feedback on this. Details are in the background document. |
Resolution: Closed 2009-03-25. Feedback has been
received for release 1.7. |
|
130 |
Word Break Property for ZWSP |
2009.01.26 |
The Unicode Technical Committee is considering changing the Word_Break property value
for ZWSP from the value WB=Format to the value WB=Other (WB=XX).
Details are in the background document. |
Resolution: Closed 2009-02-13. The word break property for ZWSP will be changed to "Other" for Unicode 5.2. |
|
129 |
Code Point Labels: Suggested
Wording Details |
2008.10.27 |
The UTC is seeking input on the proposed text to formally define
Unicode Code Point Labels. Code Point Labels would include
unique strings such as "<reserved-1FF0>" for code points which
have no assigned Unicode character and thus no formal Unicode
character name. Details are in the background document. |
Resolution: Closed 2008-11-14. A new public review
issue has been posted with new wording. Please see:
PRI #132: Code Point Name/Label Options |
|
128 |
Proposed Update UTS #37:
Ideographic Variation Database |
2009.08.03 |
The purpose of this second draft of the Proposed Update is to clarify the
conditions under which a glyphic subset is appropriate for a given base
character, following the UTC discussion. Details are in the
background document. Draft updated 2009-05-21. |
Resolution: Closed 2009-08-21. The report is
approved and will be published. |
|
127 |
Proposed Update UAX #44:
Unicode Character Database |
2009.08.03 |
This update is an extensive rewrite of UAX #44 in order to incorporate
all of the former content of UCD.html into the annex
and consolidate all of the documentation in one place.
The material from UCD.html has been reorganized, so that
the documentation is clearer and flows better. Substantial
new content documenting various aspects of character
properties and the UCD has been added as well.
Please review the text carefully for correctness.
The draft was updated on June 15, 2009.
|
Resolution: Closed 2009-09-01. The document will be
published as part of Unicode 5.2. |
|
126 |
Proposed Update UTR #17:
Unicode Character Encoding Model
|
2008.10.27 |
This technical report is being updated to correct the titles for various
references. The model has been resynched to bring it back up to date for Unicode
5.0. The text has also been edited fairly extensively for readability and
consistency. |
Resolution: Closed 2008-11-12. The report is
approved and will be published. |
|
125 |
Proposed Update UTR #33:
Unicode Conformance Model
|
2008.10.27 |
This technical report is being updated to correct the titles for various
references. The text has also been lightly edited. |
Resolution: Closed 2008-11-12. The report is
approved and will be published. |
|
124 |
Proposed Update UTR #23:
The Unicode Character Property Model
|
2008.10.27 |
This proposed update has a new note about constraints on new property additions.
Titles of some references have been updated, along with other minor editing.
Draft updated 2008-08-27. |
Resolution: Closed 2008-11-12. The report is
approved and will be published. |
|
123 |
Bengali Currency Numerator
Values |
2008.10.27 |
The UTC has recently decided to encode some new fraction characters. For the new sets of fraction characters, the UTC
has approved fractional numeric values consistent with the usage of the
characters to represent fractions. However, the current numeric values
associated with historically related Bengali characters U+09F4..U+09F8 are
inconsistent with those numeric value assignments. UTC proposes to update the
numeric values. Details are in the background document. |
Resolution: Closed 2008-11-14. The values will be
changed in accordance with the background document. |
|
122 |
Proposal for Additional Deprecated Characters |
2008.08.04 |
The Unicode Technical Committee is considering giving a number of additional
characters the Deprecated property. See the background
document for details. |
Resolution: Closed 2008-08-29. Changes will be made
to deprecated characters in the next version of the standard. |
|
121 |
Recommended Practice for Replacement Characters |
2008.08.04 |
The Unicode Technical Committee has been requested to specify what the
recommended practice is for replacement characters in converting ill-formed
subsequences. See the review document for further
explanation. |
Resolution: Closed 2008-08-29. The UTC decided to
adopt option 2 of the PRI. |
|
120 |
Draft UTR #45
U-Source Ideographs
|
2008.08.04 |
This new draft UTR #45 describes the
U-source ideographs as used by the IRG in its CJK ideograph
unification work. The draft is posted for public review and
comment.
|
Resolution: Closed 2008-08-29. The draft was
approved with changes based on feedback and will be published. |
|
119 |
Proposed
Update to UTR #25 Unicode Support for Mathematics
|
2008.08.01 |
The proposed update of UTR #25 makes minimal content changes,
mostly consisting of corrections for typographical errors. There
were also some minor formatting changes to enable generation
of the text in pdf format, instead of html, and for better typography
of the mathematical examples.
|
Resolution: Closed 2008-05-21. The proposed update was
approved with changes based on feedback and will be published. |
|
118 |
Proposed
Draft UAX #44 Unicode Character Database |
2008.01.28 |
The Unicode Consortium announces a new proposed draft UAX #44,
Unicode Character Database. This annex consolidates information
documenting the Unicode Character Database. |
Resolution: Closed 2008-02-11. The draft was
approved with changes based on feedback and will be published as part of
Unicode 5.1.0. |
|
117 |
Proposed
Update to UAX #38 The Unicode Han Database (Unihan) |
2008.01.28 |
This document has been changed to be a draft Unicode Standard
Annex. Formerly it was a proposed draft Unicode Technical Report. In this update,
corrections have been made to some regular expressions. Miscellaneous layout
problems and typographical errors have been corrected.
|
Resolution: Closed 2008-02-11. The draft was
approved and will be published as part of Unicode 5.1.0. |
|
116 |
Proposed
Update to UTS #35 Locale Data Markup Language
|
2007.11.21 |
The Unicode CLDR committee is planning to release a minor version, 1.5.1, by the
end of November. There are a few changes in the specification associated with
this change, notably:
• Added C10. Likely Subtags for locale IDs or language tags
• Added extensive clarifications in Appendix J: Time Zone Display Names |
Resolution: Closed 2008-02-11. Version 1.5.1 was
released. |
|
115 |
Proposed
Update to UTR #36 Unicode Security Considerations
|
2008.01.28 |
Changes in this proposed update include:
• Added explanation of UTF-8 over consumption attack
in section 3.1 UTF-8 Exploits
• Added subsection of 2.8.2 Mapping and Prohibition
describing the Unicode 5.1 changes in identifiers
• Added section 3.4 Property and Character Stability |
Resolution: Closed 2008-02-11. The draft was
approved with changes based on feedback and will be published. |
|
114 |
Proposed
Update to UAX #34 Unicode Named Character Sequences |
2008.01.28 |
There have been no internal text changes to UAX #34; this update is the
pro-forma release candidate update for Unicode 5.1.0. |
Resolution: Closed 2008-02-11. The draft was
approved with changes based on feedback and will be published as part of
Unicode 5.1.0. |
|
113 |
Proposed
Update to UTS #10 Unicode Collation Algorithm
|
2008.01.28 |
This update clarifies the use of contractions in DUCET. It also adds
information about the use of parameterization (section 5.1) and a new conformance
clause (C6). |
Resolution: Closed 2008-02-11. The draft was
approved with changes based on feedback and will be published in the
timeframe of Unicode 5.1.0. |
|
112 |
Proposed
Update to UAX #9 Unicode Bidirectional Algorithm |
2008.01.28 |
In this update, definition BD6 has been clarified. |
Resolution: Closed 2008-02-11. The draft was
approved with changes based on feedback and will be published as part of
Unicode 5.1.0. |
|
111 |
Proposed
Update to UTS #18 Unicode Regular Expressions
|
2008.08.04 |
The proposed update of UTS #18 clarifies conformance requirements for "."
and CRLF, updates the syntax, incorporates the new extended grapheme
clusters for Unicode 5.1, and better describes dealing with normalization,
the importance of levels, and the use of wildcards in property values.
Public feedback is invited.
|
Resolution: Closed 2008-08-29. The draft was
approved with changes based on feedback and will be published. |
|
110 |
Proposed
Update to UAX #24 Script Names |
2008.01.28 |
The proposed update of UAX #24 adds a new section regarding use
of the script property in rendering systems, clarifies issues of
script inheritance in combining character sequences, and documents
the script anomalies for some East Asian squared abbreviation
compatibility symbols. Public feedback is invited.
|
Resolution: Closed 2008-02-11. The draft was
approved and will be published as part of Unicode 5.1.0. |
|
109 |
Proposed Draft
UAX #42: Unicode Character Database in XML |
2008.01.28 |
This draft UAX describes an XML
representation of the Unicode Character Database, and is available for
public review and comment. Please see the separate
background document for details of this review and how to obtain
data files. Review document updated 2007-11-14.
|
Resolution: Closed 2008-02-11. The draft was
approved with changes based on feedback and will be published as part of
Unicode 5.1.0. |
|
108 |
Ideographic
Variation Database Submission |
2007.11.25 |
The Ideographic Variation Database provides a
registry for collections of unique variation sequences containing
unified ideographs, allowing for standardized interchange according to
UTS#37, Ideographic
Variation Database. An updated submission to the Ideographic
Variation Database has been received for: "Combined registration of the
Adobe-Japan1 collection and of sequences in that collection". Details
are in the background
document. |
Resolution: Closed 2008-01-23. The submission was
accepted with minor modifications and was incorporated in version
2007-12-14 of the Ideographic Variation Database. |
|
107 |
Script Property Values for some characters
in U+3200..U+33FF |
2007.07.30 |
UTC is seeking public feedback on whether to change the value of the Script
property for various characters in the block U+3200..U+33FF. |
Resolution: Closed 2007-08-15. No script property
changes were made as a result of this public review issue. |
|
106 |
Proposed Update to UAX #11:
East Asian Width |
2007.05.08 |
This proposed update adds a note on the lack of canonical
equivalence for the assignment of the EAW=ambiguous property to characters,
and clarifies the status of such characters at several points in the text.
|
Resolution: Closed 2007-05-29. The draft will be
updated and posted with 5.1.0. |
|
105 |
Proposed Update to UAX #14: Line Breaking Properties |
2008.01.28 |
This proposed update for UAX#14 updates the description of linebreak
classes with the line break properties in the beta version of the
Unicode Character Database, version 5.0.1. The rules were updated to
support the sequence <SHY, NBHY> for languages such as Polish and
Portuguese. The conformance clause was updated to propose additional
language on permissible higher level protocols. The entire text has been
reviewed, and improved in a number of places, to make it easier to
normatively reference this UAX from other specifications. Owners of
other specifications (higher level protocols) are particularly
encouraged to review this proposed update. Note: the line breaking rules
for Ethiopic are under separate investigation. |
Resolution: Closed 2008-02-11. The draft was
approved with changes based on feedback and will be published as part of
Unicode 5.1.0. |
|
104 |
Proposed Update to UAX #31:
Identifier and Pattern Syntax |
2008.01.28 |
The proposed update of UAX #31 has changes that
discuss the issue of canonical equivalence of identifiers. Public
feedback is invited. |
Resolution: Closed 2008-02-11. The draft was
approved with changes based on feedback and will be published as part of
Unicode 5.1.0. |
|
103 |
Proposed Update to UAX #29:
Text Boundaries |
2008.01.28 |
The proposed update to UAX #29 fixes some items that
were noted in proof for Unicode 5.0. It makes changes in the
definition of "Sp" and some break conditions in rules SB8 and
SB11. Public feedback is invited. |
Resolution: Closed 2008-02-11. The draft was
approved with changes based on feedback and will be published as part of
Unicode 5.1.0. |
|
102 |
Proposed Update to UAX #15: Unicode Normalization Forms |
2008.01.28 |
There is a Proposed Update to UAX #15, which
specifies a new Normalization Process for Stabilized Strings. The key
concept is that for a given normalization form, once a Unicode string has
been successfully normalized according to that process, it will never
change if subsequently normalized again, in any version of Unicode, past or
future. This definition depends on an anticipated further tightening of
the Unicode Stability Policies such that normalization of assigned
characters will not change in future versions of Unicode. Details are in the proposed update itself. |
Resolution: Closed 2008-02-11. The draft was
approved with changes based on feedback and will be published as part of
Unicode 5.1.0. |
|
101 |
Proposal to
Encode an External Link Sign |
2007.05.08 |
The UTC has received a proposal to encode an EXTERNAL LINK SIGN
as a character. The proposed symbol marks external links within web pages (i.e.,
links that lead to another site, as opposed to internal links, which
lead to another page within the same site or domain). The submitted
proposal itself is available for review, and
some detailed questions for reviewers are presented
in the background document. |
Resolution: Closed 2007-05-29. This issue will be
taken up by the symbols subcommittee to make recommendations in the
context of their discussion of other symbols. |
|
100 |
Giving U+00B7 MIDDLE DOT the ID_Continue Property |
2007.01.30 |
The character U+00B7 MIDDLE DOT has the XID_Continue property, but not the ID_Continue property. It is the only character of this sort. The UTC is considering removing this exception, thus making the set of
XID_Continue characters a proper subset of ID_Continue characters. This
would ensure that all valid identifiers defined using the XID_Start and
XID_Continue properties would also be valid identifiers based on the
ID_Start and ID_Continue properties.
The XID_Start and XID_Continue properties are improved lexical classes
that incorporate the changes described in Section 5.1, NFKC Modifications
of UAX #31. They are
recommended for most purposes, especially for security, over the original
ID_Start and ID_Continue properties.
In light of the use of MIDDLE DOT in encoding Catalan data, the UTC is
soliciting feedback on this proposed property change for U+00B7 MIDDLE DOT.
|
Resolution: Closed 2007-02-15. The character U+00B7
MIDDLE DOT will be added to ID_Continue in the next version of the
standard. |
|