Resolved Public Review Issues

Resolved Public Review Issues 100-176

This page lists Public Review Issues numbered 100-176 which have been resolved, in reverse order by issue number. The link on the title points to a background document if any is available. For open issues, please see the Public Review Issues page. Older resolved issues are found on the page: Resolved Issues 1-99.

176 Properties of Two Khmer Characters 2010.10.25

The UTC is considering potential changes to the General_Category property values and default collation weighting of two Khmer characters, U+17B4 KHMER VOWEL INHERENT AQ and U+17B5 KHMER VOWEL INHERENT AA. The UTC is seeking feedback on this topic. In particular, the UTC would be interested in learning of any current implementations which might be adversely affected by any of the proposed modifications to the General_Category and/or default collation weighting of these two characters. Please see the background document for details on the proposal.

Resolution: Closed 2010-11-08. The two characters will be changed from format characters to ignorable non-spacing marks for Unicode 6.1, so that their properties match more closely the desired collation behavior. UCA 6.1 will also be updated to make the characters ignorable for collation.

175 CLDR 1.9 Collation Changes 2010.10.01

The Unicode CLDR committee is making Unicode locale-sensitive collation a major focus for the next release, CLDR 1.9. There are specific changes for a large number of languages, plus a change in the default ordering of punctuation vs symbols for all languages.

See the background document for more information. If you have any feedback on any of the actions, please file a ticket with CLDR as described there.

Resolution: Closed 2010-11-08. The items in PRI #175 were accepted by the CLDR committee, with the following changes:

Backwards-secondaries were removed from all but fr-CA.

The code point U+FFFE is also tailored to have a weight lower than all other characters, and further tailoring of U+FFFE is disallowed for other collation variants. This allows reliable interleaving of fields in a database, such as "Smith\uFFFEJohn".

Characters below 'a' are in 5 contiguous groups: space, punct, symbol, currency, digit.

For more information, see https://www.unicode.org/Public/UCA/6.0.0/CollationAuxiliary.html
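The effect of giving U+FFFE the lowest weight can be illustrated with a toy comparison. The sketch below is not a UCA implementation: `toy_key` is a hypothetical helper that uses code point order as a stand-in for real collation weights and simply assigns U+FFFE the lowest weight, as the tailoring above describes.

```python
SEP = "\uFFFE"  # field separator, tailored to sort below every other character

def toy_key(s):
    """Toy sort key: U+FFFE gets weight 0, every other character code point + 1.

    Assumption: code point order stands in for real collation weights.
    """
    return [0 if ch == SEP else ord(ch) + 1 for ch in s]

records = [
    "Smithson" + SEP + "Adam",
    "Smith" + SEP + "John",
    "Smith" + SEP + "Anna",
]

# Because the separator sorts lowest, every "Smith" record precedes "Smithson",
# and records with the same first field are then ordered by the second field.
ordered = sorted(records, key=toy_key)
```

With the separator weighted lowest, the merged string sorts the same way as comparing the fields one after another, which is what makes the interleaving reliable.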

174 Proposed Draft UTR #49: Unicode Character Categories 2011.01.31

This document presents an approach to the categorization of Unicode characters, and documents a data file that implementers can use for defining Unicode character categories. Draft update 2010.11.11.

Resolution: Closed 2011-02-28. Proposed Draft UTR #49 will be advanced to Draft UTR #49.

173 Invariant Tests 2010.08.02

An internal file of machine-readable data is used to test Unicode invariants for each release of Unicode. This PRI proposes to add that file to the Unicode Character Database (UCD), making it available for public use. The data documents what is tested prior to the release of a version of the UCD, and can also be used for testing implementations, where desired. UAX #44 would be augmented with a short section documenting the structure and usage based on the header of that file.

We would appreciate any feedback as to whether this file should be part of the UCD.

The file UnicodeInvariantTest.txt would be included in the UCD. The file UnicodeTestResults.html would not be included in the UCD, but is given here for reference. It shows an annotated version of the UnicodeInvariantTest.txt file, where tables are added showing the results of assignment statements and test failures, in this case based on beta data for Unicode 6.0.

Many of the invariants are stability constraints from the Unicode Stability Policies. Each of those is marked with "Stability" in the preceding comment. Other invariants are property constraints established by other standards, such as the Regex properties alpha, alphanum, etc. Others are "red flag" invariants, which are simply used to detect when a change in property value might be problematic. Typically those have a set of exceptions (inclusions or exclusions) that are modified for each release.

Resolution: Closed 2010-08-13. A new draft UTR will be produced on this topic.

172 Proposed Update UTS #46: Unicode IDNA Compatibility Processing 2010.09.15

The data and text for UTS #46 are being updated to synchronize with Unicode 6.0. In addition, conformance tests are being made available. The files are available in: https://www.unicode.org/Public/idna/6.0.0/

We would like to get feedback on the data tables, and on the format and contents of the conformance tests. Any suggestions for additional test cases for the conformance tests would be appreciated.

The proposed update text for UTS #46, Version 6.0 is available. The two changes are:

The addition of a new section describing the conformance tests.

Addition of two new status values in support of implementations that need to turn the STD3 rules off.

Resolution: Closed 2010-11-08. The document will be updated with final content and published for Unicode 6.0.

171 Proposal to change properties of U+06DE ARABIC START OF RUB EL HIZB 2010.08.02

The UTC is considering a proposal to change the properties of U+06DE ARABIC START OF RUB EL HIZB from a combining mark to a spacing symbol, to better match its actual usage. The background document contains a full explanation and discussion of the issue. Public feedback is being sought about this proposed change.

Resolution: Closed 2010-08-13. The character properties will be updated and published as part of Unicode 6.0.

170 Unicode 6.0.0 Beta 2010.08.02

The next version of the Unicode Standard will be Version 6.0.0. The beta information page for Unicode 6.0.0 is located at:

https://www.unicode.org/versions/beta-6.0.0.html

This version is planned for release in September 2010. A beta version of the 6.0.0 Unicode Character Database files is also available for public comment. We strongly encourage implementers to download these files and test them with their programs, well before the end of the beta period, August 2, 2010. These files are located in:

https://www.unicode.org/Public/6.0.0/

For detailed information and guidance on how to focus your review, see the section Notable Issues for Beta Testers on the beta page.

The Unicode Collation Algorithm (UCA) will be released in parallel with Unicode 6.0.0, and a beta version of the UCA is available at https://www.unicode.org/Public/UCA/6.0.0/. See also PRI #166.

The beta information page tells how to report comments and initiate discussions.

Resolution: Closed 2010-08-13. The release was approved and will be finalized and published.

169 Glyph Variation of Double Oblique Hyphen 2010.08.02

Recently, the UTC was presented with evidence that indicates that the DOUBLE OBLIQUE HYPHEN is used in both oblique and horizontal forms. Therefore the committee is considering adding an annotation to U+2E17 DOUBLE OBLIQUE HYPHEN, indicating that it may appear in either an oblique or horizontal form. Public input on the suitability of this is being sought.

Resolution: Closed 2010-08-13. An annotation will not be added.

168 Two New Provisional Properties for Characters in Indic Scripts 2010.05.03

The UTC is considering the addition of two new, enumerated provisional character properties for Indic scripts: Indic_Syllabic_Category and Matra_Placement. These are to assist in the analysis and processing of syllables for various Brahmi-derived scripts, providing classificatory information that is not easy to extract or derive for all of the Indic scripts in the standard. Feedback is welcome on the construction of the proposed properties, the details of the proposed assignment of values for characters, and on the question of the usefulness of defining such properties. (Data updated 2010-04-30)

Resolution: Closed 2010-05-21. These properties will be added to Unicode 6.0 as provisional properties, with two data files and adjustments of the data.

167 Ideographic Variation Database Submission 2010.06.25

The Ideographic Variation Database provides a registry for collections of unique variation sequences containing unified ideographs, allowing for standardized interchange according to UTS #37, Ideographic Variation Database. A submission to the Ideographic Variation Database has been received for: "Combined registration of the Hanyo-Densi collection and of sequences in that collection". Details are in the background document.

Resolution: Closed 2010-08-13. The submission will be registered and is pending final update.

166 Proposed Update UTS #10: Unicode Collation Algorithm 2010.08.02

This UTS will be updated to synchronize with Unicode 6.0, and the proposed update is now open for general public review and comment. The text has been reorganized for better text flow, and there are significant editorial corrections throughout. There has also been a major rewrite of the discussion of "illegal" and "legal" code points. See Sections 7.1.1 and 7.1.2 for details. (Draft updated 2010-07-09)

Resolution: Closed 2010-08-13. The document will be updated with final content and published for Unicode 6.0.

165 Proposed Update UAX #42: Unicode Character Database in XML 2010.08.02

This UAX will be updated for Unicode 6.0, and the proposed update is now open for general public review and comment. (Draft updated 2010-05-20)

Resolution: Closed 2010-08-13. The UAX will be updated with final content and published as part of Unicode 6.0.

164 Proposed Update UAX #41: Common References for Unicode Standard Annexes 2010.08.02

This UAX will be updated for Unicode 6.0, and the proposed update is now open for general public review and comment.

Resolution: Closed 2010-08-13. The UAX will be updated with final content and published as part of Unicode 6.0.

163 Proposed Update UAX #38: Unicode Han Database (Unihan) 2010.08.02

This UAX will be updated for Unicode 6.0, and the proposed update is now open for general public review and comment.

Resolution: Closed 2010-08-13. The UAX will be updated with final content and published as part of Unicode 6.0.

162 Proposed Update UAX #34: Unicode Named Character Sequences 2010.08.02

This UAX will be updated for Unicode 6.0, and the proposed update is now open for general public review and comment. (Draft updated 2010-05-21)

Resolution: Closed 2010-08-13. The UAX will be updated with final content and published as part of Unicode 6.0.

161 Proposed Update UAX #31: Unicode Identifier and Pattern Syntax 2010.08.02

This UAX will be updated for Unicode 6.0, and the proposed update is now open for general public review and comment. (Draft updated 2010-05-21)

Resolution: Closed 2010-08-13. The UAX will be updated with final content and published as part of Unicode 6.0.

160 Proposed Update UAX #29: Unicode Text Segmentation 2010.08.02

This UAX will be updated for Unicode 6.0, and the proposed update is now open for general public review and comment.

Resolution: Closed 2010-08-13. The UAX will be updated with final content and published as part of Unicode 6.0.

159 Proposed Update UAX #24: Unicode Script Property 2010.08.02

This UAX will be updated for Unicode 6.0, and the proposed update is now open for general public review and comment. Added discussion of multiple script values; added documentation regarding the new provisional data file ScriptExtensions.txt. (Draft updated 2010-06-02)

Resolution: Closed 2010-08-13. The UAX will be updated with final content and published as part of Unicode 6.0.

158 Proposed Update UAX #14: Unicode Line Breaking Algorithm 2010.08.02

This UAX will be updated for Unicode 6.0, and the proposed update is now open for general public review and comment. (Draft updated 2010-06-08)

Resolution: Closed 2010-08-13. The UAX will be updated with final content and published as part of Unicode 6.0.

157 Proposed Update UAX #11: East Asian Width 2010.08.02

This UAX will be updated for Unicode 6.0, and the proposed update is now open for general public review and comment.

Resolution: Closed 2010-08-13. The UAX will be updated with final content and published as part of Unicode 6.0.

156 Proposed Update UAX #9: Unicode Bidirectional Algorithm 2010.08.02

This UAX will be updated for Unicode 6.0, and the proposed update is now open for general public review and comment. This revision contains clarifications around the use of higher-level protocols in section 4.3. (Draft updated 2010-05-21)

Resolution: Closed 2010-08-13. The UAX will be updated with final content and published as part of Unicode 6.0.

155 Proposed Update UTS #39: Unicode Security Mechanisms 2010.01.26

The confusable data was revised to add data extracted from a comparison of font data from Windows and Mac

Additional mappings were also added, such as "rn" ~ "m"

The characters recommended for identifiers were updated based on UAX 31

For review of the data and suggesting changes:

The most useful view of the confusables data is the confusablesSummary file. This file groups all the confusables together. Note that the results may vary depending on the font used. Also, some "unnatural" confusables are added by transitivity (between characters, or between NFKC_Casefold equivalents).

The most useful view of the identifier restrictions is the xidmodifications file.

~~You can suggest changes with the form at security-mechanisms~~.

Draft updated 2010-02-04.

Resolution: Closed 2010-02-10. The draft will be modified according to feedback and advanced to approved UTS.
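The note above about confusables "added by transitivity" can be sketched with a small union-find grouping. The pairs below are illustrative only and are not taken from the confusables data files; `group_confusables` is a hypothetical helper, not part of UTS #39.

```python
def group_confusables(pairs):
    """Group strings into confusable sets, closing the relation under transitivity."""
    parent = {}

    def find(x):
        # Find the representative of x, halving the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())

# Illustrative pairs (assumed): "l" ~ "1" and "1" ~ "I" are listed,
# so "l" ~ "I" follows by transitivity even though it is never listed.
pairs = [("rn", "m"), ("l", "1"), ("1", "I")]
groups = group_confusables(pairs)
```

Here "l" and "I" land in the same group without being paired directly, which is the kind of derived entry a reviewer of the summary file should expect to see.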

154 Proposed Update UTR #36: Unicode Security Considerations 2010.01.26

This revision adds two new sections: 3.6 Secure Encoding Conversion and 3.7 Enabling Lossless Conversion to Unicode.
Draft updated 2010-02-04.

Resolution: Closed 2010-02-10. The draft will be modified according to feedback and advanced to approved UTR.

153 Proposal to Deprecate Five Character Properties Defined in UAX #44 2010.01.26

The Unicode Technical Committee is considering the deprecation of the property FC_NFKC_Closure. The purpose for which this property was originally created has been superseded by the NFKC_Casefold property. The UTC is also considering the deprecation of the four encoding properties Expands_On_NFC, Expands_On_NFD, Expands_On_NFKC, and Expands_On_NFKD. Those properties are easily computed, and do not cover the two most common encoding forms, UTF-8 and UTF-16. Information on all five of these properties can be found in the proposed update of UAX #44: Unicode Character Database, by following the links above. Public feedback on this issue is invited.

Resolution: Closed 2010-02-10. The 5 properties (FC_NFKC_Closure, Expands_On_NFC, Expands_On_NFD, Expands_On_NFKC, and Expands_On_NFKD) will be deprecated in Unicode 6.0.
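The remark that the four Expands_On properties are easily computed can be illustrated with Python's standard `unicodedata` module. This is a minimal sketch, assuming the property means "normalizing the single character to the given form yields more than one code point"; `expands_on` is a hypothetical helper name, and results follow whatever Unicode version the Python build ships with.

```python
import unicodedata

def expands_on(form, ch):
    """True if normalizing the single character `ch` to `form`
    ("NFC", "NFD", "NFKC" or "NFKD") yields more than one code point."""
    return len(unicodedata.normalize(form, ch)) > 1

# U+00E9 (e with acute) decomposes under NFD but stays a single code point under NFC.
# U+FB01 (fi ligature) only expands under the compatibility forms.
# U+0958 (Devanagari QA) is a composition exclusion, so it expands even under NFC.
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    print(form, [expands_on(form, ch) for ch in ("\u00E9", "\uFB01", "\u0958")])
```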

152 Proposed Update UAX #15: Unicode Normalization Forms 2010.08.02

This revision corrects the definitions of classes of characters in the Composition Exclusion Table and rewrites Section 11.3, "Guaranteeing Process Stability" for clarity and correctness. Removed obsolete empty sections, consolidated small sections, and reordered and renumbered the remaining sections for better clarity and document flow. (Draft updated 2010-06-25)

Resolution: Closed 2010-08-13. The UAX will be updated with final content and published as part of Unicode 6.0.

151 Proposed Update UAX #44: Unicode Character Database 2010.08.02

This revision indicates the changed status of several properties as Deprecated, adds tables listing Deprecated and Stabilized properties, and extends the discussion of the significance of the Bidi_Mirroring_Glyph property. The Property Summary table has been moved to the front of Section 5 and renamed the Property Index. A clarification of the loose matching rule for character names has been added. The loose matching rule for symbolic values (UAX44-LM3) has been extended to account for ignoring any initial prefix string "is". (Draft updated 2010-07-07)

Resolution: Closed 2010-08-13. The UAX will be updated with final content and published as part of Unicode 6.0.
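The extended loose matching rule for symbolic values can be sketched as follows. This is an illustration only: it assumes the rule ignores case, whitespace, underscores, hyphens, and an initial "is" prefix, and `loose_key` and `loose_match` are hypothetical helper names rather than anything defined by UAX #44.

```python
import re

def loose_key(name):
    """Fold a symbolic property name or value for loose comparison (sketch)."""
    # Drop whitespace, underscores and hyphens, then ignore case.
    folded = re.sub(r"[\s_\-]", "", name).lower()
    # Ignore an initial "is" prefix, as in "IsLineBreak".
    if folded.startswith("is"):
        folded = folded[2:]
    return folded

def loose_match(a, b):
    """True if two symbolic values are equal under the folding above."""
    return loose_key(a) == loose_key(b)
```

For example, `loose_match("Is_Line_Break", "line-break")` holds under this folding. A production implementation would also need to check that stripping "is" does not make two distinct real aliases collide; that check is left out here.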

150 Draft UTS #46: Unicode IDNA Compatibility Processing 2010.01.26

This document provides a specification for processing that provides for compatibility between older and newer versions of internationalized domain names (IDN) for lookup in client software. It allows applications such as browsers and emailers to be able to handle both the original version of internationalized domain names (IDNA2003) and the newer version (IDNA2008) compatibly, avoiding possible interoperability and security problems. (Draft updated 2010-02-04)

Resolution: Closed 2010-02-10. The draft UTS will be modified according to feedback and published as an approved UTS.

149 Proposed Update UTS #22: Unicode Character Mapping Markup Language (CharMapML) 2009.08.03

This proposed update includes editorial fixes and clarifications based on community feedback. There is a small change in the DTD from version three to this proposed version five (a new default attribute value). See the Modification History and the highlighted changes for details.

Resolution: Closed 2009-08-21. The document will be updated with changes based on feedback and published.

148 Unicode 5.2.0 Beta 2009.08.03

The next version of the Unicode Standard will be Version 5.2.0. The beta information page for Unicode 5.2.0 is located at:

https://www.unicode.org/versions/beta-5.2.0.html

This version is planned for release in October 2009. A beta version of the 5.2.0 Unicode Character Database files is also available for public comment. We strongly encourage implementers to download these files and test them with their programs, well before the end of the beta period, August 3, 2009. These files are located in:

https://www.unicode.org/Public/5.2.0/

For detailed information and guidance on how to focus your review, see the section Notable Issues for Beta Testers on the beta page.

The Unicode Collation Algorithm (UCA) will be released in parallel with Unicode 5.2.0, and a beta version of the UCA is available at https://www.unicode.org/Public/UCA/5.2.0/. See also PRI #143.

The beta information page tells how to report comments and initiate discussions.

Resolution: Closed 2009-09-24. The UTC has taken account of all feedback received during beta review and will release Unicode 5.2 early in October, 2009.

147 Proposed Deprecation of U+0673 ARABIC LETTER ALEF WITH WAVY HAMZA BELOW 2009.10.26

The UTC has recently approved a proposal to encode an ARABIC WAVY HAMZA BELOW for a future version of the Unicode Standard. That character is used productively in Kashmiri and other languages, and is applied to letters other than ALEF. The intent is to deprecate the existing character U+0673 ARABIC LETTER ALEF WITH WAVY HAMZA BELOW, in favor of the sequence of an ALEF plus the new ARABIC WAVY HAMZA BELOW. (Because of normalization stability constraints, a canonical equivalence relation cannot be established.)

The UTC is seeking feedback on whether U+0673 should be deprecated when ARABIC WAVY HAMZA BELOW is encoded. Pertinent information would include data on how widespread usage of this character is. Note that deprecation of a character does not mean removal of that character from the standard; it merely constitutes a strong recommendation not to use the character.

Resolution: Closed 2009-11-13. The character U+0673 will be deprecated in Unicode 6.0, when the new combining character ARABIC WAVY HAMZA BELOW is encoded.

146 Suggested Restructuring of Text in Chapter 3 for Clarification of Unicode Normalization 2009.08.03

In order to consolidate the formal specification of Unicode normalization into a single location, text derived from UAX #15 will be incorporated into a rewritten Section 3.11 of the book text. The details are provided in the background document. This text change does not result in any substantive change to the definition of Unicode normalization. All Unicode strings currently in a normalization form will continue to be in that normalization form. All conformant implementations of the Unicode Normalization Algorithm will continue to be conformant. Feedback for this PRI should carefully consider the closely related PRI #145 which addresses the correlated changes of text required for UAX #15.

Resolution: Closed 2009-09-01. These text changes will be published as part of Unicode 5.2.

145 Proposed Update UAX #15: Unicode Normalization Forms 2009.08.03

This UAX will be updated for Unicode 5.2, and the proposed update is now open for general public review and comment. The main change for this update of the UAX is the proposed consolidation of the text for the formal specification of normalization forms into Chapter 3 of the book text. This means that the entire specification will be located in one place, instead of being split between two locations. This text change does not result in any substantive change to the definition of Unicode normalization. All Unicode strings currently in a normalization form will continue to be in that normalization form. All conformant implementations of the Unicode Normalization Algorithm will continue to be conformant. Feedback for this PRI should carefully consider the closely related PRI #146 which addresses the correlated changes of text for Chapter 3. The draft will be periodically updated during the development cycle for the release. Draft updated 2009-06-19.

Resolution: Closed 2009-08-21. The document will be published as part of Unicode 5.2.

144 Proposed Update UAX #42: Unicode Character Database in XML 2009.08.03

This UAX will be updated for Unicode 5.2, and the proposed update is now open for general public review and comment. The draft will be periodically updated during the development cycle for the release. Draft posted 2009-03-13: Added "two-code-points" as a datatype for code points in the schema and adjusted several definitions accordingly.

Resolution: Closed 2009-08-21. The document will be published as part of Unicode 5.2.

143 Proposed Update UTS #10: Unicode Collation Algorithm 2009.09.23

This UTS will be updated in parallel with Unicode 5.2, and the proposed update is now open for general public review and comment. The draft will be periodically updated during the development cycle for the release.
Draft updated 2009-09-24:

The text of UTS #10 has been updated. See the modifications section for details: https://www.unicode.org/reports/tr10/tr10-19.html#Modifications. Among other changes, the revised text for UTS #10 makes it clear that the BASE for implicit generation of weights for Han characters does not include unassigned code points.

The default table contains weights for all newly assigned characters. See: https://www.unicode.org/Public/UCA/5.2.0/allkeys-5.2.0.txt. That directory also contains collation test information that was current as of a slightly earlier version of allkeys.txt.

Please note the following changes and issues for implementation.

There are small changes in Gujarati, Telugu, Malayalam (including weighting for chillus), Tamil, and Sinhala. While these changes move in the direction of expected behavior, good results will only come from tailoring for particular languages, such as with CLDR.

There have been significant changes to the ordering of many combining marks. Many combining marks that are not in customary use in modern languages now have the same secondary weight, and will only be distinguished on a fourth level, by code point ordering. This can be seen by comparing https://www.unicode.org/charts/collation/chart_Ignorable.html (UCA5.1) with https://www.unicode.org/charts5.2/collation/chart_Ignorable.html (UCA5.2, temporary location). Note that in 5.2, many characters have a white background, indicating that they sort exactly the same as the previous character, unless a 4th (codepoint) level is used.

Implementations of UCA should take note that the increased number of characters may cause overflows if the implementing code makes certain assumptions or optimizations. This can result either from the new character additions (which increase the number of distinct weights in the table) or because of changes in the way the weights, particularly for secondary weight values, are assigned in the table. The latter change may result in unexpected numbers of characters having the same weight.

Resolution: Closed 2009-10-19. The document and associated data file will be published as Version 5.2 of the Unicode Collation Algorithm.
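The point above about the BASE used for implicit weights can be made concrete with a short sketch of the derivation. The formula and the three BASE constants are given here as the author of this sketch recalls them from UTS #10, so treat them as assumptions and check them against the published text; `implicit_weights` is a hypothetical helper name, and choosing the right BASE for a code point (which needs the Unified_Ideograph property and block data) is left to the caller.

```python
# BASE values as recalled from UTS #10 (assumption, verify against the specification):
BASE_CJK = 0xFB40      # Unified_Ideograph in the CJK Unified Ideographs or compatibility blocks
BASE_CJK_EXT = 0xFB80  # any other Unified_Ideograph
BASE_OTHER = 0xFBC0    # any other code point, including unassigned ones

def implicit_weights(cp, base):
    """Return the pair of implicit primary weights (AAAA, BBBB) for code point `cp`."""
    aaaa = base + (cp >> 15)        # high bits of the code point are added to BASE
    bbbb = (cp & 0x7FFF) | 0x8000   # low 15 bits, with the top bit set
    return aaaa, bbbb

# U+4E00 is a Han character in the main CJK block; U+0378 is unassigned,
# so under the clarified text it takes the generic BASE rather than a Han one.
han = implicit_weights(0x4E00, BASE_CJK)
unassigned = implicit_weights(0x0378, BASE_OTHER)
```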

142 Proposed Update UAX #41: Common References for Unicode Standard Annexes 2009.08.03

This UAX will be updated for Unicode 5.2, and the proposed update is now open for general public review and comment. The draft will be periodically updated during the development cycle for the release.

Resolution: Closed 2009-08-21. The document will be updated with feedback and published.

141 Proposed Update UAX #38: Unicode Han Database (Unihan) 2009.08.03

This UAX will be updated for Unicode 5.2, and the proposed update is now open for general public review and comment. The draft will be periodically updated during the development cycle for the release. This latest draft of UAX #38 for the Proposed Update now includes expanded information about the new "Unihan.zip" archive format, and a revised table structure for the Unihan Property descriptions. Draft updated 2009-07-20.

Resolution: Closed 2009-08-21. The document will be published as part of Unicode 5.2.

140 Proposed Update UAX #34: Unicode Named Character Sequences 2009.08.03

This UAX will be updated for Unicode 5.2, and the proposed update is now open for general public review and comment. The draft will be periodically updated during the development cycle for the release.

Resolution: Closed 2009-08-21. The document will be published as part of Unicode 5.2.

139 Proposed Update UAX #31: Unicode Identifier and Pattern Syntax 2009.08.03

This UAX will be updated for Unicode 5.2, and the proposed update is now open for general public review and comment. The draft will be periodically updated during the development cycle for the release. Draft updated 2009-06-22.

Resolution: Closed 2009-09-01. The document will be published as part of Unicode 5.2.

138 Proposed Update UAX #29: Unicode Text Segmentation 2009.08.03

This UAX will be updated for Unicode 5.2, and the proposed update is now open for general public review and comment. The draft will be periodically updated during the development cycle for the release. This update changes ZWSP to have the XX (Other) property for word boundary determination, fixing a problem which was causing words not to break at ZWSP. It also revises the section relating text boundaries to regular expressions. Draft updated 2009-03-31.

Resolution: Closed 2009-08-21. The document will be published as part of Unicode 5.2.

137 Proposed Update UAX #24: Unicode Script Property 2009.08.03

This UAX will be updated for Unicode 5.2, and the proposed update is now open for general public review and comment. The draft will be periodically updated during the development cycle for the release. Draft updated 2009-03-02: Section 3 has been substantially rewritten, in particular to distinguish clearly between script designators and script property value aliases.

Resolution: Closed 2009-08-21. The document will be published as part of Unicode 5.2.

136 Proposed Update UAX #14: Unicode Line Breaking Algorithm 2009.08.03

This UAX will be updated for Unicode 5.2, and the proposed update is now open for general public review and comment. The draft will be periodically updated during the development cycle for the release. A new Line_Break class CP has been added, and the rule LB30 has been reintroduced, to address an edge case involving parentheses. There are numerous other small changes to the text, both substantive and editorial. Draft updated 2009-07-08.

Resolution: Closed 2009-08-21. The document will be published as part of Unicode 5.2.

135 Proposed Update UAX #11: East Asian Width 2009.08.03

This UAX will be updated for Unicode 5.2, and the proposed update is now open for general public review and comment. The draft will be periodically updated during the development cycle for the release. Draft updated 2009-03-13: Updated the description of the property value for unassigned codepoints.

Resolution: Closed 2009-08-21. The document will be published as part of Unicode 5.2.

134 Proposed Update UAX #9: Unicode Bidirectional Algorithm 2009.08.03

This UAX will be updated for Unicode 5.2, and the proposed update is now open for general public review and comment. The draft will be periodically updated during the development cycle for the release. There are explicit notes requesting feedback on three open issues whose resolution could have an effect on how bidi text is displayed. Feedback is welcome on these issues. For a listing of the changes with links to the affected text, see https://www.unicode.org/reports/tr9/tr9-20.html#Modifications. This revision also includes a new conformance test file, which implementers should carefully review. See BidiTest.txt in the data files directory: https://www.unicode.org/Public/5.2.0/ucd/ Draft updated 2009-07-08.

Resolution: Closed 2009-08-21. The document will be published as part of Unicode 5.2.

133 Proposed Draft UTS #46: Unicode IDNA Compatible Preprocessing 2009.08.03

This Proposed Draft UTS provides a specification for an internationalized domain name preprocessing step that is intended for use with IDNAbis, the projected update for Internationalized Domain Names. The proposed specification maintains compatibility with IDNA2003 (the current version of Internationalized Domain Names), and consistently extends that mechanism for characters introduced in any later Unicode version.

Resolution: Closed 2009-08-21. The draft was superseded by a new proposed draft.

132 Code Point Name/Label Options 2009.01.26

After considering the feedback on Public Review Issue #129 on Code Point Labels, the UTC discussed several options, which are now being presented for public review and comment. Details are in the background document.

Resolution: Closed 2009-02-13. The UTC decided to adopt option C of PRI #132. Changes will be made in the text as documented in "Changes if we do option C" of the background document, except for the fourth bullet of section 4.8. That bullet will become an informative note about API and chart conventions.

131 Han Exemplar characters 2009.01.26

The Unicode Locales (CLDR) contains exemplar characters for each locale/language. These are the characters customarily needed for the language in question. For Han characters, the Unicode CLDR has been using a fairly small set, but there is a request to include more of the commonly used characters. There are a number of possible ways to derive this set, and the CLDR technical committee would like feedback on this. Details are in the background document.

Resolution: Closed 2009-03-25. Feedback has been received for release 1.7.

130 Word Break Property for ZWSP 2009.01.26

The Unicode Technical Committee is considering changing the Word_Break property value for ZWSP from the value WB=Format to the value WB=Other (WB=XX). Details are in the background document.

Resolution: Closed 2009-02-13. The word break property for ZWSP will be changed to "Other" for Unicode 5.2.

129 Code Point Labels: Suggested Wording Details 2008.10.27

The UTC is seeking input on the proposed text to formally define Unicode Code Point Labels. Code Point Labels would include unique strings such as "<reserved-1FF0>" for code points which have no assigned Unicode character and thus no formal Unicode character name. Details are in the background document.

Resolution: Closed 2008-11-14. A new public review issue has been posted with new wording. Please see: PRI #132: Code Point Name/Label Options
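The label form quoted above, "<reserved-1FF0>", can be generated mechanically. The sketch below covers only the reserved case and uses Python's `unicodedata` module, so which code points count as unassigned depends on the Unicode version of the Python build; `reserved_label` and `is_noncharacter` are hypothetical helper names.

```python
import unicodedata

def is_noncharacter(cp):
    """Noncharacters: U+FDD0..U+FDEF and the last two code points of each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE

def reserved_label(cp):
    """Return a label such as '<reserved-1FF0>' for an unassigned code point, else None."""
    # General_Category Cn covers both unassigned code points and noncharacters,
    # so noncharacters are filtered out separately.
    if unicodedata.category(chr(cp)) == "Cn" and not is_noncharacter(cp):
        return "<reserved-%04X>" % cp
    return None
```

For the code point used in the example above, `reserved_label(0x1FF0)` yields `"<reserved-1FF0>"`, while an assigned character such as U+0041 yields `None` because it already has a formal name.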

128 Proposed Update UTS #37: Ideographic Variation Database 2009.08.03

The purpose of this second draft of the Proposed Update is to clarify the conditions under which a glyphic subset is appropriate for a given base character, following the UTC discussion. Details are in the background document. Draft updated 2009-05-21.

Resolution: Closed 2009-08-21. The report is approved and will be published.

127 Proposed Update UAX #44: Unicode Character Database 2009.08.03

This update is an extensive rewrite of UAX #44 in order to incorporate all of the former content of UCD.html into the annex and consolidate all of the documentation in one place.

The material from UCD.html has been reorganized, so that the documentation is clearer and flows better. Substantial new content documenting various aspects of character properties and the UCD has been added as well.

Please review the text carefully for correctness. The draft was updated on June 15, 2009.

Resolution: Closed 2009-09-01. The document will be published as part of Unicode 5.2.

126 Proposed Update UTR #17: Unicode Character Encoding Model 2008.10.27

This technical report is being updated to correct the titles for various references. The model has been resynched to bring it back up to date for Unicode 5.0. The text has also been edited fairly extensively for readability and consistency.

Resolution: Closed 2008-11-12. The report is approved and will be published.

125 Proposed Update UTR #33: Unicode Conformance Model 2008.10.27

This technical report is being updated to correct the titles for various references. The text has also been lightly edited.

Resolution: Closed 2008-11-12. The report is approved and will be published.

124 Proposed Update UTR #23: The Unicode Character Property Model 2008.10.27

This proposed update has a new note about constraints on new property additions. Titles of some references have been updated, along with other minor editing. Draft updated 2008-08-27.

Resolution: Closed 2008-11-12. The report is approved and will be published.

123 Bengali Currency Numerator Values 2008.10.27

The UTC has recently decided to encode some new fraction characters. For the new sets of fraction characters, the UTC has approved fractional numeric values consistent with the usage of the characters to represent fractions. However, the current numeric values associated with historically related Bengali characters U+09F4..U+09F8 are inconsistent with those numeric value assignments. UTC proposes to update the numeric values. Details are in the background document.

Resolution: Closed 2008-11-14. The values will be changed in accordance with the background document.
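
Implementers who want to check which numeric values their platform reports for U+09F4..U+09F8 can query the character database directly. The sketch below is one way to do so in Python; the function name is hypothetical, and the values returned depend on the Unicode version bundled with the Python runtime, so they reflect the data as shipped rather than the background document itself.

```python
import unicodedata
from fractions import Fraction


def bengali_numerator_values() -> dict:
    """Map each code point in U+09F4..U+09F8 to its Numeric_Value as a fraction."""
    values = {}
    for cp in range(0x09F4, 0x09F9):
        # unicodedata.numeric returns a float; convert it back to a fraction.
        as_float = unicodedata.numeric(chr(cp))
        values[cp] = Fraction(as_float).limit_denominator(64)
    return values


if __name__ == "__main__":
    for cp, value in bengali_numerator_values().items():
        print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: {value}")
```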

122 Proposal for Additional Deprecated Characters 2008.08.04

The Unicode Technical Committee is considering giving a number of additional characters the Deprecated property. See the background document for details.

Resolution: Closed 2008-08-29. Changes will be made to deprecated characters in the next version of the standard.

121 Recommended Practice for Replacement Characters 2008.08.04

The Unicode Technical Committee has been requested to specify what the recommended practice is for replacement characters in converting ill-formed subsequences. See the review document for further explanation.

Resolution: Closed 2008-08-29. The UTC decided to adopt option 2 of the PRI.
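
The options themselves are defined in the review document and are not reproduced on this page. As a hedged illustration of the question at stake, the sketch below shows how one particular converter, Python's built-in UTF-8 decoder with errors="replace", substitutes U+FFFD for ill-formed input. The helper name and sample byte strings are hypothetical, and the output is printed rather than asserted to match any particular option, since how many replacement characters an ill-formed run yields is exactly what a recommended practice has to settle.

```python
def decode_lossy(data: bytes) -> str:
    """Decode UTF-8, substituting U+FFFD for ill-formed subsequences."""
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    samples = [
        b"a\x80b",           # stray continuation byte between two letters
        b"caf\xc3",          # two-byte sequence truncated at end of input
        b"\xf1\x80\x80A",    # four-byte sequence cut short by an ASCII letter
    ]
    for raw in samples:
        text = decode_lossy(raw)
        print(raw, "->", ascii(text), "U+FFFD count:", text.count("\ufffd"))
```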

120 Draft UTR #45 U-Source Ideographs 2008.08.04

This new draft UTR #45 describes the U-source ideographs as used by the IRG in its CJK ideograph unification work. The draft is posted for public review and comment.

Resolution: Closed 2008-08-29. The draft was approved with changes based on feedback and will be published.

119 Proposed Update to UTR #25 Unicode Support for Mathematics 2008.08.01

The proposed update of UTR #25 makes minimal content changes, mostly consisting of corrections for typographical errors. There were also some minor formatting changes to enable generation of the text in pdf format, instead of html, and for better typography of the mathematical examples.

Resolution: Closed 2008-05-21. The proposed update was approved with changes based on feedback and will be published.

118 Proposed Draft UAX #44 Unicode Character Database 2008.01.28

The Unicode Consortium announces a new proposed draft UAX #44, Unicode Character Database. This annex consolidates information documenting the Unicode Character Database.

Resolution: Closed 2008-02-11. The draft was approved with changes based on feedback and will be published as part of Unicode 5.1.0.

117 Proposed Update to UAX #38 The Unicode Han Database (Unihan) 2008.01.28

This document has been changed to be a draft Unicode Standard Annex. Formerly it was a proposed draft Unicode Technical Report. In this update, corrections have been made to some regular expressions. Miscellaneous layout problems and typographical errors have also been corrected.

Resolution: Closed 2008-02-11. The draft was approved and will be published as part of Unicode 5.1.0.

116 Proposed Update to UTS #35 Locale Data Markup Language 2007.11.21

The Unicode CLDR committee is planning to release a minor version, 1.5.1, by the end of November. There are a few changes in the specification associated with this change, notably:
• Added C10. Likely Subtags for locale IDs or language tags
• Added extensive clarifications in Appendix J: Time Zone Display Names

Resolution: Closed 2008-02-11. Version 1.5.1 was released.

115 Proposed Update to UTR #36 Unicode Security Considerations 2008.01.28

Changes in this proposed update include:
• Added explanation of UTF-8 over consumption attack in section 3.1 UTF-8 Exploits
• Added subsection of 2.8.2 Mapping and Prohibition describing the Unicode 5.1 changes in identifiers
• Added section 3.4 Property and Character Stability

Resolution: Closed 2008-02-11. The draft was approved with changes based on feedback and will be published.

114 Proposed Update to UAX #34 Unicode Named Character Sequences 2008.01.28

There have been no internal text changes to UAX #34. This update is the pro-forma version release candidate update for Unicode 5.1.0.

Resolution: Closed 2008-02-11. The draft was approved with changes based on feedback and will be published as part of Unicode 5.1.0.

113 Proposed Update to UTS #10 Unicode Collation Algorithm 2008.01.28

This update clarifies the use of contractions in DUCET. Information has been added about the use of parameterization (section 5.1), and a new conformance clause (C6) has been added.

Resolution: Closed 2008-02-11. The draft was approved with changes based on feedback and will be published in the timeframe of Unicode 5.1.0.

112 Proposed Update to UAX #9 Unicode Bidirectional Algorithm 2008.01.28

In this update, definition BD6 has been clarified.

Resolution: Closed 2008-02-11. The draft was approved with changes based on feedback and will be published as part of Unicode 5.1.0.

111 Proposed Update to UTS #18 Unicode Regular Expressions 2008.08.04

The proposed update of UTS #18 clarifies conformance requirements for "." and CRLF, updates the syntax, incorporates the new extended grapheme clusters for Unicode 5.1, and better describes dealing with normalization, the importance of levels, and the use of wildcards in property values. Public feedback is invited.

Resolution: Closed 2008-08-29. The draft was approved with changes based on feedback and will be published.

110 Proposed Update to UAX #24 Script Names 2008.01.28

The proposed update of UAX #24 adds a new section regarding use of the script property in rendering systems, clarifies issues of script inheritance in combining character sequences, and documents the script anomalies for some East Asian squared abbreviation compatibility symbols. Public feedback is invited.

Resolution: Closed 2008-02-11. The draft was approved and will be published as part of Unicode 5.1.0.

109 Proposed Draft UAX #42: Unicode Character Database in XML 2008.01.28

This draft UAX describes an XML representation of the Unicode Character Database, and is available for public review and comment. Please see the separate background document for details of this review and how to obtain data files. Review document updated 2007-11-14.

Resolution: Closed 2008-02-11. The draft was approved with changes based on feedback and will be published as part of Unicode 5.1.0.

108 Ideographic Variation Database Submission 2007.11.25

The Ideographic Variation Database provides a registry for collections of unique variation sequences containing unified ideographs, allowing for standardized interchange according to UTS#37, Ideographic Variation Database. An updated submission to the Ideographic Variation Database has been received for: "Combined registration of the Adobe-Japan1 collection and of sequences in that collection". Details are in the background document.

Resolution: Closed 2008-01-23. The submission was accepted with minor modifications and was incorporated in version 2007-12-14 of the Ideographic Variation Database.

107 Script Property Values for some characters in U+3200..U+33FF 2007.07.30

UTC is seeking public feedback on whether to change the value of the Script property for various characters in the block U+3200..U+33FF.

Resolution: Closed 2007-08-15. No script property changes were made as a result of this public review issue.

106 Proposed Update to UAX #11: East Asian Width 2007.05.08

This proposed update adds a note on the lack of canonical equivalence for the assignment of the EAW=ambiguous property to characters, and clarifies the status of such characters at several points in the text.

Resolution: Closed 2007-05-29. The draft will be updated and posted with 5.1.0.
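
One way to read the note on canonical equivalence is that a precomposed character and its canonically equivalent decomposed sequence need not carry the same East_Asian_Width values. The sketch below, with a hypothetical helper name, compares the two forms of U+00E0 using the data shipped with the Python runtime. It is an illustration under that reading, not a statement of what the annex text says.

```python
import unicodedata


def east_asian_widths(text: str) -> list:
    """Return the East_Asian_Width value of each code point in text."""
    return [unicodedata.east_asian_width(ch) for ch in text]


if __name__ == "__main__":
    composed = "\u00E0"                               # LATIN SMALL LETTER A WITH GRAVE
    decomposed = unicodedata.normalize("NFD", composed)  # "a" followed by U+0300
    print("composed:  ", east_asian_widths(composed))
    print("decomposed:", east_asian_widths(decomposed))
```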

105 Proposed Update to UAX #14: Line Breaking Properties 2008.01.28

This proposed update for UAX#14 updates the description of linebreak classes with the line break properties in the beta version of the Unicode Character Database, version 5.0.1. The rules were updated to support the sequence <SHY, NBHY> for languages such as Polish and Portuguese. The conformance clause was updated to propose additional language on permissible higher level protocols. The entire text has been reviewed, and improved in a number of places, to make it easier to normatively reference this UAX from other specifications. Owners of other specifications (higher level protocols) are particularly encouraged to review this proposed update. Note: the line breaking rules for Ethiopic are under separate investigation.

Resolution: Closed 2008-02-11. The draft was approved with changes based on feedback and will be published as part of Unicode 5.1.0.

104 Proposed Update to UAX #31: Identifier and Pattern Syntax 2008.01.28

The proposed update of UAX #31 has changes that discuss the issue of canonical equivalence of identifiers. Public feedback is invited.

Resolution: Closed 2008-02-11. The draft was approved with changes based on feedback and will be published as part of Unicode 5.1.0.

103 Proposed Update to UAX #29: Text Boundaries 2008.01.28

The proposed update to UAX #29 fixes some items that were noted in proof for Unicode 5.0. It makes changes in the definition of "Sp" and some break conditions in rules SB8 and SB11. Public feedback is invited.

Resolution: Closed 2008-02-11. The draft was approved with changes based on feedback and will be published as part of Unicode 5.1.0.

102 Proposed Update to UAX #15: Unicode Normalization Forms 2008.01.28

There is a Proposed Update to UAX #15, which specifies a new Normalization Process for Stabilized Strings. The key concept is that for a given normalization form, once a Unicode string has been successfully normalized according to that process, it will never change if subsequently normalized again, in any version of Unicode, past or future. This definition depends on an anticipated further tightening of the Unicode Stability Policies such that normalization of assigned characters will not change in future versions of Unicode. Details are in the proposed update itself.

Resolution: Closed 2008-02-11. The draft was approved with changes based on feedback and will be published as part of Unicode 5.1.0.
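
The key property described above, that a successfully normalized string does not change when normalized again, can be sketched as follows. The function name is hypothetical, and the rule of rejecting any string that contains an unassigned code point is an assumption about the process, made here because unassigned code points are the ones whose normalization behavior could change once they are assigned. Treating every General_Category=Cn code point as unassigned is a simplification, since that category also covers noncharacters.

```python
import unicodedata


def stabilized_normalize(text: str, form: str = "NFC") -> str:
    """Normalize text, refusing input that contains unassigned code points.

    Hypothetical sketch: the rejection rule is an assumption, and the set of
    assigned characters is whatever this Python runtime's Unicode data says.
    """
    for ch in text:
        if unicodedata.category(ch) == "Cn":
            raise ValueError(f"unassigned code point U+{ord(ch):04X}")
    return unicodedata.normalize(form, text)


if __name__ == "__main__":
    once = stabilized_normalize("e\u0301")   # "e" followed by COMBINING ACUTE ACCENT
    twice = stabilized_normalize(once)
    print(ascii(once), once == twice)
```

Because the check runs against the Unicode version of the runtime, a string rejected today may be accepted by a later version, but a string accepted today is intended to keep the same normalized form.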

101 Proposal to Encode an External Link Sign 2007.05.08

The UTC has received a proposal to encode an EXTERNAL LINK SIGN as a character. The proposed symbol marks external links within web pages (i.e. links which lead to another site, contrary to internal links which lead to another page within the same site or domain). The submitted proposal itself is available for review, and some detailed questions for reviewers are presented in the background document.

Resolution: Closed 2007-05-29. This issue will be taken up by the symbols subcommittee to make recommendations in the context of their discussion of other symbols.

100 Giving U+00B7 MIDDLE DOT the ID_Continue Property 2007.01.30

The character U+00B7 MIDDLE DOT has the XID_Continue property, but not the ID_Continue property. It is the only character of this sort. The UTC is considering removing this exception, thus making the set of XID_Continue characters a proper subset of ID_Continue characters. This would ensure that all valid identifiers defined using the XID_Start and XID_Continue properties would also be valid identifiers based on the ID_Start and ID_Continue properties.

The XID_Start and XID_Continue properties are improved lexical classes that incorporate the changes described in Section 5.1, NFKC Modifications of UAX #31. They are recommended for most purposes, especially for security, over the original ID_Start and ID_Continue properties.

In light of the use of MIDDLE DOT in encoding Catalan data, the UTC is soliciting feedback on this proposed property change for U+00B7 MIDDLE DOT.

Resolution: Closed 2007-02-15. The character U+00B7 MIDDLE DOT will be added to ID_Continue in the next version of the standard.
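
For a concrete view of what the property means in practice, the sketch below uses Python, whose identifier syntax is defined in terms of XID_Start and XID_Continue (a fact about Python, not something stated on this page). The helper name and the sample strings are hypothetical. The Catalan word is included because the issue mentions the use of MIDDLE DOT in Catalan data.

```python
def is_identifier(text: str) -> bool:
    """True if text is a valid Python identifier (XID_Start XID_Continue*)."""
    return text.isidentifier()


if __name__ == "__main__":
    samples = [
        "col\u00B7lecci\u00F3",  # Catalan word with MIDDLE DOT in medial position
        "\u00B7abc",             # MIDDLE DOT in initial position
        "abc",
    ]
    for text in samples:
        print(ascii(text), is_identifier(text))
```

MIDDLE DOT is accepted only in continuing position, which is why the second sample is not a valid identifier.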