Revision: 1.15, Fri Mar 20 17:07:15 2009 UTC (8 months ago) by rick Branch: MAIN CVS Tags: HEAD Changes since 1.14: +98 -45 lines 2/26 mandated mods for alias stability policies. |
<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en-us">
<meta name="VI60_defaultClientScript" content="JavaScript">
<meta name="keywords" content="Unicode Standard, stability">
<title>Unicode Character Encoding Stability Policy</title>
<link rel="stylesheet" type="text/css" href="http://www.unicode.org/webscripts/standard_styles.css">
<style type="text/css">
<!--
.clauseName { font-size: 120%; font-weight: bold; margin-top: 12pt; text-decoration:underline }
.clauseApplicability {font-weight: bold; margin-top: 12pt }
.clauseStatement {font-weight: bold; margin-top: 12pt }
-->
</style>
</head>
<body text="#330000">
<table width="100%" cellpadding="0" cellspacing="0" border="0">
<tr>
<td colspan="2">
<table width="100%" border="0" cellpadding="0" cellspacing="0">
<tr>
<td class="icon"><a href="http://www.unicode.org/">
<img border="0" src="http://www.unicode.org/webscripts/logo60s2.gif" align="middle" alt="[Unicode]" width="34" height="33"></a>
<a class="bar" href="http://www.unicode.org/standard/standard.html">
<font size="3">The Standard</font></a></td>
<td class="bar"><a href="http://www.unicode.org" class="bar">Home</a>
| <a href="http://www.unicode.org/sitemap/" class="bar">Site Map</a>
| <a href="http://www.unicode.org/search/" class="bar">Search</a></td>
</tr>
</table>
</td>
</tr>
<tr>
<td colspan="2" class="gray"> </td>
</tr>
<tr>
<td valign="top" width="25%" class="navCol">
<table class="navColTable" border="0" width="100%" cellspacing="4" cellpadding="0">
<tr>
<td class="navColTitle" colspan="2">Contents</td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="#Encoding">Encoding Stability</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2"><a href="#Name">
Name Stability</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="#Formal_Name_Alias">Formal Name Alias Stability</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="#Named_Character_Sequence">Named Character Seq. Stability</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="#Name_Uniqueness">Name Uniqueness</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="#Normalization">Normalization Stability</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="#Identity">Identity Stability</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="#Property_Stability">Property Stability</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="#Property_Value">Property Value Stability</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="#Alias_Stability">Alias Stability</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="#Property_Alias_Uniqueness">Property Alias Uniqueness</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="#Identifier">Identifier Stability</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="#Case_Folding">
Case Folding Stability</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="#Case_Pair">Case Pair Stability</a></td>
</tr>
<tr>
<td class="navColTitle" colspan="2">Unicode Policies</td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="http://www.unicode.org/policies/policies.html#Stability">
Stability Policy</a></td>
</tr>
<tr>
<td valign="top" class="navColCell"> </td>
<td valign="top" class="navColCell">
<a href="http://www.unicode.org/policies/reg_stability_policy.html">
Registered Code Stability</a></td>
</tr>
<tr>
<td valign="top" class="navColCell"> </td>
<td valign="top" class="navColCell">
<a href="http://www.unicode.org/policies/locales_stability.html">
Locales Stability</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="http://www.unicode.org/policies/mail_policy.html">Mail
List Policy</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="http://www.unicode.org/policies/logo_policy.html">Trademarks
and Logo Policy</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="http://www.unicode.org/policies/patent_policy.html">Patent
Policy</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="http://www.unicode.org/policies/privacy_policy.html">Privacy
Policy</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="http://www.unicode.org/policies/confidential_data_policy.html">
Confidential Data Policy</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="http://www.unicode.org/policies/font_policy.html">Font
Submissions Policy</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="http://www.unicode.org/copyright.html">Unicode Copyright
&amp; Terms of Use</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2"></td>
</tr>
<tr>
<td class="navColTitle" colspan="2">Related Links</td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="http://www.unicode.org/consortium/consort.html">Unicode
Consortium</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="http://www.unicode.org/consortium/policies/policies.html">
Unicode Policies</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="http://www.unicode.org/ucd/">Unicode Character Database</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="http://www.unicode.org/charts/">Code Charts</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="/versions/">Versions of the Unicode Standard</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2">
<a href="http://www.unicode.org/standard/where/">Where is my Character?</a></td>
</tr>
<tr>
<td valign="top" class="navColCell" colspan="2"><a href="/faq/">
Frequently Asked Questions</a></td>
</tr>
</table>
</td>
<!-- BEGIN CONTENTS -->
<td>
<table>
<tr>
<td class="contents" valign="top">
<div class="body">
<h1>Unicode Character Encoding Stability Policy</h1>
<p>Unlike many other standards, the Unicode Standard is continually
expanding—new characters are added to meet a variety of uses,
ranging from technical symbols to letters for archaic languages.
Character properties are also expanded or revised to meet implementation
requirements.</p>
<p>In each new version of the Unicode Standard, the Unicode
Consortium may <a href="../alloc/Pipeline.html">add characters</a>
or make certain changes to characters that were encoded in a
previous version of the standard. However, the Consortium
imposes limitations on the types of changes that can be made,
in an effort to minimize the impact on existing implementations.
</p>
<p>This <span>page</span> lists the policies of the Unicode
Consortium regarding character encoding stability.<span> These
policies are intended to ensure that text encoded in one version
of the standard remains valid and unchanged in later versions.
In many cases, the constraints imposed by these stability policies
allow implementers to simplify support for particular features
of the standard, with the assurance that their implementations
will not be invalidated by a later update to the standard.</span></p>
<p><span>The notation </span><i><span>Unicode N.n+</span></i><span>
means “The Unicode Standard, Version N.n and all subsequent
versions.” (For associated information, see the </span><i>
<span>Related Links</span></i><span> on the left.)</span></p>
<p><i>This page was last updated 26-February-2009</i></p>
<p class="clauseName"><b><a name="Encoding"></a>Encoding Stability</b></p>
<p class="clauseApplicability"><i><b><span>Applicable Version:
Unicode 2.0+</span></b></i></p>
<p class="clauseStatement"><b>Once a character is encoded, it
will not be moved or removed.</b></p>
<p>This policy ensures that implementers can always depend on
each version of the Unicode Standard being a superset of the
previous version. The Unicode Standard may deprecate the character
(that is, formally discourage its use), but it will not reallocate,
remove, or reassign the character.</p>
<blockquote>
<p><i><b>Note: </b>Ordering of characters is handled via
<a href="http://www.unicode.org/reports/tr10/">collation</a>,
<b>not</b> by moving characters to different code points. For
more information, see
<a href="http://www.unicode.org/reports/tr10/">Unicode Technical
Standard #10, Unicode Collation
Algorithm</a></i>,<i> and the Unicode
<a href="http://www.unicode.org/faq/">FAQ</a>.</i></p>
</blockquote>
<p class="clauseName"><b><a name="Name"></a>Name Stability</b></p>
<p class="clauseApplicability"><i><b><span>Applicable Version:
Unicode 2.0+</span></b></i></p>
<p class="clauseStatement">The Unicode Name property value for any non-reserved code point will not be changed. In particular, once a character
is encoded, its name will not be changed.</p>
<p>Together with the limitations in name syntax, this policy
allows implementations to create unique identifiers from character
names. The character names are used to distinguish <i>between</i>
characters and do not always express the full meaning of each
character. They are designed to be used programmatically and,
therefore, must be stable.</p>
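<p>As an illustration of using names as stable identifiers, the following minimal Python sketch derives an identifier from a character name and maps it back. It relies on Python's standard <code>unicodedata</code> module. The helper names <code>name_to_identifier</code> and <code>identifier_to_char</code>, and the particular separator scheme, are assumptions made for this example and are not defined by the Unicode Standard.</p>

```python
import unicodedata

def name_to_identifier(ch):
    """Derive an identifier from a character name (hypothetical scheme)."""
    name = unicodedata.name(ch)  # raises ValueError if the character has no name
    # Names use only A-Z, 0-9, space and hyphen, so a lowercase letter can
    # stand in for the hyphen without colliding with anything in the name.
    return name.replace(" ", "_").replace("-", "x")

def identifier_to_char(identifier):
    """Invert name_to_identifier by rebuilding the name and looking it up."""
    name = identifier.replace("_", " ").replace("x", "-")
    return unicodedata.lookup(name)
```

<p>Because a name never changes once assigned, an identifier produced this way under one version of the character data should resolve to the same character under any later version.</p>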
<p>In some cases the original name chosen to represent the character
is inaccurate in one way or another. Any such inaccuracies are
dealt with by adding annotations to the
<a href="http://www.unicode.org/charts/">character name list</a>
(which is also printed in the Unicode Standard and provided
in a
<a href="http://www.unicode.org/Public/UNIDATA/NamesList.html">
machine-readable format</a>), or by adding descriptive text
to the standard. In cases of outright errors in character names such as
misspellings, a character may be given a formal name alias. </p>
<blockquote><p><i><b>Note: </b>It is possible to produce translated names
for the characters, to make the information conveyed by the
name accessible to non-English speakers.</i></p></blockquote>
<p class="clauseName"><b><a name="Formal_Name_Alias"></a>Formal Name Alias
Stability</b></p>
<p class="clauseApplicability"><i><b><span>Applicable Version:
Unicode 5.0+</span></b></i></p>
<p class="clauseStatement">Formal aliases, once assigned to
a character, will not be changed or removed. </p>
<p>Formal aliases are defined in the file
<a href="http://www.unicode.org/Public/UNIDATA/NameAliases.txt">NameAliases.txt</a> in
the <a href="http://www.unicode.org/ucd/">Unicode Character
Database</a> and listed in the character
<a href="http://www.unicode.org/charts/">code charts</a>.</p>
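<p>The relationship between an immutable name and its formal alias can be observed with a short Python sketch. It assumes an interpreter whose <code>unicodedata.lookup</code> resolves formal aliases (Python 3.3 or later); the variable names are illustrative only.</p>

```python
import unicodedata

ch = "\ufe18"

# The original name contains a misspelling and, under Name Stability,
# can never be corrected.
primary_name = unicodedata.name(ch)

# The formal alias carries the corrected spelling and resolves to the
# same character.
alias = "PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET"
resolved = unicodedata.lookup(alias)
```

<p>Both the misspelled name and the corrected alias identify U+FE18, and under the policies above neither will be changed or removed.</p>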
<p class="clauseName"><b><a name="Named_Character_Sequence">
</a>Named Character
Sequence Stability</b></p>
<p class="clauseApplicability"><i><b><span>Applicable Version:
Unicode 5.0+</span></b></i></p>
<p class="clauseStatement">Named character sequences will not
be changed or removed. </p>
<p>This stability guarantee applies both to the name of the
named character sequence and to the sequence of characters so
named.</p>
<p>Named character sequences are defined in the file
<a href="http://www.unicode.org/Public/UNIDATA/NamedSequences.txt">NamedSequences.txt</a>
in the <a href="http://www.unicode.org/ucd/">Unicode Character
Database</a>. For more information on named character sequences,
see <a href="http://www.unicode.org/reports/tr34/">Unicode
Standard Annex #34,
Unicode Named Character Sequences</a>. </p>
<blockquote><p><i><b>Note: </b>There are also <b>provisional</b> named character
sequences, which are included in the Unicode Character Database
but are not covered by this stability policy. </i></p></blockquote>
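<p>A named character sequence can be resolved in the same way as a character name. The sketch below assumes a Python interpreter (3.3 or later) whose <code>unicodedata.lookup</code> accepts names from NamedSequences.txt; the sequence chosen is only an example.</p>

```python
import unicodedata

# Resolve a named sequence to the string of characters it names.
sequence = unicodedata.lookup("LATIN SMALL LETTER R WITH TILDE")

# Show the individual code points that make up the sequence.
code_points = [f"U+{ord(c):04X}" for c in sequence]
```

<p>The stability guarantee covers both halves of this association: the name and the sequence of code points it denotes.</p>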
<p class="clauseName"><b><a name="Name_Uniqueness"></a>Name Uniqueness</b></p>
<p class="clauseApplicability"><i><b><span>Applicable Version:
Unicode 2.0+</span></b></i></p>
<p class="clauseStatement"><b>The names of characters, formal
aliases, and named character sequences are unique within a shared
namespace.</b></p>
<p>The names of characters, named character sequences, and formal
aliases for characters share a single namespace in which each
name uniquely identifies either a single character or a single
named character sequence. The definition of uniqueness is not
just a simple comparison of the characters—instead, the loose
matching rules from
<a href="http://www.unicode.org/Public/UNIDATA/UCD.html">UCD.html</a> in the
<a href="http://www.unicode.org/ucd/">Unicode Character Database</a>
are used.</p>
<blockquote><p><i><b>Note:</b> As of Unicode 4.1, named character sequences were added to this shared namespace; as of Unicode 5.0, formal aliases were also added.</i></p></blockquote>
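<p>The effect of loose matching on name uniqueness can be sketched as a comparison key. The Python function below is a simplified illustration, not the normative rule: it assumes that loose matching ignores case, spaces, underscores, and hyphens that sit directly between two letters or digits, and that the hyphen in HANGUL JUNGSEONG O-E is a special case that must be kept. The function name <code>loose_name_key</code> is hypothetical.</p>

```python
def loose_name_key(name):
    """Return a key under which loosely equal names compare equal (sketch)."""
    upper = name.upper()
    # Assumed special case: keep this hyphen so that the name stays
    # distinct from HANGUL JUNGSEONG OE.
    if upper.replace("_", " ").split() == ["HANGUL", "JUNGSEONG", "O-E"]:
        return "HANGULJUNGSEONGO-E"
    kept = []
    for i, ch in enumerate(upper):
        if ch in " _":
            continue  # spaces and underscores never distinguish names
        if ch == "-":
            prev_ok = i != 0 and upper[i - 1].isalnum()
            next_ok = i + 1 != len(upper) and upper[i + 1].isalnum()
            if prev_ok and next_ok:
                continue  # medial hyphen: ignored
        kept.append(ch)
    return "".join(kept)
```

<p>Under this key, "zero-width space" and "ZERO WIDTH SPACE" are the same name, while "TIBETAN LETTER -A" and "TIBETAN LETTER A" remain distinct because the hyphen follows a space rather than a letter.</p>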
<p class="clauseName"><b><a name="Normalization"></a>Normalization
Stability</b></p>
<p class="clauseApplicability"><u><i>Strong</i></u><i><span><u>
Normalization Stability</u><br>
Applicable Version: Unicode 4.1+</span></i></p>
<b>If a string contains only characters from a given version
of Unicode, and it is put into a normalized form in accordance
with that version of Unicode, then the results will be identical
to the results of putting that string into a normalized form
in accordance with any subsequent version of Unicode.</b><p>
More formally, given versions V and U of Unicode, and any string
S which only contains characters assigned according to both
V and U, the following are always true:</p>
<p align="center">toNFC<sub>V</sub>(S) = toNFC<sub>U</sub>(S)<br>
toNFD<sub>V</sub>(S) = toNFD<sub>U</sub>(S)<br>
toNFKC<sub>V</sub>(S) = toNFKC<sub>U</sub>(S)<br>
toNFKD<sub>V</sub>(S) = toNFKD<sub>U</sub>(S)</p>
<p><span>In particular, once a character is encoded, its canonical
combining class and decomposition mapping will not be changed
in any way.</span></p>
<p><b><i>Decomposition Mapping</i></b> </p>
<p class="clauseStatement">Once a character is assigned, its
decomposition mapping will not change.</p>
<p><b><i>Canonical Combining Class</i></b></p>
<p class="clauseStatement">Once a character is assigned, its
canonical combining class will not change.</p>
<blockquote>
<p><i><b>Note: </b>If an implementation normalizes a string
that contains characters that are <b>not</b> assigned in
the version of Unicode that it supports, that string <b>
might not</b> be in normalized form according to a future
version of Unicode. For example, suppose that a Unicode
5.0 program normalizes a string that contains new Unicode
5.1 characters. That string might not be normalized according
to Unicode 5.1.</i></p>
</blockquote>
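<p>The four equalities above can be spot-checked between two versions of the character data. The Python sketch below compares the Unicode 3.2.0 data that CPython ships as <code>unicodedata.ucd_3_2_0</code> with the data of the running interpreter. Note the assumption: Unicode 3.2 predates the applicable version of this policy, so the comparison only illustrates the shape of the guarantee. The function names are hypothetical.</p>

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")
OLD = unicodedata.ucd_3_2_0  # Unicode 3.2.0 data shipped with CPython
NEW = unicodedata            # whatever version this interpreter ships

def assigned_in(ucd, text):
    """True if every character of text is assigned in the given data."""
    return all(ucd.category(ch) != "Cn" for ch in text)

def same_normalization(text):
    """Compare each normalization form of text under both versions."""
    if not (assigned_in(OLD, text) and assigned_in(NEW, text)):
        raise ValueError("text contains characters unassigned in one version")
    return {form: OLD.normalize(form, text) == NEW.normalize(form, text)
            for form in FORMS}
```

<p>The check for unassigned characters mirrors the precondition in the formal statement: the guarantee says nothing about strings containing characters that one of the two versions does not assign.</p>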
<hr width="50%">
<p class="clauseApplicability"><i><span><u>Weaker Version of
Normalization Stability</u><br>
Applicable Version: Unicode 3.1+</span></i></p>
<p class="clauseApplicability">
<span style="font-weight: 400; font-style: italic">Note that
all of the guarantees implied by this weaker specification are
subsumed by the stricter stability constraints applicable to
Version 4.1 and later.</span></p>
<p class="clauseStatement"><span>If a string contains only characters
from a given version of Unicode, and it is put into a normalized
form in accordance with that version of Unicode, then the result
will also be in that normalized form according to any subsequent
version of Unicode. </span></p>
<p class="clauseStatement"><span>The result will also be in
that normalized form according to any prior version of the standard
that contains all of the characters in the string (back to the
first applicable version, Unicode 3.1).</span></p>
<p><span>In particular, once a character is encoded, its canonical
combining class and decomposition mapping will not be changed
in a way that will destabilize normalization. Thus the following
constraints will be maintained under all circumstances:</span></p>
<p><b><i>Decomposition Mapping</i></b> </p>
<p class="clauseStatement">The decomposition mapping may not
be changed except for the correction of exceptional errors which
meet all of the following conditions (1-3):</p>
<ol>
<li>
<p class="clauseStatement">There is a clear and evident
error identified in the Unicode Character Database (such
as a typographic mistake).</p>
</li>
<li>
<p class="clauseStatement">The error constitutes a clear
violation of the identity stability policy.</p>
</li>
<li>
<p class="clauseStatement">The correction of such an error
does not violate the following constraints (a-d):</p>
<ol type="a">
<li>
<p class="clauseStatement">No character will be given
a decomposition mapping when it did not previously have
one.</p>
</li>
<li>
<p class="clauseStatement">No decomposition mapping
will be removed from a character.</p>
</li>
<li>
<p class="clauseStatement">No decomposition mapping
will change in type (canonical to compatibility, or
vice versa).</p>
</li>
<li>
<p class="clauseStatement">The number of characters
in a decomposition mapping will not change.</p>
</li>
</ol>
</li>
</ol>
<p><b><i>Canonical Combining Class</i></b> </p>
<p class="clauseStatement">Once a character is assigned, its
canonical combining class will not change.</p>
<blockquote>
<p><i><b>Note: </b>If an implementation normalizes a string
that contains characters that are <b>not</b> assigned in
the version of Unicode that it supports, that string <b>
might not</b> be in normalized form according to a future
version of Unicode. For example, suppose that a Unicode
<span>4</span>.0 program normalizes a string that contains
new Unicode <span>4</span>.1 characters. That string might
not be normalized according to Unicode <span>4</span>.1.</i></p>
<p><i><span><b>Note: </b>In versions prior to Unicode 4.1,
there were exceptional cases where the normalization algorithm
had to be applied twice to put a string into normalized
form. See </span></i><span>
<a href="http://www.unicode.org/versions/corrigendum5.html">
<i>Corrigendum #5: Normalization Idempotency</i></a> <i>
and</i> <a href="http://www.unicode.org/reports/tr15/">
<i>Unicode Standard Annex #15, Unicode Normalization Forms</i></a>.</span></p>
</blockquote>
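<p>The weaker guarantee speaks about a result remaining <i>in normalized form</i>, rather than about two results being identical. A minimal Python sketch of that check follows. It assumes Python 3.8 or later for <code>unicodedata.is_normalized</code> and uses the Unicode 3.2.0 data exposed as <code>unicodedata.ucd_3_2_0</code> as the earlier version; the function name <code>stays_normalized</code> is hypothetical.</p>

```python
import unicodedata

OLD = unicodedata.ucd_3_2_0  # Unicode 3.2.0 data shipped with CPython

def stays_normalized(text, form="NFC"):
    """Normalize under Unicode 3.2, then test the result under the current data."""
    if any(OLD.category(ch) == "Cn" for ch in text):
        raise ValueError("text contains characters unassigned in Unicode 3.2")
    normalized_then = OLD.normalize(form, text)
    return unicodedata.is_normalized(form, normalized_then)
```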
<p class="clauseName"><b><a name="Identity"></a>Identity
Stability</b></p>
<p class="clauseApplicability"><i><span>Applicable Version:
Unicode 1.1+</span></i></p>
<p class="clauseStatement"><b>Once a character is encoded, its
properties may still be changed, but <i>not</i> in such a way
as to change the fundamental identity of the character.</b></p>
<p>The Consortium will endeavor to keep the values of the other
properties as stable as possible, but some circumstances may
arise that require changing them. Particularly in the situation
where the Unicode Standard first encodes less well-documented
characters and scripts, the exact character properties and behavior
initially may not be well known.</p>
<p>As more experience is gathered in implementing the characters,
adjustments in the properties may become necessary. Examples
of such properties include, but are not limited to, the following:
</p>
<ul type="square">
<li>General_Category</li>
<li>Case mappings</li>
<li>Bidirectional properties</li>
<li>Compatibility decomposition tags (such as <code>&lt;font&gt;</code>
or <code>&lt;compat&gt;</code>)</li>
<li>Representative glyphs</li>
</ul>
<p>However, character properties will <i>not</i> be changed
in a way that would affect character identity. For example,
the representative glyph for U+0041 “A” cannot be changed to
“B”; the General_Category for U+0041 “A” cannot be changed to
Ll <i>(lowercase letter);</i> and the decomposition mapping
for U+00C1 (Á) cannot be changed to &lt;U+0042, U+0301&gt; (B, ´).</p>
<p class="clauseName"><b><a name="Property_Stability">Property Stability</a></b></p>
<p class="clauseApplicability"><i><span>Applicable Version:
Unicode 5.2+</span></i></p>
<p class="clauseApplicability"><b>Normative and informative properties, once defined in the Unicode Character Database, will never be removed.</b></p>
<p>This stability guarantee does not apply to Contributory properties (such as "Other_Alphabetic") nor to Provisional properties. For a list of which properties are Normative or Informative, see the file
<a href="http://www.unicode.org/Public/UNIDATA/UCD.html">UCD.html</a> in the
<a href="http://www.unicode.org/ucd/">Unicode Character Database</a>.</p>
<p>In prior versions of the Unicode Standard, the <i>only</i> non-provisional property ever withdrawn from the standard was the informative property Special_Case_Condition, which was removed as of Unicode 5.1.</p>
<p>This policy does not preclude the deprecation of a Unicode character property. Such deprecation would not remove the property; it would only indicate a strong recommendation not to use it.</p><p class="clauseName"><b><a name="Property_Value"></a>Property
Value Stability</b></p>
<p class="clauseStatement"><span style="font-weight: 400">Values of certain properties are limited by the constraints listed in the table
below. The applicable version is given in the first column. A version of this
<a href="http://www.unicode.org/policies/property_value_stability_table.html">table ordered by property</a> is also available.</span></p>
<table cellspacing="0" cellpadding="4" style="border-collapse: collapse; " border="1">
<tr>
<td valign="top" style="border: 1.0pt solid windowtext; padding-left: 5.4pt; padding-right: 5.4pt; padding-top: 0in; padding-bottom: 0in" align="left">
<p><b>Applicable Unicode Versions</b></p>
</td>
<td valign="top" style="border-left: medium none; border-right: 1.0pt solid windowtext; border-top: 1.0pt solid windowtext; border-bottom: 1.0pt solid windowtext; padding-left: 5.4pt; padding-right: 5.4pt; padding-top: 0in; padding-bottom: 0in" align="left">
<p><b>Constraints</b></p>
</td>
</tr>
<tr>
<td valign="top" nowrap align="left" rowspan="3">
<p>1.1.5+</p>
<p> </p>
<p> </p>
</td>
<td align="left" valign="top">
<p>The <b>General_Category</b> property value <b>Control (Cc)</b>
is immutable: the set of code points with that value
will never change.</p>
</td>
</tr>
<tr>
<td align="left" valign="top">
<p>The <b>Canonical_Combining_Class</b> property values
are limited to the values 0 to 255.</p>
</td>
</tr>
<tr>
<td align="left" valign="top">
<p>All characters other than those with <b>General_Category</b>
property values <b>Spacing_Mark (Mc) </b>and <b>Nonspacing_Mark
(Mn)</b> have the <b>Canonical_Combining_Class</b> property
value <b>0</b>.</p>
</td>
</tr>
<tr>
<td valign="top" nowrap align="left" rowspan="6">
<p>2.0.0+</p>
<p> </p>
<p> </p>
<p> </p>
</td>
<td align="left" valign="top">
<p>The <b>General_Category</b> property value <b>Private_Use
(Co)</b> is immutable: the set of code points with that
value will never change.</p>
</td>
</tr>
<tr>
<td align="left" valign="top">
<p>The <b>General_Category</b> property value <b>Surrogate
(Cs) </b>is immutable: the set of code points with that
value will never change.</p>
</td>
</tr>
<tr>
<td align="left" valign="top">Once a character is assigned, neither its <b>Name</b> nor its
<b>Jamo_Short_Name</b> will ever
change.</td>
</tr>
<tr>
<td align="left" valign="top">
<p>Canonical and compatibility mappings (<b>Decomposition_Mapping</b>
property values) are always in canonical order, and
the resulting recursive decomposition will also be in
canonical order. </p>
</td>
</tr>
<tr>
<td align="left" valign="top">
<p>Canonical mappings (<b>Decomposition_Mapping</b>
property values) are always limited either to a single
value or to a pair. The second character in the pair
cannot itself have a canonical mapping.</p>
</td>
</tr>
<tr>
<td align="left" valign="top">Canonical mappings (<b>Decomposition_Mapping</b>
property values) are always limited so that no string
when normalized to NFC expands to more than 3× in length
(measured in code units).</td>
</tr>
<tr>
<td valign="top" nowrap align="left">
<p>2.1.3+</p>
</td>
<td align="left" valign="top">
<p>The <b>General_Category</b> property values will
not be further subdivided. </p>
</td>
</tr>
<tr>
<td valign="top" nowrap align="left" rowspan="2">
<p>3.0.0+</p>
</td>
<td align="left" valign="top">
<p>The <b>Bidi_Class </b>property values will not be
further subdivided. </p>
</td>
</tr>
<tr>
<td align="left" valign="top">Once a character is assigned, its <b>Canonical_Combining_Class</b>
will never change.</td>
</tr>
<tr>
<td valign="top" nowrap align="left" rowspan="8">3.0.1+</td>
<td align="left" valign="top">The <b>Case_Folding</b> property value is limited
so that no string when case folded expands to more than
3× in length (measured in code units).</td>
</tr>
<tr>
<td align="left" valign="top">The <b>Noncharacter_Code_Point</b> property is an immutable code point property, which
means that its property values for all Unicode code points will never change.</td>
</tr>
<tr>
<td align="left" valign="top">Once a character is
<b>ID_Continue</b>, it must continue to be so in all
future versions.</td>
</tr>
<tr>
<td align="left" valign="top">If a character is <b>ID_Start</b> then it must also be
<b>ID_Continue</b>.</td>
</tr>
<tr>
<td align="left" valign="top">Once a character is <b>ID_Start</b>, it must continue to be so in all future versions.</td>
</tr>
<tr>
<td align="left" valign="top">Once a character is <b>XID_Continue</b>, it must continue to be so in all future
versions.</td>
</tr>
<tr>
<td align="left" valign="top">If a character is <b>XID_Start</b> then it must also be
<b>XID_Continue</b>.</td>
</tr>
<tr>
<td align="left" valign="top">Once a character is <b>XID_Start</b>, it must continue to be so in all future versions.</td>
</tr>
<tr>
<td valign="top" nowrap align="left">
<p>3.1.0+</p>
</td>
<td align="left" valign="top">
<p>The <b>Noncharacter_Code_Point</b> property is an
immutable code point property, which means that its
property values for all Unicode code points will never
change.</p>
</td>
</tr>
<tr>
<td valign="top" nowrap align="left" rowspan="3">
<p>4.0.0+</p>
</td>
<td align="left" valign="top">
<p>The property values for the bidirectional
properties <b>Bidi_Class</b> and <b>Bidi_Mirrored</b> preserve
canonical equivalence.</p>
</td>
</tr>
<tr>
<td align="left" valign="top">
The set of characters having <b>General_Category</b>=Nd will always be the same as the
set of characters having <b>Numeric_Type</b>=de.</td>
</tr>
<tr>
<td align="left" valign="top">Once a character is assigned, its <b>Decomposition_Mapping</b>
will never change.</td>
</tr>
<tr>
<td valign="top" nowrap align="left" rowspan="3">
<p>4.1.0+</p>
<p> </p>
</td>
<td align="left" valign="top">
<p>All characters with the <b>Lowercase</b> property
and all characters with the <b>Uppercase</b> property
have the <b>Alphabetic</b> property.</p>
</td>
</tr>
<tr>
<td align="left" valign="top">
<p>The <b>Pattern_Syntax</b> and <b>Pattern_White_Space</b>
properties are immutable code point properties, which
means that their property values for all Unicode code
points will never change.</p>
</td>
</tr>
<tr>
<td align="left" valign="top">
If a character has the
<b>Pattern_Syntax</b> or
<b>Pattern_White_Space</b> property, then it cannot have the <b>ID_Continue</b>
or <b>XID_Continue</b> property.</td>
</tr>
</table>
<p>These constraints ensure that implementers can simplify or
optimize certain aspects of their support for character properties.
For further description of these invariants, see the file
<a href="http://www.unicode.org/Public/UNIDATA/UCD.html">UCD.html</a>
in the <a href="http://www.unicode.org/ucd/">Unicode Character Database</a>.</p>
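<p>Some of the constraints in the table can be checked directly against the character data available to a program. The Python sketch below scans every code point with the standard <code>unicodedata</code> module and tests two of the invariants: the immutable set of Control (Cc) code points, and the rule that only Mn and Mc characters have a nonzero Canonical_Combining_Class. The function names and the <code>EXPECTED_CONTROLS</code> constant are introduced for this example only.</p>

```python
import sys
import unicodedata

# The Control (Cc) code points: U+0000..U+001F and U+007F..U+009F.
EXPECTED_CONTROLS = list(range(0x00, 0x20)) + list(range(0x7F, 0xA0))

def code_points_with_category(category):
    """All code points whose General_Category equals the given value."""
    return [cp for cp in range(sys.maxunicode + 1)
            if unicodedata.category(chr(cp)) == category]

def combining_class_violations():
    """Code points with a nonzero combining class that are not Mn or Mc."""
    return [cp for cp in range(sys.maxunicode + 1)
            if unicodedata.combining(chr(cp)) != 0
            and unicodedata.category(chr(cp)) not in ("Mn", "Mc")]
```

<p>An implementation that relies on these invariants, for example by hard-coding the Control range, does not need to revisit that code when it moves to newer character data.</p>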
<p class="clauseName"><b><a name="Alias_Stability">Alias Stability</a></b></p>
<p class="clauseApplicability"><i><span>Applicable Version:
Unicode 5.1+</span></i></p>
<p class="clauseApplicability">Property aliases and property value aliases, once defined in the Unicode Character Database, will never be removed.</p>
<p>Property aliases are defined in the file
<a href="http://www.unicode.org/Public/UNIDATA/PropertyAliases.txt">PropertyAliases.txt</a> in the
<a href="http://www.unicode.org/ucd/">Unicode Character Database</a>. Property value aliases are defined in the file
<a href="http://www.unicode.org/Public/UNIDATA/PropertyValueAliases.txt">PropertyValueAliases.txt</a> in the Unicode Character Database.</p>
<p>This stability guarantee does not apply to aliases for Contributory properties (such as "Other_Alphabetic") and their values, nor to aliases for Provisional properties and their values. For a list of which properties are Normative or Informative, see the file
<a href="http://www.unicode.org/Public/UNIDATA/UCD.html">UCD.html</a> in the Unicode Character Database.</p>
<p>This policy does not preclude the deprecation of a property alias or a property value alias. Such deprecation would not remove the alias; it would only indicate a strong recommendation not to use it.</p>
<p>This stability guarantee makes it possible to use property aliases and property value aliases as stable identifiers. For example, aliases may be used as stable identifiers in Unicode Regular Expressions (see
<a href="http://www.unicode.org/reports/tr18/">Unicode Technical Standard #18, Unicode Regular Expressions</a>).</p>
<p>Note that the stability guarantee for property aliases and property value aliases does not imply that the set of characters with a given Unicode character property value is stable for all Unicode versions. New characters may be added to the standard and thus to such a set, and existing characters may have one (or more) property values changed, and thus be removed from (or added to) such a set.</p>
<p>For backwards compatibility, implementations should always support all of the aliases in PropertyAliases.txt and PropertyValueAliases.txt for any of the Unicode character properties that they support, and should always follow the
<a href="http://www.unicode.org/Public/UNIDATA/UCD.html#Property_and_Property_Value_Matching">property matching rules</a> specified in the Unicode Character Database. For example, differences in case, the presence of underscores, or similar differences in strings representing aliases are not considered to make a distinction when matching aliases.</p>
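<p>A lookup that tolerates such differences might be sketched as follows in Python. The alias table is a small hand-written excerpt rather than a parse of PropertyAliases.txt, and the matching key ignores case, underscores, spaces, and hyphens; treating spaces and hyphens this way is an assumption that goes beyond the differences named above. The names <code>alias_key</code> and <code>resolve_property</code> are hypothetical.</p>

```python
# Hand-written excerpt: long property name mapped to its aliases.
PROPERTY_ALIASES = {
    "General_Category": ["gc", "General_Category"],
    "Canonical_Combining_Class": ["ccc", "Canonical_Combining_Class"],
    "Bidi_Class": ["bc", "Bidi_Class"],
}

def alias_key(alias):
    """Reduce an alias to the form used for loose comparison."""
    return "".join(ch for ch in alias.lower() if ch not in " _-")

# Index every alias by its loose key.
LOOKUP = {alias_key(alias): prop
          for prop, aliases in PROPERTY_ALIASES.items()
          for alias in aliases}

def resolve_property(alias):
    """Return the long property name for an alias, or None if unknown."""
    return LOOKUP.get(alias_key(alias))
```

<p>Because aliases are never removed, a table built this way from one version of the data should only ever grow when rebuilt from a later version.</p>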
<p class="clauseName"><b><a name="Property_Alias_Uniqueness">Property Alias Uniqueness</a></b></p>
<p class="clauseApplicability"><i><span>Applicable Version:
Unicode 3.2+</span></i></p>
<p class="clauseApplicability">All property aliases constitute a single namespace. Property aliases are guaranteed to be unique within this namespace.</p>
<p class="clauseApplicability">For each property, all of its property value aliases constitute a separate namespace, one per property. Within each of these property value alias namespaces, property value aliases are guaranteed to be unique.</p>
<p>For the purposes of these uniqueness guarantees, uniqueness is defined by the
<a href="http://www.unicode.org/Public/UNIDATA/UCD.html#Property_and_Property_Value_Matching">
property matching rules</a> specified in the
<a href="http://www.unicode.org/ucd/">Unicode Character Database</a>. For example, differences in case, the presence of underscores, or similar differences in strings representing aliases are not considered to make a distinction when matching aliases.</p>
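<p>A uniqueness check over one namespace can be sketched in Python as below. The function is illustrative only: the name <code>find_alias_collisions</code> and the input shape (a dictionary from each property or property value to its list of aliases) are assumptions, and the comparison key ignores case, underscores, spaces, and hyphens as a simplification of the matching rules.</p>

```python
def find_alias_collisions(namespace):
    """Report aliases that match loosely but belong to different owners.

    namespace maps each property (or property value) to its aliases.
    Returns a list of (alias, first_owner, conflicting_owner) tuples.
    """
    owner_by_key = {}
    collisions = []
    for owner, aliases in namespace.items():
        for alias in aliases:
            key = "".join(ch for ch in alias.lower() if ch not in " _-")
            previous = owner_by_key.setdefault(key, owner)
            if previous != owner:
                collisions.append((alias, previous, owner))
    return collisions
```

<p>The check is run once for the property alias namespace and once per property for its value aliases. Keeping the value namespaces separate is what allows the same short alias to be reused across properties: <code>L</code> is a value alias for Letter in General_Category and for Left_To_Right in Bidi_Class.</p>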
<p class="clauseName"><span><a name="Identifier"></a>Identifier
Stability</span></p>
<p class="clauseApplicability"><i><span>Applicable Version: Unicode
3.0+</span></i></p>
<p class="clauseStatement"><span>All strings that are valid
default Unicode identifiers will continue to be valid default
Unicode identifiers in all subsequent versions of Unicode. Furthermore,
default identifiers never contain characters with the Pattern_Syntax
or Pattern_White_Space properties.</span></p>
<p><span>If a string qualifies as an identifier under one version
of Unicode, it will qualify as an identifier under all future
versions. The reverse is not true: an identifier under Version
5.0 might not be an identifier under Version 4.0, because it may contain
a character that was unassigned under Unicode 4.0, or (very
rarely) a Unicode 4.0 character that was not an identifier character
in Unicode 4.0, but became one in Unicode 5.0.</span></p>
<p><span>For more information, see <a href="http://www.unicode.org/reports/tr31/">Unicode Standard Annex
#31, Unicode Identifier and Pattern Syntax</a>.</span></p>
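<p>Python's <code>str.isidentifier()</code> offers a convenient approximation for experimenting with these rules. It is based on the XID_Start and XID_Continue properties, with the underscore additionally allowed as a start character, so it is not exactly the default identifier definition. The sketch below uses it to confirm that every character accepted at the start of an identifier is also accepted as a continuation; the function names are hypothetical.</p>

```python
import sys

def is_identifier_approx(text):
    """Approximate the default identifier test with Python's own rule."""
    return text.isidentifier()

def start_not_continue():
    """Code points accepted as a start but rejected as a continuation."""
    return [cp for cp in range(sys.maxunicode + 1)
            if chr(cp).isidentifier()
            and not ("a" + chr(cp)).isidentifier()]
```

<p>Strings containing pattern syntax or white space, such as "a+b" or "a b", are rejected by this test, which is consistent with the exclusion of Pattern_Syntax and Pattern_White_Space characters from identifiers.</p>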
<p class="clauseName"><a name="Case_Folding"></a>Case Folding Stability</p>
<p class="clauseApplicability"><i>Applicable Version: Unicode 5.0+
</i>
</p>
<p class="clauseStatement">Caseless matching of Unicode strings
used for identifiers is stable. </p>
<p>Case folding stability ensures that identifiers created in
different versions of Unicode can be reliably matched in a case-insensitive
manner. For more information on identifiers see
<a href="http://www.unicode.org/reports/tr31/">Unicode
Standard Annex #31,
Unicode Identifier
and Pattern Syntax</a>. Identifiers commonly exclude compatibility
decomposable characters; therefore this policy formally applies
only to strings normalized with NFKC. The toCasefold() operation
used for caseless matching is the full case folding defined
by rule R4 under “Default Case Conversion” in
<a href="http://www.unicode.org/versions/Unicode5.0.0/ch03.pdf#G33992">Section 3.13,
<i>Default Case Algorithms</i></a> of the Unicode Standard.</p>
<p>The formal statement of this policy is: </p>
<blockquote>
<p class="clauseStatement">
<span style="font-weight: 400">For each string S containing
characters only from a given Unicode version, toCasefold(toNFKC(S))
under that version is identical to toCasefold(toNFKC(S))
under any later version of Unicode. </span></p>
</blockquote>
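<p>The expression toCasefold(toNFKC(S)) has a direct counterpart in Python, sketched below. It assumes that <code>str.casefold()</code> applies full case folding as implemented by the interpreter's own character data; the function name <code>caseless_key</code> is introduced for this example.</p>

```python
import unicodedata

def caseless_key(s):
    """Compute toCasefold(toNFKC(s)) for caseless identifier matching."""
    return unicodedata.normalize("NFKC", s).casefold()
```

<p>Two identifiers match caselessly when their keys are equal. For instance, "Straße" and "STRASSE" produce the same key. Under this policy, a key computed for a string of assigned characters is expected to stay the same under later versions of Unicode.</p>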
<p class="clauseName"><a name="Case_Pair">Case Pair
Stability</a></p>
<p class="clauseApplicability"><i>Applicable Version: Unicode 5.0+
</i>
</p>
<p class="clauseApplicability"><span style="font-weight: 400">Two distinct assigned characters form a
<i>case pair</i> when the
first character of the pair is the full uppercase of the second character, and the second character is the
full lowercase of the first character. (Full upper- and lowercase are defined in
Section 3.13 of the Unicode Standard.)</span></p>
<p><b>If two characters form a case pair in a version of Unicode,
they will remain a case pair in each subsequent version of Unicode.</b></p>
<p><b>If two characters do not form a case pair in a version
of Unicode, they will never become a case pair in any subsequent
version of Unicode.</b></p>
<p>More formally, for given versions V and U of Unicode, and
any two distinct characters X and Y that are both assigned
according to both V and U:</p>
<p align="center">toLowercase<sub>V</sub>(X) = Y AND toUppercase<sub>V</sub>(Y)
= X</p>
<p align="center">if and only if</p>
<p align="center">toLowercase<sub>U</sub>(X) = Y AND toUppercase<sub>U</sub>(Y)
= X</p>
<p>Note that these conditions apply to two existing,
distinct assigned
characters. A character that is not part of a case pair could
become part of one if the new case pair is formed
at the time of the addition of a new character to Unicode. For
example, a new capital version of U+028D ( ʍ ) LATIN SMALL LETTER
TURNED W could be added in the future to form a new case pair.</p>
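<p>The case pair condition translates almost literally into code. The Python sketch below assumes that <code>str.upper()</code> and <code>str.lower()</code> implement the full case mappings of the interpreter's character data; the function name <code>is_case_pair</code> is hypothetical.</p>

```python
def is_case_pair(upper, lower):
    """True if the two single characters form a case pair (upper, lower)."""
    return (len(upper) == 1 and len(lower) == 1
            and upper != lower
            and upper.lower() == lower
            and lower.upper() == upper)
```

<p>The requirement that the mapping hold in both directions matters. U+212A KELVIN SIGN lowercases to "k", but "k" uppercases to "K", so the Kelvin sign and "k" are not a case pair. Likewise, U+1E9E lowercases to "ß", whose full uppercase is "SS", so those two characters are not a case pair either.</p>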
<hr width="50%">
<div align="center">
<center>
<table cellspacing="0" cellpadding="0" border="0">
<tr>
<td>
<a href="http://www.unicode.org/copyright.html">
<img src="http://www.unicode.org/img/hb_notice.gif" border="0" alt="Access to Copyright and terms of use" width="216" height="50"></a></td>
</tr>
</table>
<script language="Javascript" type="text/javascript" src="http://www.unicode.org/webscripts/lastModified.js">
</script>
</center></div>
</div>
</td>
</tr>
</table>
</td>
</tr>
</table>
</body>
</html>