Revision: 1.7, Tue Jun 3 16:10:08 2008 UTC (17 months, 2 weeks ago) by pedberg Branch: MAIN CVS Tags: HEAD Changes since 1.6: +4 -4 lines FILE REMOVED remove old temp copy |
<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="en-us">
<meta name="GENERATOR" content="Microsoft FrontPage 6.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
<link rel="stylesheet" href="http://unicode.org/reports/reports.css" type="text/css">
<title>UTS #35: Locale Data Markup Language</title>
<style type="text/css">
<!--
span.dtd { font-family: monospace; font-size:90%; text-indent:-3em; margin-left:3em; background-color:#CCCCFF; border-style: dotted; border-width: 1px;}
span.blockedInherited { font-style: italic; font-weight: bold; border-style: dashed; border-width: 1px;
background-color: #FF0000 }
span.inherited { font-weight: bold; border-style: dashed; border-width: 1px; background-color:
#00FF00 }
span.element { font-weight: bold; color: red; border-style: dotted; border-width: 1px; border-color: red }
span.attribute { font-weight: bold; color: maroon; border-style: dotted; border-width: 1px; border-color: maroon}
span.attributeValue { font-weight: bold; color: blue; border-style: dotted; border-width: 1px; border-color: blue}
span.changed {border-style: dotted; border-width: 1px; background-color: #FFFF00}
li, p, table { margin-top: 0.5li; margin-bottom: 0.5li }
-->
</style>
</head>
<body bgcolor="#ffffff">
<table class="header" width="100%">
<tr>
<td class="icon"><a href="http://unicode.org">
<img align="middle" alt="[Unicode]" border="0" src="http://unicode.org/webscripts/logo60s2.gif" width="34" height="33"></a> <a class="bar" href="http://unicode.org/reports">Technical
Reports</a></td>
</tr>
<tr>
<td class="gray"> </td>
</tr>
</table>
<div class="body">
<h2 align="center"><i><font color="#FF3333">Working Draft </font></i>Unicode Technical Standard
#35</h2>
<h1 align="right">Locale Data Markup Language (LDML)</h1>
<table border="1" cellpadding="2" cellspacing="0" width="90%">
<tr>
<td>Version</td>
<td>1<span>.4<i><font color="#FF3333"> (draft $Revision: 1.7 $)</font></i></span></td>
</tr>
<tr>
<td>Authors</td>
<td><a href="http://unicode.org/reporting.html">Mark Davis</a></td>
</tr>
<tr>
<td>Date</td>
<td><i><font color="#FF3333">$Date: 2008/06/03 17:10:08 $</font></i></td>
</tr>
<tr>
<td>This Version</td>
<td><i><a href="http://unicode.org/cldr/data/docs/web/tr35.html">
http://unicode.org/cldr/data/docs/web/tr35.html</a></i></td>
</tr>
<tr>
<td>Previous Version</td>
<td><span><a href="http://unicode.org/reports/tr35/tr35-5.html">
http://unicode.org/reports/tr35/tr35-5.html</a></span></td>
</tr>
<tr>
<td>Latest Version</td>
<td><a href="http://unicode.org/reports/tr35/">http://unicode.org/reports/tr35/</a></td>
</tr>
<tr>
<td>Corrigenda</td>
<td><a href="http://unicode.org/cldr/corrigenda.html">http://unicode.org/cldr/corrigenda.html</a>
</td>
</tr>
<tr>
<td>Latest Working Draft</td>
<td><a href="http://unicode.org/draft/reports/tr35/tr35.html">
http://unicode.org/draft/reports/tr35/tr35.html</a> </td>
</tr>
<tr>
<td>Namespace:</td>
<td><a href="http://unicode.org/cldr/">http://unicode.org/cldr/</a></td>
</tr>
<tr>
<td>DTDs:</td>
<td><a href="http://unicode.org/cldr/dtd/1.3/ldml.dtd">
http://unicode.org/cldr/dtd/1.3/ldml.dtd</a><br>
<a href="http://unicode.org/cldr/dtd/1.3/ldmlSupplemental.dtd">
http://unicode.org/cldr/dtd/1.3/ldmlSupplemental.dtd</a></td>
</tr>
<tr>
<td>Revision</td>
<td><a href="#Modifications">6</a></td>
</tr>
</table>
<p><br>
</p>
<h3><i>Summary</i></h3>
<p>This document describes an XML format (<i>vocabulary</i>) for the exchange of structured locale
data.<span class="changed"> This format is used in the Common Locale Data
Repository maintained by the Unicode Consortium.</span></p>
<h3><i>Status</i></h3>
<p><i>This document has been reviewed by Unicode members and other interested parties, and has
been approved for publication by the Unicode Consortium. This is a stable document and may be used
as reference material or cited as a normative reference by other specifications.</i></p>
<blockquote>
<p><span><i><b>A Unicode Technical Standard (UTS)</b> is an independent specification.
Conformance to the Unicode Standard does not imply conformance to any UTS.</i></span></p>
</blockquote>
<p><i><span>Please submit corrigenda and other comments with the online reporting form [<a href="#Feedback">Feedback</a>].
Related information that is useful in understanding this document is found in the
<a href="#References">References</a>. For the latest version of the Unicode Standard see [<a href="#Unicode">Unicode</a>].
For a list of current Unicode Technical Reports see [<a href="#Reports">Reports</a>]. For more
information about versions of the Unicode Standard, see [<a href="#Versions">Versions</a>]. For
possible errata for this document, see [<a href="http://unicode.org/errata/">Errata</a>].</span></i></p>
<h2><a name="Contents">Contents</a></h2>
<ul class="toc">
<li>1 <a href="#Introduction">Introduction</a></li>
<li>2 <a href="#Locale">What is a Locale?</a></li>
<li>3 <a href="#Identifiers">Identifiers</a><ul class="toc">
<li><span class="changedspan">3.1 <a href="#Unknown_or_Invalid_Identifiers">
Unknown or Invalid Identifiers</a></span></li>
</ul>
</li>
<li>4 <a href="#Locale_Inheritance">Locale Inheritance</a><ul class="toc">
<li>4.1 <a href="#Multiple_Inheritance">Multiple Inheritance</a></li>
</ul>
</li>
<li>5 <a href="#XML_Format">XML Format</a>
<ul class="toc">
<li>5.1 <a href="#Common_Elements">Common Elements</a>
<ul class="toc">
<li>5.1.1 <a href="#Escaping_Characters">Escaping Characters</a></li>
</ul>
</li>
<li>5.2 <a href="#Common_Attributes">Common Attributes</a></li>
<li>5.3 <a href="#&lt;identity&gt;">&lt;identity&gt;</a></li>
<li>5.4 <a href="#&lt;localeDisplayNames&gt;">&lt;localeDisplayNames&gt;</a></li>
<li>5.5 <a href="#&lt;layout&gt;">&lt;layout&gt;</a></li>
<li>5.6 <a href="#&lt;characters&gt;">&lt;characters&gt;</a></li>
<li>5.7 <a href="#&lt;delimiters&gt;">&lt;delimiters&gt;</a></li>
<li>5.8 <a href="#&lt;measurement&gt;">&lt;measurement&gt;</a></li>
<li>5.9 <a href="#&lt;dates&gt;">&lt;dates&gt;</a>
<ul class="toc">
<li>5.9.1 <a href="#&lt;calendars&gt;">&lt;calendars&gt;</a></li>
<li>5.9.2 <a href="#&lt;timeZoneNames&gt;">&lt;timeZoneNames&gt;</a></li>
</ul>
</li>
<li>5.10 <a href="#&lt;numbers&gt;">&lt;numbers&gt;</a><ul class="toc">
<li><span class="changedspan">5.10.1 <a href="#Number_Symbols">Number Symbols</a></span></li>
<li><span class="changedspan">5.10.2 <a href="#Currencies">Currencies</a></span></li>
<li><span class="changedspan">5.10.3 <a href="#Rule-Based_Number_Formats">Rule-Based Number
Formats</a></span></li>
</ul>
</li>
<li>5.11 <a href="#&lt;posix&gt;">&lt;posix&gt;</a></li>
<li>5.12 <a href="#references_element">&lt;references&gt;</a></li>
<li>5.13 <a href="#&lt;collations&gt;">&lt;collations&gt;</a>
<ul class="toc">
<li>5.13.1 <a href="#&lt;collation&gt;">&lt;collation&gt;</a></li>
</ul>
</li>
<li><span class="changedspan">5.14 <a href="#Segmentations">Segmentations</a></span></li>
<li><span class="changedspan"><span style="background-color: #FFFF00">5.15
<a href="#Transforms">Transforms</a></span></span></li>
</ul>
</li>
<li>Appendix A: <a href="#Sample_Special_Elements">Sample Special Elements</a>
<ul class="toc">
<li><span class="removedspan">A.1 <a href="#ICU">ICU</a> </span>
<ul class="toc">
<li><span class="removedspan">A1.1 <a href="#&lt;ruleBasedNumberFormat&gt;">&lt;ruleBasedNumberFormat&gt;</a></span></li>
<li><span class="removedspan">A1.2 <a href="#&lt;boundaries&gt;">&lt;boundaries&gt;</a></span></li>
<li><span class="removedspan">A1.3 <a href="#&lt;transforms&gt;">&lt;transforms&gt;</a></span></li>
</ul>
</li>
<li>A.2 <a href="#OpenOffice">openoffice.org</a></li>
</ul>
</li>
<li>Appendix B: <a href="#Transmitting_Locale_Information">Transmitting Locale Information</a>
<ul class="toc">
<li>B.1 <a href="#Message_Formatting_and_Exceptions">Message Formatting and Exceptions</a></li>
</ul>
</li>
<li>Appendix C: <a href="#Supplemental_Data">Supplemental Data</a></li>
<li>Appendix D: <a href="#Language_and_Locale_IDs">Language and Locale IDs</a></li>
<li>Appendix E: <a href="#Unicode_Sets">Unicode Sets</a></li>
<li>Appendix F: <a href="#Date_Format_Patterns"><span>Date Format Patterns</span></a></li>
<li>Appendix G: <a href="#Number_Format_Patterns"><span>Number Format Patterns</span></a></li>
<li>Appendix H: <a href="#Choice_Patterns"><span>Choice Patterns</span></a></li>
<li>Appendix I: <a href="#Inheritance_and_Validity"><span>Inheritance and Validity</span></a></li>
<li>Appendix J: <a href="#Time_Zone_Fallback"><span>Time Zone Display Names</span></a></li>
<li>Appendix K: <span><a href="#valid_attribute_values">Valid Attribute Values</a></span></li>
<li>Appendix L: <a href="#Canonical_Form">Canonical Form</a></li>
<li><span class="changedspan">Appendix M: <a href="#Coverage_Levels">Coverage Levels</a></span></li>
<li><span class="changedspan"><span style="background-color: #FFFF00">
Appendix N:
<a href="#Transform_Rules">Transform Rules</a></span></span></li>
<li><span class="changedspan"><span style="background-color: #FFFF00">Appendix
O:
<a href="#Lenient_Parsing">Lenient Parsing</a></span></span></li>
<li><a href="#Acknowledgments">Acknowledgments</a></li>
<li><a href="#References">References</a></li>
<li><a href="#Modifications"><span>Modifications</span></a></li>
</ul>
<h2>1. <a name="Introduction">Introduction</a></h2>
<p>Not long ago, computer systems were like separate worlds, isolated from one another. The
internet and related events have changed all that. A single system can be built of many different
components, hardware and software, all needing to work together. Many different technologies have
been important in bridging the gaps; in the internationalization arena, Unicode has provided a
lingua franca for communicating textual data. But there remain differences in the locale data used
by different systems.</p>
<p>Common, recommended practice for internationalization is to store and communicate
language-neutral data, and format that data for the client. This formatting can take place on any
of a number of the components in a system; a server might format data based on the user's locale,
or it could be that a client machine does the formatting. The same goes for parsing data, and
locale-sensitive analysis of data.</p>
<p>But there remain significant differences across systems and applications in the
locale-sensitive data used for such formatting, parsing, and analysis. Many of those differences
are simply gratuitous: each variant is acceptable to human beings, but the variants still produce
different results. In many other cases there are outright errors. Whatever the cause, the differences can
cause discrepancies to creep into a heterogeneous system. This is especially serious in the case
of collation (sort-order), where different collations cause not only ordering differences, but
also different results of queries! That is, with a query of customers with names between "Abbot,
Cosmo" and "Arnold, James", if different systems have different sort orders, different lists will
be returned. (For comparisons across systems formatted as HTML tables, see [<a href="#Comparisons">Comparisons</a>].)</p>
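<p>The effect of differing sort orders on a range query can be illustrated with a minimal Python
sketch. The two key functions and the customer list are hypothetical: one system is assumed to
compare names by code point, the other by a simplified collation that ignores case and accents.
Neither is a real collation implementation.</p>

```python
import unicodedata

def binary_key(name):
    # Code point order, as a system without collation support might use.
    return name

def accent_folding_key(name):
    # Hypothetical simplified collation: strip combining marks and ignore case.
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

def range_query(names, low, high, key):
    # Return the names whose sort key lies between those of low and high, inclusive.
    hits = [n for n in names if key(high) >= key(n) >= key(low)]
    return sorted(hits, key=key)

# "\u00c5berg" is "Aberg" written with A WITH RING ABOVE.
customers = ["Abbot, Cosmo", "\u00c5berg, Lars", "Arnold, James", "Adams, Ann", "Zeller, Max"]

by_code_point = range_query(customers, "Abbot, Cosmo", "Arnold, James", binary_key)
by_folding = range_query(customers, "Abbot, Cosmo", "Arnold, James", accent_folding_key)
```

<p>Under these assumptions the first query returns three customers and the second returns four,
because the accented name falls inside the range only when accents are ignored.</p>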
<p><span class="removedspan">There are a number of steps that can be taken to
improve the situation. The first is to provide
an XML format for locale data interchange. This provides a common format for systems to
interchange data so that they can get the same results. The second is to gather up locale data
from different systems, and compare that data to find any differences. The third is to provide an
online repository for such data. The fourth is to have an open process for reconciling differences
between the locale data used on different systems and validating the data, to come up with a
useful, common, consistent base of locale data.</span></p>
<p class="note"><b>Note:</b> There are many different equally valid ways in which data can be
judged to be "correct" for a particular locale. The goal for the common locale data is to make it
as consistent as possible with existing locale data, and acceptable to users in that locale.</p>
<p>This document specifies an XML format for the communication of locale
data<span class="changed">: the Locale Data Markup Language (LDML). This
provides a common format for systems to interchange locale data so that they
can get the same results in the services provided by internationalization
libraries. It also provides a standard format that can allow users to
customize the behavior of a system. </span>With it, for example, collation
(sorting) rules can be exchanged, allowing two implementations to
exchange a specification of tailored collation rules. Using the same specification, the two implementations will
achieve the same results in comparing strings (see [<a href="#UCA">UCA</a>]).
<span class="changed">LDML can also be used to let a user encapsulate
specialized sorting behavior for a specific domain, or create a customized
locale for a minority language. LDML is also used in the Unicode Common
Locale Data Repository (CLDR). CLDR</span> uses an open process for
reconciling differences between the locale data used on different systems
and validating the data, to produce a useful, common, consistent base
of locale data.</p>
<p>For more information, see the Common Locale Data Repository <a href="http://unicode.org/cldr/">
project page</a> [<a href="#localeProject">LocaleProject</a>].</p>
<h2>2. <a name="Locale">What is a Locale?</a></h2>
<p>Before diving into the XML structure, it is helpful to describe the model behind the structure.
People do not have to subscribe to this model to use the data, but they do need to understand it
so that the data can be correctly translated into whatever model their implementation uses.</p>
<p>The first issue is basic: <i>what is a locale?</i> In this model, a locale is an
identifier (id) that
refers to a set of user preferences that tend to be shared across significant swaths of the world.
Traditionally, the data associated with this id provides support for formatting and parsing of
dates, times, numbers, and currencies; for measurement units; for sort-order (collation); plus
translated names for timezones, languages, countries, and scripts. The data can also include text
boundaries (character, word, line, and sentence), text transformations (including
transliterations), and support for other services.</p>
<p>Locale data is not cast in stone: the data used on someone's machine generally may reflect the
US format, for example, but preferences can typically be set to override particular items, such as
setting the date format for 2002.03.15, or using metric or Imperial measurement units. In the
abstract, locales are simply one of many sets of preferences that, say, a website may want to
remember for a particular user. Depending on the application, it may want to also remember the
user's timezone, preferred currency, preferred character set, smoker/non-smoker preference, meal
preference (vegetarian, kosher, etc.), music preference, religion, party affiliation, favorite
charity, etc.</p>
<p>Locale data in a system may also change over time: country boundaries change; governments (and
currencies) come and go; committees impose new standards; bugs are found and fixed in the source
data; and so on. Thus the data needs to be versioned for stability over time.</p>
<p>In general terms, the locale id is a parameter that is supplied to a particular service (date
formatting, sorting, spell-checking, etc.). The format in this document does not attempt to
represent all the data that could conceivably be used by all possible services. Instead, it
collects together data that is in common use in systems and internationalization libraries for
basic services. The main difference among locales is in terms of language; there may also be some
differences according to different countries or regions. However, the line between <i>locales</i>
and <i>languages</i>, as commonly used in the industry, is rather fuzzy. <span>Note also that the
vast majority of the locale data in CLDR is in fact language data<span class="changed">;
all non-linguistic data is separated out into a separate tree</span>. </span>For more information,
see <a href="#Language_and_Locale_IDs">Appendix D: Language and Locale IDs</a>.</p>
<p>We will speak of data as being "in locale X". That does not imply that a locale <i>is</i> a
collection of data; it is simply shorthand for "the set of data associated with the locale id X".
Each individual piece of data is called a <i>resource </i>
<span class="changed">or <i>field</i></span>, and a tag indicating the key of
the resource is called a <i>resource tag.</i></p>
<h2>3. <a name="Identifiers">Identifiers</a></h2>
<p><span class="changed">LDML uses stable identifiers for distinguishing among
locales, regions, currencies, timezones, transforms, and so on. <span>Within
each type of entity, such as locales or currencies, the identifiers
are unique. However, across types the identifiers may not be unique: thus a
currency identifier may be the same as a locale identifier (especially since
identifiers are compared caselessly).</span></span></p>
<p><span class="changed"><span>There are many systems for identifiers for
these entities. The LDML identifiers may not match the identifiers used on a particular
target system. If so, some process of identifier translation may be required
when using LDML data.</span></span></p>
<p>An <span class="changed"><span>LDML </span></span>locale identifier has the following format:</p>
<blockquote>
<p><code><i>locale_id</i> := <i>base_locale_id</i> <i>options</i>?</code></p>
<p><code><i>base_locale_id</i> := <span class="changed">
extended_RFC3066bis_identifiers</span></code></p>
<p><code><i>options</i> := "@" <i>key</i> "=" <i>type</i> ("," <i>key</i> "=" <i>type</i> )*</code></p>
</blockquote>
<p>As usual, x? means that x is optional; x* means that x occurs zero or more times.</p>
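<p>The grammar above can be read as a simple splitting procedure. The following Python sketch is
illustrative only: the function name <code>split_locale_id</code> is hypothetical, and it checks
only the overall shape of the identifier, not the validity of individual codes.</p>

```python
def split_locale_id(locale_id):
    """Split a locale id into (base_locale_id, options) following the grammar above."""
    base, at_sign, tail = locale_id.partition("@")
    if not base:
        raise ValueError("missing base_locale_id")
    options = {}
    if at_sign:
        # options := "@" key "=" type ("," key "=" type)*
        for pair in tail.split(","):
            key, equals, type_value = pair.partition("=")
            if not (key and equals and type_value):
                raise ValueError("malformed key=type pair: %r" % pair)
            options[key] = type_value
    return base, options

base, options = split_locale_id("de_DE@collation=phonebook,currency=DDM")
```

<p>Here <code>base</code> is <code>de_DE</code> and <code>options</code> maps
<code>collation</code> to <code>phonebook</code> and <code>currency</code> to <code>DDM</code>.</p>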
<p><span class="changedspan">A locale ID is an extension of a language ID, and thus
the</span><span><span class="changedspan"> structure and field values are
based on </span>the successor to RFC 3066<span class="changedspan">, known
as <i>RFC3066bis</i>, which has been approved, but not yet published.
However, the registry of data for that successor is now being maintained by
IANA. For that registry, and the editor's draft of the standard, see</span>
</span>[<a href="#RFC3066bis">RFC3066bis</a>]. <span class="changedspan">
The canonical form of a locale ID uses "_" instead of the "-" used in
RFC3066bis; however, implementations providing APIs for CLDR locale IDs should treat "-" as
equivalent to "_" on input. </span><span class="changed">The most
common format for the base_locale_id is a series of one or more fields of
the form:</span></p>
<p><code><i>language_code </i>("_" <i>script_code)? </i>("_" <i>
territory_code)? </i>("_" <i>variant_code</i>)?</code></p>
<p>The field values are given in the following table. All field values are case-insensitive,
except for the <i>type</i>, which is case-sensitive. However, customarily the language code is
lowercase, the territory and variant codes are uppercase, and the script code is titlecase (that is,
first character uppercase and other characters lowercase). <span>This
convention is used in the file names, which may be case-sensitive depending on the operating
system. Customarily the currency IDs are uppercase and timezone IDs are titlecase by field (as
defined in the timezone database); other key and type codes are lowercase</span>. <span>The <i>
type</i> may also be referred to as a <i>key-value</i>, for clarity.</span></p>
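<p>The customary casing can be applied mechanically. The sketch below is a hypothetical helper,
not part of LDML: it assumes that a four-letter field after the language code is a script code
and that every other later field is a territory or variant code. It also accepts "-" as
equivalent to "_" on input.</p>

```python
def canonical_case(base_locale_id):
    # Accept "-" as a separator on input, but emit the canonical "_".
    fields = base_locale_id.replace("-", "_").split("_")
    result = [fields[0].lower()]              # language code: lowercase
    for field in fields[1:]:
        if len(field) == 4 and field.isalpha():
            result.append(field.title())      # script code: titlecase (assumed 4 letters)
        else:
            result.append(field.upper())      # territory and variant codes: uppercase
    return "_".join(result)

print(canonical_case("zh-hant-tw"))   # zh_Hant_TW
print(canonical_case("EN_us_posix"))  # en_US_POSIX
```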
<p><span class="changedspan">Note that some private use field values may be
given specific values when used with LDML.</span></p>
<table>
<caption>Locale Field Definitions</caption>
<tr>
<th>Field</th>
<th>Allowable Characters</th>
<th>Allowable values</th>
</tr>
<tr>
<td><i>language_code</i></td>
<td>ASCII letters</td>
<td><span class="changedspan">[<a href="#RFC3066bis">RFC3066bis</a>]
subtag values marked as <b>Type: language</b></span><p>
<span class="changedspan"><b>Extensions:</b> In some exceptional cases,
draft [<a href="#ISO639">ISO639</a>] codes may be used in CLDR, if in
the judgment of the technical committee they are essentially assured of
being added. These currently include:</span></p>
<table border="1" width="100%" cellspacing="0" cellpadding="4" style="margin-top: 0.5li; margin-bottom: 0.5li" id="table1">
<tr>
<td><span class="changedspan">cch</span></td>
<td><span class="changedspan">Atsam</span></td>
</tr>
<tr>
<td><span class="changedspan">kaj</span></td>
<td><span class="changedspan">Jju</span></td>
</tr>
<tr>
<td><span class="changedspan">kcg</span></td>
<td><span class="changedspan">Tyap</span></td>
</tr>
<tr>
<td><span class="changedspan">kfo</span></td>
<td><span class="changedspan">Koro</span></td>
</tr>
</table>
<p><i><span class="changedspan">Users should, however, be aware that if
these codes are not accepted into [<a href="#RFC3066bis">RFC3066bis</a>],
they will be replaced by whatever codes are used, or by private use
codes.</span></i></td>
</tr>
<tr>
<td height="191"><i>script_code</i></td>
<td height="191">ASCII letters</td>
<td height="191"><span class="changedspan">[<a href="#RFC3066bis">RFC3066bis</a>]
subtag values marked as <b>Type: script</b></span><p>In most cases the script is not
necessary, since the language is customarily written in only a single script. Examples of
cases where it is used are: </p>
<table border="1" width="100%" cellspacing="0" cellpadding="4">
<tr>
<td>az_Arab</td>
<td>Azerbaijani in Arabic script</td>
</tr>
<tr>
<td>az_Cyrl</td>
<td>Azerbaijani in Cyrillic script</td>
</tr>
<tr>
<td>az_Latn</td>
<td>Azerbaijani in Latin script</td>
</tr>
<tr>
<td>zh_Hans</td>
<td>Chinese, in simplified script</td>
</tr>
<tr>
<td>zh_Hant</td>
<td>Chinese, in traditional script</td>
</tr>
</table>
</td>
</tr>
<tr>
<td><i>territory_code</i></td>
<td>ASCII letters, numbers</td>
<td><span class="changedspan">[<a href="#RFC3066bis">RFC3066bis</a>]
subtag values marked as <b>Type: region</b>, or any UN M.49 code that
doesn't correspond to a [<a href="#RFC3066bis">RFC3066bis</a>]
region subtag.</span><p><span class="changedspan">There are three
private use codes defined in LDML:</span></p>
<table border="1" width="100%" cellspacing="0" cellpadding="4" style="margin-top: 0.5li; margin-bottom: 0.5li" id="table2">
<tr>
<td><span class="changedspan">QO</span></td>
<td><span class="changedspan">Outlying Oceania</span></td>
</tr>
<tr>
<td><span class="changedspan">QU</span></td>
<td><span class="changedspan">European Union</span></td>
</tr>
<tr>
<td><span class="changedspan">ZZ</span></td>
<td><span class="changedspan">Unknown or Invalid Territory</span></td>
</tr>
</table>
</td>
</tr>
<tr>
<td><i>variant_code</i></td>
<td>ASCII letters</td>
<td rowspan="3"><span><i>Values used in CLDR are <span class="changed">
discussed below</span>.</i><b> </b><i>For information
on the process for adding new standard variants or element/type pairs, see [<a href="#localeProject">LocaleProject</a>].</i></span></td>
</tr>
<tr>
<td><i>key</i></td>
<td>ASCII letters and digits</td>
</tr>
<tr>
<td><i>type</i></td>
<td>ASCII letters, digits, and "-"</td>
</tr>
</table>
<p><i>Examples:</i></p>
<blockquote>
<pre>en
fr_BE
de_DE@collation=phonebook,currency=<span>DDM</span></pre>
</blockquote>
<p>The locale id format generally follows the description in the <i>OpenI18N Locale Naming
Guideline</i> [<a href="#NamingGuideline">NamingGuideline</a>], with some enhancements. The main
differences from those guidelines are that the locale id:</p>
<ol type="a">
<li>does not include a charset (since the data in <span>LDML format always provides a
representation of all Unicode characters; the repository is stored in UTF-8, although that can
be transcoded to other encodings as well</span>),</li>
<li>adds the ability to have a variant, as in Java,</li>
<li>adds the ability to discriminate the written language by script (or script variant),</li>
<li>is a superset of [<a href="#RFC3066bis">RFC3066bis</a>] codes.</li>
</ol>
<p class="note"><b>Note:</b> The language + script + territory code combination can itself be
considered simply a language code. For more information, see <i> <a href="#Language_and_Locale_IDs">
Appendix D: Language and Locale IDs</a></i>.</p>
<p>A locale that has only a language code (and possibly a script code) is called a <i>language
locale</i>; one with <span class="changed">both a language and a</span> territory code
is called a <i>territory locale</i> (or <i>
country locale</i>).</p>
<p>The variant codes specify particular variants of the locale, typically with special options.
They cannot overlap with script or territory codes, so they must have either one letter or
more than 4 letters. The currently defined variants include:</p>
<center>
<table style="border-collapse: collapse" cellpadding="0" cellspacing="0">
<caption>Variant Definitions</caption>
<tr>
<th>variant</th>
<th>Description</th>
</tr>
<tr>
<td><span>&lt;RFC 3066bis variants&gt;</span></td>
<td><span class="changedspan"><span>As defined in </span>[<a href="#RFC3066bis">RFC3066bis</a>],
plus:</span></td>
</tr>
<tr>
<td>BOKMAL</td>
<td>Bokmål, variant of Norwegian <span>(deprecated: use nb)</span></td>
</tr>
<tr>
<td>NYNORSK</td>
<td>Nynorsk, variant of Norwegian <span>(deprecated: use nn)</span></td>
</tr>
<tr>
<td>AALAND</td>
<td>Åland, variant of Swedish used in Finland <span>(deprecated: use AX) </span></td>
</tr>
<tr>
<td><span>POSIX</span></td>
<td><span>A POSIX-style invariant locale.</span></td>
</tr>
<tr>
<td><span>REVISED</span></td>
<td><span>For revised orthography</span></td>
</tr>
<tr>
<td>SAAHO</td>
<td>The Saaho variant of <b>Afar</b></td>
</tr>
</table>
</center>
<p><b>Note: </b>The BOKMAL and NYNORSK variants are for backwards compatibility. Typically the
entire contents of these are defined by an &lt;alias&gt; element pointing at the nb_NO (Norwegian Bokmål)
and nn_NO (Norwegian Nynorsk) locale IDs.<span> See also <i> <a href="#valid_attribute_values">Appendix
K: Valid Attribute Values</a></i>.</span></p>
<p><span class="changedspan">The locale IDs corresponding to grandfathered [<a href="#RFC3066bis">RFC3066bis</a>]
language tags are permitted, but not recommended.</span></p>
<p>The currently defined optional key/type combinations include the following. <span>Additional
type values are defined in the detail sections of this document or in </span><span>
<a href="#valid_attribute_values">Appendix K: Valid Attribute Values</a>.
<span class="changedspan">The assignment of values needs to ensure that they
are unique if truncated to 8 letters.</span></span></p>
<table>
<caption>Key/Type Definitions</caption>
<tr>
<th>key</th>
<th>type</th>
<th>Description</th>
</tr>
<tr>
<td rowspan="8">collation</td>
<td>phonebook</td>
<td>For a phonebook-style ordering (used in German).</td>
</tr>
<tr>
<td>pinyin</td>
<td>Pinyin ordering <span>for Latin and</span> for CJK characters <span>(that is, an ordering
for CJK characters based on a character-by-character transliteration into pinyin)</span></td>
</tr>
<tr>
<td>traditional</td>
<td>For a traditional-style sort (as in Spanish)</td>
</tr>
<tr>
<td>stroke</td>
<td><span>Pinyin ordering for Latin, </span>stroke order for CJK characters</td>
</tr>
<tr>
<td>direct</td>
<td>Hindi variant</td>
</tr>
<tr>
<td>posix</td>
<td>A "C"-based locale.</td>
</tr>
<tr>
<td><span>big5han</span></td>
<td><span>Pinyin ordering for Latin, big5 charset ordering for CJK characters.</span></td>
</tr>
<tr>
<td><span>gb2312han</span></td>
<td><span>Pinyin ordering for Latin, gb2312 charset ordering for CJK characters.</span></td>
</tr>
<tr>
<td rowspan="10">calendar*</td>
<td>gregorian</td>
<td>(default)</td>
</tr>
<tr>
<td><span>islamic</span>
<p><span><i>alias:</i> </span>arabic</td>
<td>Astronomical Arabic</td>
</tr>
<tr>
<td>chinese</td>
<td>Traditional Chinese calendar</td>
</tr>
<tr>
<td><span>islamic-civil</span>
<p><span><i>alias:</i> </span>civil-arabic</td>
<td>Civil (algorithmic) Arabic calendar</td>
</tr>
<tr>
<td>hebrew</td>
<td>Traditional Hebrew Calendar</td>
</tr>
<tr>
<td>japanese</td>
<td>Imperial Calendar (same as Gregorian except for the year, with one era for each Emperor)</td>
</tr>
<tr>
<td><span>buddhist</span>
<p><span><i>alias:</i></span> thai-buddhist</td>
<td>Thai Buddhist Calendar (same as Gregorian except for the year)</td>
</tr>
<tr>
<td><span>persian</span></td>
<td><span>Persian Calendar</span></td>
</tr>
<tr>
<td><span><span>coptic</span></span></td>
<td><span>Coptic Calendar</span></td>
</tr>
<tr>
<td><span class="changedspan">ethiopic</span></td>
<td><span class="changedspan">Ethiopic Calendar</span></td>
</tr>
<tr>
<td colspan="2"><span>*For information on the calendar algorithms associated with the data
used with these types, see [<a href="#Calendars">Calendars</a>].</span></td>
</tr>
<tr>
<td><span class="changedspan"><i>collation parameters:</i></span><blockquote>
<p><span class="changedspan">colStrength<br>colAlternate<br>colBackwards<br>colNormalization<br>colCaseLevel<br>colCaseFirst<br>colHiraganaQuaternary<br>colNumeric<br>
variableTop</span></p>
</blockquote>
</td>
<td><span class="changedspan"><i>associated values as defined in: 5.13.1 <a href="#&lt;collation&gt;">&lt;collation&gt;</a></i></span></td>
<td><i><span class="changedspan">semantics as defined in: 5.13.1 <a href="#&lt;collation&gt;">&lt;collation&gt;</a></span></i></td>
</tr>
<tr>
<td>currency</td>
<td>ISO 4217 code</td>
<td>Currency value identified by ISO code<span>, plus others in common use</span>. See <span>
<a href="#valid_attribute_values">Appendix K: Valid Attribute Values</a> and </span>also [<a href="#DataFormats">Data
Formats</a>]</td>
</tr>
<tr>
<td>timezone</td>
<td><i><span class="changedspan">TZ</span>ID</i></td>
<td>Identification for timezone according to the <i><span class="changedspan">TZ </span></i>
Database. See [<a href="#DataFormats">Data Formats</a>].</td>
</tr>
</table>
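<p>The requirement stated before the table, that values remain unique when truncated to 8 letters,
can be checked mechanically. The following sketch is a hypothetical helper, not part of LDML. It
compares values case-sensitively, as the <i>type</i> field is case-sensitive.</p>

```python
def truncation_collisions(values, limit=8):
    """Return pairs of distinct values that become equal when cut to `limit` characters."""
    first_seen = {}
    collisions = []
    for value in values:
        short = value[:limit]
        # setdefault returns the earlier value if this truncation was already recorded.
        earlier = first_seen.setdefault(short, value)
        if earlier != value:
            collisions.append((earlier, value))
    return collisions

collation_types = ["phonebook", "pinyin", "traditional", "stroke",
                   "direct", "posix", "big5han", "gb2312han"]
print(truncation_collisions(collation_types))  # []
```

<p>A hypothetical new type such as <code>traditionalist</code> would be reported as colliding with
<code>traditional</code>, since both truncate to <code>traditio</code>.</p>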
<p class="note"><span>For more information on the allowed attribute values, see the specific
elements below, and <a href="#valid_attribute_values">Appendix K: Valid Attribute Values</a>.</span></p>
<p><span class="changedspan">CLDR Locale IDs can be converted to valid <span>
RFC 3066bis language tags by performing the following steps:</span></span></p>
<ul>
<li style="margin-top: 0.5li; margin-bottom: 0.5li">
<span class="changedspan"><span>Convert any deprecated codes into the
regular equivalents (thus BOKMAL is replaced by nb).</span></span></li>
<li style="margin-top: 0.5li; margin-bottom: 0.5li">
<span class="changedspan"><span>Convert the "_" separators into "-"</span></span></li>
<li style="margin-top: 0.5li; margin-bottom: 0.5li">
<span class="changedspan"><span style="background-color: #FFFF00">Insert
"x-ldml-" in front of the first field that cannot be validly represented in RFC 3066bis</span></span></li>
<li style="margin-top: 0.5li; margin-bottom: 0.5li">
<span class="changedspan"><span style="background-color: #FFFF00">Remove
any non-alphanumerics from identifiers (thus islamic-civil is replaced
by islamiccivil).</span></span></li>
<li style="margin-top: 0.5li; margin-bottom: 0.5li">
<span class="changedspan">If there are any keyword-type pairs, insert
<span style="background-color: #FFFF00">"x-ldml-" if not already present</span>,
replace "@" and "," by "-k-", and change "=" to "-"</span></li>
<li style="margin-top: 0.5li; margin-bottom: 0.5li">
<span class="changedspan">Truncate each subtag to 8
characters</span></li>
</ul>
<p><span class="changedspan">Thus for example, we get the following
conversion:</span></p>
<table border="1" cellspacing="0" cellpadding="4" style="margin-top: 0.5li; margin-bottom: 0.5li" id="table3">
<tr>
<td><span class="changedspan"><span style="background-color: #FFFF00">
CLDR</span></span></td>
<td><span class="changedspan">en_US_POSIX@calendar=islamic,collation=traditional,colStrength=secondary</span></td>
</tr>
<tr>
<td><span class="changedspan"><span style="background-color: #FFFF00">
RFC3066bis</span></span></td>
<td><span class="changedspan">
en-US-x-<span style="background-color: #FFFF00">ldml</span>-POSIX-k-calendar-islamic-k-collatio-traditio-k-colStren-secondar</span></td>
</tr>
</table>
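<p>The transformation steps can be sketched in code. The following Python is a minimal illustration, not a reference implementation. The function name <code>cldr_to_rfc3066bis</code> is hypothetical. Replacement of deprecated codes (the first step) is omitted because it requires CLDR data. The sketch also assumes that only the language, script, and region fields are representable in RFC 3066bis, so every later field is placed after "x-ldml-".</p>

```python
import re


def cldr_to_rfc3066bis(locale_id):
    """Sketch of the CLDR locale ID to RFC 3066bis conversion (deprecated codes not handled)."""
    base, _, keywords = locale_id.partition("@")
    fields = base.split("_")              # the "_" separators become "-" when rejoined below

    subtags = [fields[0]]                 # language
    rest = fields[1:]
    # Assumption: a 4-letter field is a script, a 2-letter or 3-digit field is a region.
    if rest and re.fullmatch(r"[A-Za-z]{4}", rest[0]):
        subtags.append(rest.pop(0))
    if rest and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", rest[0]):
        subtags.append(rest.pop(0))

    private = list(rest)                  # fields assumed not representable, e.g. POSIX
    if keywords:
        for pair in keywords.split(","):  # "@" and "," become "-k-", "=" becomes "-"
            key, _, value = pair.partition("=")
            private += ["k", key, value]
    if private:
        subtags += ["x", "ldml"] + private

    # Remove non-alphanumerics from each identifier, then truncate to 8 characters.
    return "-".join(re.sub(r"[^A-Za-z0-9]", "", s)[:8] for s in subtags)


print(cldr_to_rfc3066bis("ar@calendar=islamic-civil"))
```

<p>Because truncation is applied last, islamic-civil first becomes islamiccivil and is then cut to islamicc.</p>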
<h3><span class="changedspan">3.1 <a name="Unknown_or_Invalid_Identifiers">
Unknown or Invalid Identifiers</a></span></h3>
<p><span class="changedspan">The following identifiers are used to indicate
an unknown or invalid code in CLDR. The Region and Timezone codes are
additional codes provided by CLDR; the others are defined by the relevant
standards. When these codes are used in APIs connected with CLDR, the
meaning is that either there was no identifier available, or that at some
point an input identifier value was determined to be invalid or ill-formed.</span></p>
<table border="1" cellspacing="0" cellpadding="4" style="margin-top: 0.5li; margin-bottom: 0.5li" id="table4">
<tr>
<th><span class="changedspan">Code Type</span></th>
<th><span class="changedspan">Value</span></th>
<th><span class="changedspan">Description</span></th>
</tr>
<tr>
<td><span class="changedspan">Language </span></td>
<td><span class="changedspan">und</span></td>
<td><span class="changedspan">Undetermined language</span></td>
</tr>
<tr>
<td><span class="changedspan">Script</span></td>
<td><span class="changedspan">Zyyy</span></td>
<td><span class="changedspan">Code for undetermined script</span></td>
</tr>
<tr>
<td><span class="changedspan">Region </span></td>
<td><span class="changedspan">ZZ</span></td>
<td><span class="changedspan">Unknown or Invalid Territory</span></td>
</tr>
<tr>
<td><span class="changedspan">Currency</span></td>
<td><span class="changedspan">XXX</span></td>
<td><span class="changedspan">The codes assigned for transactions
where no currency is involved</span></td>
</tr>
<tr>
<td><span class="changedspan">Timezone</span></td>
<td><span class="changedspan">Etc/Unknown</span></td>
<td><span class="changedspan">Unknown or Invalid Timezone</span></td>
</tr>
</table>
<p><span class="changed">When only the script or region is known, a
locale ID will use "und" as the language subtag portion. Thus the locale tag
"und-Grek" represents the Greek script; "und-US" represents the US
territory.</span></p>
<h2>4. <a name="Locale_Inheritance">Locale Inheritance</a></h2>
<p>The XML format relies on an inheritance model, whereby the resources are collected into <i>
bundles</i>, and the bundles organized into a tree. Data for the many Spanish locales does not
need to be duplicated across all of the countries having Spanish as a national language. Instead,
common data is collected in the Spanish language locale, and territory locales only need to supply
differences. The parent of all of the language locales is a generic locale known as <i>root</i>.
Wherever possible, the resources in the root are language & territory neutral. For example, the
collation (sorting) order in the root is the default Unicode Collation
Algorithm order (see [<a href="#UCA">UCA</a>]). Since English language collation has the
same ordering, the 'en' locale data does not need to supply any collation data, nor does either
the 'en_US' or the 'en_IE' locale data.</p>
<p>Given a particular locale id "en_US_someVariant", the search chain for a particular resource is
the following.</p>
<blockquote>
<pre>en_US_someVariant
en_US
en
root</pre>
</blockquote>
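<p>The search chain can be derived mechanically from the locale id by dropping trailing fields. The following Python is a minimal sketch; the function name <code>search_chain</code> is hypothetical, and keywords in the id are not handled.</p>

```python
def search_chain(locale_id):
    """Return the bundles to search, most specific first, ending at root."""
    if locale_id == "root":
        return ["root"]
    fields = locale_id.split("_")
    # Drop one trailing field at a time: en_US_someVariant, en_US, en.
    chain = ["_".join(fields[:n]) for n in range(len(fields), 0, -1)]
    chain.append("root")
    return chain


print(search_chain("en_US_someVariant"))
# ['en_US_someVariant', 'en_US', 'en', 'root']
```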
<p>If a type and key are supplied in the locale id, then logically the chain from that id to the
root is searched for a resource tag with a given type, all the way up to root. If no resource is
found with that tag and type, then the chain is searched again without the type.</p>
<p>Thus the data for any given locale will only contain resources that are different from the
parent locale. For example, most territory locales will inherit the bulk of their data from the
language locale: "en" will contain the bulk of the data; "en_US" will only contain a few items
like currency. All data that is inherited from a parent is presumed to be valid, just as valid as
if it were physically present in the file. This provides for much smaller resource bundles, and
much simpler (and less error-prone) maintenance.</p>
<p>Where this inheritance relationship does not match a target system, such as POSIX, the data
logically should be fully resolved in converting to a format for use by that system, by adding <i>
all</i> inherited data to each locale data set.</p>
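<p>Full resolution can be pictured as overlaying the bundles from root down to the target locale. The Python below is a sketch under simplifying assumptions: each bundle is modeled as a flat dictionary, the function name <code>fully_resolve</code> and the sample data are hypothetical, and aliases and keywords are ignored.</p>

```python
def fully_resolve(locale_id, bundles):
    """Overlay bundles from root to the most specific locale."""
    fields = locale_id.split("_")
    chain = ["root"] + ["_".join(fields[:n]) for n in range(1, len(fields) + 1)]
    resolved = {}
    for name in chain:                      # root first, most specific last
        resolved.update(bundles.get(name, {}))
    return resolved


# Illustrative data only; real bundles are XML files.
bundles = {
    "root": {"collation": "UCA default order", "currency": "XXX"},
    "en": {"months": "January ... December"},
    "en_US": {"currency": "USD"},
}
print(fully_resolve("en_US", bundles))
```

<p>The resolved result for en_US contains the collation entry from root and the months entry from en, although neither is physically present in the en_US bundle.</p>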
<p>For a more complete description of how inheritance applies to data, and the use of keywords,
see <a href="#Inheritance_and_Validity">Inheritance_and_Validity</a>.</p>
<p>The locale data does not contain general character properties that are derived from the <i>
Unicode Character Database</i> [<a href="ftp://ftp.unicode.org/Public/UNIDATA/UnicodeCharacterDatabase.html">UCD</a>].
That data being common across locales, it is not duplicated in the bundles. Constructing a POSIX
locale from the following data requires use of <span>UCD</span> data. In addition, POSIX locales
may also specify the character encoding, which requires the data to be transformed into that
target encoding.</p>
<p><span class="changedspan"><b>Warning: </b>If a locale has a different script than its parent
(e.g., sr_Latn), then special attention must be paid to make sure that all inheritance is covered.
For example, auxiliary exemplar characters may need to be empty ("[]") to block inheritance.</span></p>
<h3>4.1 <a name="Multiple_Inheritance">Multiple Inheritance</a></h3>
<p>In clearly specified instances, resources may inherit from within the same locale. For example,
currency format symbols inherit from the number format symbols; the Buddhist calendar inherits
from the Gregorian calendar. This <i>only</i> happens where documented in this specification. In
these special cases, the inheritance <span>functions as normal, up to the root. If the data is not
found along that path, then a second search is made, logically changing the element/attribute to
the alternate values.</span></p>
<p><span>For example, for the locale "en_US" the month data in <calendar type="<span style="color: blue">buddhist</span>">
inherits first from <calendar type="<span style="color: blue">buddhist</span>"> in "en", then in
"root". If not found there, then it inherits from <calendar type="<span style="color: blue">gregorian</span>">
in "en_US", then "en", then in "root".</span></p>
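<p>The two-pass search described above can be sketched as follows. This Python is illustrative only: bundles are modeled as nested dictionaries keyed by calendar type, and the names <code>find_calendar_data</code> and <code>search_chain</code>, as well as the sample data, are hypothetical.</p>

```python
def search_chain(locale_id):
    fields = locale_id.split("_")
    return ["_".join(fields[:n]) for n in range(len(fields), 0, -1)] + ["root"]


def find_calendar_data(locale_id, calendar, key, bundles, alternate="gregorian"):
    """First search the whole chain for the requested calendar, then for the alternate."""
    for cal in (calendar, alternate):
        for name in search_chain(locale_id):
            value = bundles.get(name, {}).get(cal, {}).get(key)
            if value is not None:
                return name, cal, value
    return None


# Illustrative data only.
bundles = {
    "root": {"gregorian": {"months": "M01 ... M12"}, "buddhist": {"eras": "BE"}},
    "en": {"gregorian": {"months": "January ... December"}},
    "en_US": {},
}
print(find_calendar_data("en_US", "buddhist", "months", bundles))
print(find_calendar_data("en_US", "buddhist", "eras", bundles))
```

<p>In this sample the Buddhist month data is not found anywhere on the path to root, so the second pass finds the Gregorian months in "en". The era data is found in the first pass, in root.</p>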
<h2>5. <a name="XML_Format">XML Format</a></h2>
<p><span class="changedspan">There are two kinds of data that can be expressed
in LDML: language-dependent data and supplemental data. In either case,
data can be split across multiple files, which can be in multiple directory trees.</span></p>
<p><span class="changedspan">For example, the language-dependent data for
Japanese in CLDR is present in the following files:</span></p>
<ul>
<li><span class="changedspan">common/collation/ja.xml</span></li>
<li><span class="changedspan">common/main/ja.xml</span></li>
<li><span class="changedspan">common/segmentations/ja.xml</span></li>
</ul>
<p><span class="changedspan">The status of the data is the same, whether or
not data is split. That is, for the purpose of validation and lookup, all of
the data for the above ja.xml files is treated as if it were in a single
file.</span></p>
<p><span class="changedspan">Supplemental data relating to Japan or the
Japanese writing system can be found in:</span></p>
<ul>
<li><span class="changedspan">common/supplemental/supplementalData.xml</span></li>
<li><span class="changedspan">common/transforms/Hiragana-Katakana.xml</span></li>
<li><span class="changedspan">common/transforms/Hiragana-Latin.xml</span></li>
<li><span class="changedspan">...</span></li>
</ul>
<p>The following sections describe the structure of the XML format for<span class="changed">
language-dependent </span>data. <span>The more
precise syntax is in the </span><span>DTD</span><span>, listed at the top of this document<span class="changed"><i>;
however, the DTD does not describe all the constraints on the structure.</i></span></span></p>
<p>To start with, the root element is <ldml>, with the following DTD entry:</p>
<p class="example"><span class="dtd"><!ELEMENT ldml (identity, (alias |(localeDisplayNames?,
layout?, characters?, delimiters?, measurement?, dates?, numbers?, collations?, posix?,
special*))) ></span></p>
<p>That element contains the following elements:</p>
<ul>
<li><a href="#<identity>"><identity></a></li>
<li><a href="#<localeDisplayNames>"><localeDisplayNames></a></li>
<li><a href="#<layout>"><layout></a></li>
<li><a href="#<characters>"><characters></a></li>
<li><a href="#<delimiters>"><delimiters></a></li>
<li><a href="#<measurement>"><measurement></a></li>
<li><a href="#<dates>"><dates></a> </li>
<li><a href="#<numbers>"><numbers></a> </li>
<li><a href="#<collations>"><collations></a> </li>
<li><a href="#<posix>"><posix></a></li>
</ul>
<p>The structure of each of these elements and their contents will be described below. The first
few elements have little structure, while dates, numbers, and collations are more involved.</p>
<p><span class="changed">The XML structure is stable over releases. Elements
and attributes may be deprecated: they are retained in the DTD but their
usage is strongly discouraged. In most cases, an alternate structure is
provided for expressing the information.</span></p>
<p>In general, all translatable text in this format is in element contents, while attributes are
reserved for types and non-translated information (such as numbers or dates). The reason that
attributes are not used for translatable text is that spaces are not preserved, and we cannot
predict where spaces may be significant in translated material.</p>
<p><span>There are two kinds of elements in LDML: <i>rule</i> elements and <i>structure</i>
elements. For structure elements, there are restrictions to allow for effective inheritance and
processing:</span></p>
<ol>
<li><span>There is no "mixed" content: if an element has textual content, then it cannot contain
any elements.</span></li>
<li><span>The XPath leading to the content is unique; no two different pieces of textual content
have the same XPath.</span></li>
</ol>
<p><span>Rule elements do not have this restriction, but also do not inherit, except as an
entire block. <span class="changed">The rule elements are listed in
serialElements in the supplemental metadata. </span>See also <a href="#Inheritance_and_Validity">Appendix I:
Inheritance and Validity</a>.</span></p>
<p>Note that the data in examples given below is purely illustrative, and doesn't match any
particular language. For a more detailed example of this format, see [<a href="#LDML">Example</a>].
There is also a DTD for this format, but <i>remember that the DTD alone is not sufficient to
understand the semantics, the constraints, nor the interrelationships between the different
elements and attributes</i>. You may wish to have copies of each of these to hand as you proceed
through the rest of this document.</p>
<p><span>In particular, all elements allow for draft versions to coexist in the file at the same
time. Thus <span class="changed">most</span> elements are marked
in the DTD as allowing multiple instances. However, unless an element is
listed as a serialElement, or has a distinguishing attribute, it can only
occur once as a subelement of a given element. <span class="changed">Thus,
for example, the following is illegal even though allowed by the DTD:</span></span></p>
<p><span class="changed"><languages><br>
<language type="aa">...</language><br>
<language type="aa">...</language><br>
</languages></span></p>
<p><span><span class="removedspan">These are:</span></span></p>
<ul>
<li><span class="removedspan"><span>exemplarCharacters (can occur twice), with two different type values)</span></span></li>
<li><span class="removedspan"><span>quotationStart, quotationEnd, alternateQuotationStart, alternateQuotationEnd,</span></span></li>
<li><span class="removedspan"><span>am, pm</span></span></li>
<li><span class="removedspan"><span>measurementSystem, paperSize</span></span></li>
<li><span class="removedspan"><span>localizedPatternChars</span></span></li>
<li><span class="removedspan"><span>pattern</span></span></li>
</ul>
<p><span>There must be only one instance of these per parent<span class="changed">,
unless there are other distinguishing attributes (such as an alt attribute)
</span><span class="removedspan">that doesn't have an alternate attribute</span>.</span></p>
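<p>A check for the uniqueness rule might look like the following Python sketch. It treats every attribute as distinguishing and knows nothing about serialElements, which is a simplification; the actual set of distinguishing attributes is defined elsewhere in this specification. The function name <code>duplicate_children</code> is hypothetical.</p>

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_children(parent):
    """Return the (tag, attributes) keys that occur more than once under parent."""
    counts = Counter(
        (child.tag, tuple(sorted(child.attrib.items()))) for child in parent
    )
    return [key for key, count in counts.items() if count > 1]


languages = ET.fromstring(
    '<languages>'
    '<language type="aa">...</language>'
    '<language type="aa">...</language>'
    '<language type="ab">...</language>'
    '</languages>'
)
print(duplicate_children(languages))
# [('language', (('type', 'aa'),))]
```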
<p><span class="changedspan">In general, data should be in NFC format.
Exceptions to this include transforms, segmentations, and pc/sc/tc/qc/ic
rules in collation. Thus LDML documents must not be normalized as a whole.
To prevent problems with normalization, no element value can start with a combining
slash (U+0338 COMBINING LONG SOLIDUS OVERLAY).</span></p>
<p><span class="changedspan">Lists, such as </span><span class="attribute">
singleCountries</span><span class="changedspan">, are space-delimited. That
means that they are separated by one or more XML whitespace characters, and
that leading and trailing spaces are to be ignored (that is, they behave
like NMTOKENS). These include:</span></p>
<ul>
<li><span class="changedspan">singleCountries</span></li>
<li><span class="changedspan">preferenceOrdering</span></li>
<li><span class="changedspan">references</span></li>
<li><span class="changedspan">validSubLocales</span></li>
</ul>
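<p>Splitting such a list value is straightforward. The Python sketch below restricts itself to the four XML whitespace characters (space, tab, carriage return, line feed) rather than relying on a general whitespace split; the function name <code>split_list</code> is hypothetical.</p>

```python
import re

XML_WHITESPACE = " \t\r\n"


def split_list(value):
    """Split a space-delimited attribute value, ignoring leading and trailing whitespace."""
    stripped = value.strip(XML_WHITESPACE)
    return re.split("[ \t\r\n]+", stripped) if stripped else []


print(split_list("  de_AT de_CH\n\tde_DE "))
# ['de_AT', 'de_CH', 'de_DE']
```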
<h3>5.1 <a name="Common_Elements">Common Elements</a></h3>
<p>At any level in any element, two special elements are allowed.</p>
<p class="element2"><<a name="special">special</a> xmlns:yyy="<span style="color: blue">xxx</span>"></p>
<p>This element is designed to allow for arbitrary additional annotation and data that is
product-specific. It has one required attribute, which specifies the XML
<a href="http://www.w3.org/TR/REC-xml-names/">namespace</a> of the special data. For example, the
following uses the version 1.0 POSIX special element.</p>
<pre><!DOCTYPE ldml SYSTEM "<span style="color: blue">http://unicode.org/cldr/dtd/1.0/ldml.dtd</span>" [
<!ENTITY % posix SYSTEM "<span style="color: blue">http://unicode.org/cldr/dtd/1.0/ldmlPOSIX.dtd</span>">
<span style="color: blue">%<span>posix</span>;</span>
]>
<ldml>
...
<special xmlns:posix="<span style="color: blue">http://www.opengroup.org/regproducts/xu.htm</span>">
<span style="color: green"><!-- old abbreviations for pre-GUI days --></span>
<posix:messages>
<posix:yesstr><span style="color: blue">Yes</span></posix:yesstr>
<posix:nostr><span style="color: blue">No</span></posix:nostr>
<posix:yesexpr><span style="color: blue">^[Yy].*</span></posix:yesexpr>
<posix:noexpr><span style="color: blue">^[Nn].*</span></posix:noexpr>
</posix:messages>
</special>
</ldml></pre>
<p class="element2"><b><<a name="alias">alias</a> source="</b><span style="color: blue"><locale_ID></span><b>"<span>
<span>path="..."</span>/</span>></b></p>
<p>The contents of any element can be replaced by an alias, which points to another source for the
data. The elements in that source are to be fetched from the corresponding location in the other
source. Normal resource searching is to be used; take the following example:</p>
<pre><ldml>
<collations>
<collation type="<span style="color: blue">phonebook</span>">
<alias source="<span style="color: blue">de_DE</span>"/>
</collation>
</collation<span>s</span>>
</ldml></pre>
<p>The resource bundle at "de_DE" will be searched for a collation element at the same position in
the tree with type "phonebook". If not found there, then the resource bundle at "de" will be
searched, etc. <span>For an example of how this works with inheritance, look at the following
table (where </span><span class="inherited">green</span><span> indicates
inherited items). <i>Note in particular that an alias "reroutes" the inheritance; nothing in the
parent affects the contents of an item with an alias. Thus the
</i></span><span class="blockedInherited">red</span><span><i> item below is blocked.</i></span></p>
<table border="1" cellpadding="0" cellspacing="1" class="noborder">
<caption><span>Inheritance with Aliases</span></caption>
<tr>
<th width="20%"><span>en</span></th>
<th width="20%"><span>en_US</span></th>
<th width="20%" bgcolor="#C0C0C0"><span>Resolved</span></th>
</tr>
<tr>
<td width="20%"><span><code><x><br>
<a>01</a><br>
<b>02</b><br>
<c>03</c><br>
</x></code></span></td>
<td width="20%"><span><code><x><br>
<br>
<b>12</b><br>
<br>
</x></code></span></td>
<td width="20%" bgcolor="#C0C0C0"><span><code><x><br>
<span class="inherited"> <span style="font-weight: 700; "><a>01</a></span></span><br>
<b>12</b><br>
<span class="inherited"> <span style="font-weight: 700; "><c>03</c></span></span><br>
</x></code></span></td>
</tr>
<tr>
<th width="20%"><span>de</span></th>
<th width="20%"><span>de_DE</span></th>
<th width="20%" bgcolor="#C0C0C0"><span>Resolved</span></th>
<th width="20%"><span>de_DE_1901</span></th>
<th width="20%" bgcolor="#C0C0C0"><span>Resolved</span></th>
</tr>
<tr>
<td width="20%"><span><code><x><br>
<a>21</a><br>
<b>22</b><br>
<c>23</c><br>
<span class="blockedInherited"> <span style="font-weight: 700; "><d>23</d></span></span><br>
</x></code></span></td>
<td width="20%"><span><code><x><br>
<alias source="en_US"/><br>
</x></code></span></td>
<td width="20%" bgcolor="#C0C0C0"><span><code><x><br>
<a>01</a><br>
<b>12</b><br>
<c>03</c><br>
</x></code></span></td>
<td width="20%"><span><code><x><br>
<a>41</a><br>
<br>
<br>
<br>
</x></code></span></td>
<td width="20%" bgcolor="#C0C0C0"><span><code><x><br>
<a>41</a><br>
<span class="inherited"> <span style="font-weight: 700; "><b>12</b></span></span><br>
<span class="inherited"> <span style="font-weight: 700; "><c>03</c></span></span><br>
</x></code></span></td>
</tr>
</table>
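<p>The rerouting shown in the table can be reproduced with a short sketch. In the Python below, the contents of the <x> element in each locale are modeled as a dictionary, and an alias is modeled as the hypothetical key <code>"alias"</code>. The function names are also hypothetical. There is no guard against circular aliases.</p>

```python
def search_chain(locale_id):
    fields = locale_id.split("_")
    return ["_".join(fields[:n]) for n in range(len(fields), 0, -1)] + ["root"]


def resolve_x(locale_id, bundles):
    """Resolve the children of <x>, following an alias instead of the parent locale."""
    resolved = {}
    for name in search_chain(locale_id):
        content = bundles.get(name, {})
        if "alias" in content:
            # The alias reroutes inheritance: continue in the alias source
            # and ignore everything further up this locale's own chain.
            for key, value in resolve_x(content["alias"], bundles).items():
                resolved.setdefault(key, value)
            break
        for key, value in content.items():
            resolved.setdefault(key, value)
    return resolved


# The contents of <x> from the table above.
bundles = {
    "en": {"a": "01", "b": "02", "c": "03"},
    "en_US": {"b": "12"},
    "de": {"a": "21", "b": "22", "c": "23", "d": "23"},
    "de_DE": {"alias": "en_US"},
    "de_DE_1901": {"a": "41"},
}
print(resolve_x("de_DE_1901", bundles))
```

<p>Nothing from "de" appears in the resolved data for de_DE or de_DE_1901, because the alias in de_DE is reached first and the search never continues to "de".</p>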
<p><span>If the <b>path</b> attribute is present, then its value is an XPath that points to a
different node in the tree. For example:</span></p>
<pre><span><alias source="root" path="../monthWidth[@type='wide']"/></span></pre>
<p><span>The default value if the path is not present is the same position in the tree. </span>
<span><span>All of the attributes in the XPath must be <i>distinguishing</i> attributes. </span>
</span>For more details, see <a href="#Inheritance_and_Validity">Appendix I: Inheritance and
Validity</a>.</p>
<p><span>There is a special value for the source attribute, the constant <b>source="locale"</b>,
which is the default value. This special value is equivalent to the locale being resolved. For
example, consider the following table, where locale data for 'de' is being resolved:</span></p>
<div align="center">
<center>
<table border="1" cellpadding="0" cellspacing="1">
<caption><span>Inheritance with source="locale"</span></caption>
<tr>
<th width="33%"><span>Root</span></th>
<th width="33%"><span>de</span></th>
<th width="33%" bgcolor="#C0C0C0"><span>Resolved</span></th>
</tr>
<tr>
<td><span><code><x><br>
<a>1</a><br>
<b>2</b><br>
<c>3</c><br>
</x></code></span><code> </code></td>
<td><span><code><x><br>
<a>11</a><br>
<b>12</b><br>
<d>14</d><br>
</x></code></span><code> </code></td>
<td bgcolor="#C0C0C0"><span><code><x><br>
<a>11</a><br>
<b>12</b><br>
<span class="inherited"><span style="font-weight: 700; "><c>3</c></span></span><span style="background-color: #CC0000"><br>
</span> <d>14</d><br>
</x></code></span><code> </code></td>
</tr>
<tr>
<td><span><code><y><br>
<alias path="../x"/><br>
</y></code></span><code> </code></td>
<td><span><code><y><br>
<b>22</b><br>
<e>25</e><br>
</y></code></span><code> </code></td>
<td bgcolor="#C0C0C0"><span><code><y><br>
<span class="inherited"><span style="font-weight: 700; "><a>11</a></span></span><span style="background-color: #CC0000"><br>
</span> <b>22</b><br>
<span class="inherited"><span style="font-weight: 700; "><c>3</c></span></span><span style="background-color: #CC0000"><br>
</span> <span class="inherited"><span style="font-weight: 700; "><d>14</d></span></span><span style="background-color: #CC0000"><br>
</span> <e>25</e><br>
</y></code></span><code> </code></td>
</tr>
</table>
</center>
</div>
<p><span><span>The first row shows the inheritance within the <x> element, whereby <c> is
inherited from root. The second shows the inheritance within the <y> element, whereby <a>, <c>,
and <d> are inherited also from root, but from an alias there. The alias in root is logically
replaced not by the elements in root itself, but by elements in the 'target' locale.</span></span></p>
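<p>The behavior of source="locale" can be sketched as follows. In this Python illustration, an alias with a path is modeled as the hypothetical key <code>"alias-path"</code> whose value is simply the name of the sibling element, standing in for the XPath "../x". The function name <code>resolve</code> is hypothetical as well.</p>

```python
def resolve(element, locale_id, bundles):
    """Resolve an element; an alias restarts the search in the locale being resolved."""
    fields = locale_id.split("_")
    chain = ["_".join(fields[:n]) for n in range(len(fields), 0, -1)] + ["root"]
    resolved = {}
    for name in chain:
        content = bundles.get(name, {}).get(element, {})
        if "alias-path" in content:
            # source="locale": resolve the target element for locale_id,
            # not for the locale in which the alias was found.
            target = resolve(content["alias-path"], locale_id, bundles)
            for key, value in target.items():
                resolved.setdefault(key, value)
            break
        for key, value in content.items():
            resolved.setdefault(key, value)
    return resolved


# The data from the table above.
bundles = {
    "root": {"x": {"a": "1", "b": "2", "c": "3"}, "y": {"alias-path": "x"}},
    "de": {"x": {"a": "11", "b": "12", "d": "14"}, "y": {"b": "22", "e": "25"}},
}
print(resolve("y", "de", bundles))
```

<p>The resolved <y> for 'de' takes a and d from the <x> element of 'de', not from root, because the alias is evaluated against the locale being resolved.</p>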
<p>For more details <span>on data resolution</span>, see <a href="#Inheritance_and_Validity">
Appendix I: Inheritance and Validity</a>.</p>
<p><span>It is an error to have a circular chain of aliases. That is, a collection of LDML XML
documents must not have situations where a sequence of alias lookups (including inheritance and
multiple inheritance) can be followed indefinitely without terminating.</span></p>
<p class="element2"><displayName></p>
<p>Many elements can have a display name. This is a translated name that can be presented to users
when discussing the particular service. For example, a number format, used to format numbers using
the conventions of that locale, can have a translated name for presentation in GUIs.</p>
<pre> <numberFormat>
<displayName><span style="color: blue">Prozentformat</span></displayName>
...
</numberFormat></pre>
<p><span>Where present, the display names must be unique; that is, two distinct codes must not get
the same display name. </span><span class="changedspan">(There is one exception to this: in
timezones, where parsing results would give the same GMT offset, the standard and daylight display
names can be the same across different timezone IDs.) </span><span>Any translations should follow customary practice for the locale in
question. For more information, see [<a href="#DataFormats">Data Formats</a>].</span></p>
<p class="element2"><default type="<span style="color: blue">someID</span>"/></p>
<p>In some cases, a number of elements are present. The default element can be used to indicate
which of them is the default, in the absence of other information. The value of the type attribute
is to match the value of the type attribute for the selected item.</p>
<pre><span><timeFormats>
<default type="<span style="color: red">medium</span>" />
<timeFormatLength type="<span style="color: blue">full</span>">
<timeFormat type="<span style="color: blue">standard</span>">
<pattern type="<span style="color: blue">standard</span>"><span style="color: blue">h:mm:ss a z</span></pattern>
</timeFormat>
</timeFormatLength>
<timeFormatLength type="<span style="color: blue">long</span>">
<timeFormat type="<span style="color: blue">standard</span>">
<pattern type="<span style="color: blue">standard</span>"><span style="color: blue">h:mm:ss a z</span></pattern>
</timeFormat>
</timeFormatLength>
<timeFormatLength type="<span style="color: red">medium</span>">
<timeFormat type="<span style="color: blue">standard</span>">
<pattern type="<span style="color: blue">standard</span>"><span style="color: blue">h:mm:ss a</span></pattern>
</timeFormat>
</timeFormatLength>
...</span></pre>
<p>Like all other elements, the <default> element is inherited. Thus, it can also refer to
inherited resources. For example, suppose that the above resources are present in fr, and that in
fr_BE we have the following:</p>
<pre><span><timeFormats>
<default type="<span style="color: red">long</span>"/>
</timeFormats></span></pre>
<p>In that case, the default <span>time</span> format for fr_BE would be the inherited "<span>long</span>"
resource from fr. Now suppose that we had in fr_CA:</p>
<pre><span> <timeFormatLength type="<span style="color:red">medium</span>">
<timeFormat type="<span style="color: blue">standard</span>">
<pattern type="<span style="color: blue">standard</span>"><span style="color: blue">...</span></pattern>
</timeFormat>
</timeFormatLength>
</span></pre>
<p>In this case, the <default> is inherited from fr, and has the value "<span>medium</span>". It
thus refers to this new "<span>medium</span>" pattern in this resource bundle.</p>
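<p>The interaction between an inherited <default> and inherited resources can be sketched as two independent lookups along the search chain. The Python below is illustrative: bundles are flat dictionaries, the function names are hypothetical, and the fr_CA pattern, which is elided in the example above, is represented by a placeholder string.</p>

```python
def search_chain(locale_id):
    fields = locale_id.split("_")
    return ["_".join(fields[:n]) for n in range(len(fields), 0, -1)] + ["root"]


def first_value(locale_id, key, bundles):
    """Return the first value for key found along the search chain."""
    for name in search_chain(locale_id):
        if key in bundles.get(name, {}):
            return bundles[name][key]
    return None


def default_time_pattern(locale_id, bundles):
    # Lookup 1: which type is the default? Lookup 2: the pattern for that type.
    default_type = first_value(locale_id, "default", bundles)
    pattern = first_value(locale_id, ("timeFormatLength", default_type), bundles)
    return default_type, pattern


bundles = {
    "fr": {
        "default": "medium",
        ("timeFormatLength", "full"): "h:mm:ss a z",
        ("timeFormatLength", "long"): "h:mm:ss a z",
        ("timeFormatLength", "medium"): "h:mm:ss a",
    },
    "fr_BE": {"default": "long"},
    "fr_CA": {("timeFormatLength", "medium"): "(fr_CA pattern)"},  # placeholder
}
print(default_time_pattern("fr_BE", bundles))
print(default_time_pattern("fr_CA", bundles))
```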
<h4>5.1.1 <a name="Escaping_Characters">Escaping Characters</a></h4>
<p>Unfortunately, XML does not have the capability to contain all Unicode code points. Due to
this, <span class="changedspan">in certain instances</span> extra syntax is required to represent those code points that cannot be otherwise represented
in element content. <span class="removedspan">This also must be used where spaces are significant (otherwise they can be
stripped).</span></p>
<table>
<caption>Escaping Characters</caption>
<tr>
<th>Code Point</th>
<th>XML Example</th>
</tr>
<tr>
<td><code>U+0000</code></td>
<td><code><cp hex="0"></code></td>
</tr>
</table>
<p class="note"><span class="removedspan"><b>Note: </b>This is not necessary in XML 1.1, except for NULL (U+0000), which is
typically never used. However, for backwards compatibility with XML 1.0 systems it is best for
some time to come to use these special escapes. </span> <span>These escapes are only allowed in certain
elements, according to the DTD.</span></p>
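<p>A consumer of the data needs to expand these escapes when reading element content. The following Python is a minimal sketch; the function name <code>text_with_cp</code> and the wrapper element <code>x</code> are hypothetical, and no check is made that the escape is permitted in the given element.</p>

```python
import xml.etree.ElementTree as ET


def text_with_cp(elem):
    """Return the element's text with each <cp hex="..."/> child replaced by its code point."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "cp":
            parts.append(chr(int(child.get("hex"), 16)))
        parts.append(child.tail or "")   # text following the child element
    return "".join(parts)


sample = ET.fromstring('<x>a<cp hex="0"/>b</x>')
print(repr(text_with_cp(sample)))
# 'a\x00b'
```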
<h3>5.2 <a name="Common_Attributes">Common Attributes</a></h3>
<p class="element2"><... type="<span style="color: blue">stroke</span>" ...></p>
<p>The attribute <i>type</i> is also used to indicate an alternate resource that can be selected
with a matching type=option in the locale id modifiers, or be referenced by a default element. For
example:</p>
<pre><ldml>
...
<currencies>
<currency><span style="color: blue">...</span></currency>
<currency type="<span style="color: blue">preEuro</span>"><span style="color: blue">...</span></currency>
</currencies>
</ldml></pre>
<p class="element2"><... draft="<span class="changedspan"><span style="color: #0000FF">unconfirmed</span></span>" ...></p>
<p>If this attribute is present, it indicates the status of all the data in this element and any
subelements (unless they have a contrary <i>draft</i> value)<span class="changedspan">,
as per the following:</span></p>
<ul>
<li style="margin-top: 0.5li; margin-bottom: 0.5li">
<span style="BACKGROUND-COLOR: #ffff00">
<i>approved:</i> approved
by the technical committee or an expert vetter (equals the CLDR 1.3
value of </span><i>
<span style="BACKGROUND-COLOR: #ffff00">false</span></i><span style="BACKGROUND-COLOR: #ffff00">,
or an absent <i>draft</i> attribute). This does not
mean that the data is guaranteed to be error-free;
this is the best judgment of the committee.</span>
</li>
<li style="margin-top: 0.5li; margin-bottom: 0.5li">
<i><span style="BACKGROUND-COLOR: #ffff00">
provisional</span></i><span style="BACKGROUND-COLOR: #ffff00">:
data entered and confirmed by at least one regular vetter. Implementations may choose to
accept the provisional data, especially if there is
no translated alternative.</span>
</li>
<li style="margin-top: 0.5li; margin-bottom: 0.5li">
<i><span style="BACKGROUND-COLOR: #ffff00">
unconfirmed</span></i><span style="BACKGROUND-COLOR: #ffff00">:
no confirmation available: entered by a guest without
</span><span style="background-color: #ffff00">
Technical Committee </span>
<span style="BACKGROUND-COLOR: #ffff00">
confirmation; or downgraded from provisional because
of disagreement (equals CLDR 1.3 value of </span><i>
<span style="BACKGROUND-COLOR: #ffff00">true</span></i><span style="BACKGROUND-COLOR: #ffff00">)</span>
</li>
</ul>
<p><span class="changedspan">Normally draft </span><span>
<span class="changedspan">attributes should only occur on "leaf" elements.
</span>For a more formal description of how elements are
inherited, and what their draft status is, </span>see <a href="#Inheritance_and_Validity">
Inheritance_and_Validity</a>.</p>
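<p>The way a <i>draft</i> value applies to subelements can be sketched as a tree walk. The Python below is illustrative only: the function name <code>draft_statuses</code> is hypothetical, and it maps the CLDR 1.3 values <i>true</i> and <i>false</i> to their current equivalents as described in the list above. The sample places <i>draft</i> on a non-leaf element purely to show the propagation.</p>

```python
import xml.etree.ElementTree as ET

# CLDR 1.3 values and their current equivalents.
LEGACY = {"true": "unconfirmed", "false": "approved"}


def draft_statuses(elem, inherited="approved", path=""):
    """Return (path, status) pairs; a subelement keeps the enclosing status unless it overrides it."""
    raw = elem.get("draft", inherited)
    status = LEGACY.get(raw, raw)
    here = path + "/" + elem.tag
    result = [(here, status)]
    for child in elem:
        result.extend(draft_statuses(child, status, here))
    return result


doc = ET.fromstring(
    '<months draft="provisional">'
    '<month type="9">Settembru</month>'
    '<month type="10" draft="approved">...</month>'
    '</months>'
)
for item in draft_statuses(doc):
    print(item)
```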
<p><... <a name="alt_attribute">alt</a>="<span class="changedspan"><i>descriptor</i></span>" ...></p>
<p><span class="changedspan">This attribute labels an alternative value for an element. The <i>
descriptor</i> indicates what kind of alternative it is, and takes one of the following forms:
</span></p>
<ul>
<li><span class="changedspan"><i>variantname</i> meaning that the value is a variant of the
normal value, and may be used in its place in certain circumstances. If a variant value is absent for a particular locale, the normal value is used. The variant mechanism should only be used when such a fallback is acceptable.</span></li>
<li><span class="changedspan"><span style="color: blue">proposed</span>, optionally followed by
a number, indicating that the value is a proposed replacement for an existing value.</span></li>
<li><span class="changedspan"><i>variantname</i><span style="color: blue">-proposed</span>,
optionally followed by a number, indicating that the value is a proposed replacement variant
value.</span></li>
</ul>
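<p>The three descriptor forms can be told apart with a small parser. The Python below is a sketch, not a normative grammar: the function name <code>parse_alt</code> is hypothetical, and the set of variant names is the one this section currently allows ("variant", "list", "email", "www", "secondary").</p>

```python
import re

VARIANT_NAMES = {"variant", "list", "email", "www", "secondary"}


def parse_alt(descriptor):
    """Split an alt descriptor into (variantname or None, is_proposed, number or None)."""
    match = re.fullmatch(r"(?:(?P<variant>[a-z]+)-)?proposed(?P<number>[0-9]*)", descriptor)
    if match:
        variant, proposed, number = match.group("variant"), True, match.group("number")
    else:
        variant, proposed, number = descriptor, False, ""
    if variant is not None and variant not in VARIANT_NAMES:
        raise ValueError("unknown variant name: " + variant)
    return variant, proposed, number or None


print(parse_alt("proposed2"))
print(parse_alt("variant-proposed"))
```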
<p><span class="changedspan">"<span style="color: blue">proposed</span>"</span> should only be
present if the draft <span class="changed">status is not "approved"</span>. It indicates that the data is proposed replacement data that has been
added provisionally until the differences between it and the other data can be vetted. For
example, suppose that the translation for September for some language is "Settembru", and a bug
report is filed that it should be "Settembro". The new data can be entered, but marked as <i>
alt<span class="changedspan">="proposed"</span></i><span class="changedspan">
</span>until it is vetted. </p>
<pre>...
<month type="9">Settembru</month>
<month type="9" draft="<span class="changedspan">unconfirmed</span>" alt="proposed">Settembro</month>
<month type="10">...</pre>
<p><span class="changedspan">Now assume another bug report comes in, saying that the correct form
is actually "Settembre". Another alternative can be added: </span></p>
<pre><span class="changedspan">...
<month type="9" draft="unconfirmed" alt="proposed2">Settembre</month>
...</span></pre>
<p><span class="changedspan">The allowable values for <i>variantname</i> at this time are "<span style="color: blue">variant</span>",
"<span style="color: blue">list</span>", "<span style="color: blue">email</span>", "<span style="color: blue">www</span>", and "<span style="color : blue">secondary</span>". This may be expanded in the future.</span></p>
<p><... validSubLocales="de_AT de_CH de_DE" ...></p>
<p><span>The attribute </span><i><span>validSubLocales</span></i><span> allows sublocales in a
given tree to be treated as though a file for them were present when there isn't one. It can be
applied to any element. It only has an effect for locales that inherit from the current file where
a file is missing, and the elements wouldn't otherwise be draft.</span></p>
<p>For a more complete description of how draft applies to data, see
<a href="#Inheritance_and_Validity">Inheritance_and_Validity</a>.</p>
<p class="element2"><... standard="<span style="color: blue">...</span>" ...></p>
<p class="element2"><span><i>Note: this attribute is deprecated. Instead, use a reference element
with the attribute standard="true". See Section 5.12 <a href="#references_element"><references>.</a></i></span></p>
<p>The value of this attribute is a list of strings representing standards: international,
national, organization, or vendor standards. The presence of this attribute indicates that the
data in this element is compliant with the indicated standards. Where possible, for uniqueness,
the string should be a URL that represents that standard. The strings are separated by commas;
leading or trailing spaces on each string are not significant. Examples:</p>
<p><code><collation standard="<span style="color: blue">MSA 200:2002</span>"><br>
...<br>
<dateFormatStyle standard="http://www.iso.ch/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=26780&amp;ICS1=1&amp;ICS2=140&amp;ICS3=30"></code></p>
<p><... <a name="references_attribute">references</a>="<span style="color: blue">...</span>" ...></p>
<p><span>The value of this attribute is a list of strings<span class="changedspan">, separated by spaces, each</span> representing a reference for the
information in the element, including standards that it may conform to. The best format is a
series of tokens, where each token corresponds to a reference element. See Section 5.12
<a href="#references_element"><references></a>. (In older versions of CLDR, the value of the
attribute was freeform text. That format is deprecated.)</span></p>
<p><i><span>Example:</span></i></p>
<p class="example"><span><territory type="UM" references="R1 R2">USAs yttre öar</territory></span></p>
<p><span>The reference element may be inherited. Thus<span class="changedspan">,</span> for example, <span class="removedspan">even if</span> R2 may be used in
sv_SE.xml even though it is not defined there, if it is defined in sv.xml.</span></p>
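<p>The inheritance of reference tokens can be sketched as follows. This is only an illustration, not part of the specification: the function names and the reference texts are hypothetical, and each locale file is modeled as a plain dictionary from reference token to definition, listed from the most specific locale up to root.</p>

```python
from typing import Dict, List, Optional


def resolve_reference(token: str, locale_chain: List[Dict[str, str]]) -> Optional[str]:
    """Return the first definition of `token`, searching child first, then parents."""
    for definitions in locale_chain:
        if token in definitions:
            return definitions[token]
    return None


def resolve_references(attribute_value: str,
                       locale_chain: List[Dict[str, str]]) -> Dict[str, Optional[str]]:
    """Split a space-separated references attribute and resolve every token."""
    return {token: resolve_reference(token, locale_chain)
            for token in attribute_value.split()}


# Hypothetical data: sv_SE.xml defines no references, sv.xml defines R1 and R2.
sv_SE: Dict[str, str] = {}
sv = {"R1": "hypothetical source one", "R2": "hypothetical source two"}
root: Dict[str, str] = {}

resolved = resolve_references("R1 R2", [sv_SE, sv, root])
```

<p>Here R2 resolves for sv_SE because the lookup continues into sv; a token defined nowhere in the chain resolves to nothing.</p>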
<hr width="50%">
<h3>5.3 <a name="<identity>"><identity></a></h3>
<p><span class="dtd"><!ELEMENT identity (alias | (version, generation, language, script?,
territory?, variant?, special*) ) ></span></p>
<p>The identity element contains information identifying the target locale for this data, and
general information about the version of this data.</p>
<p class="element2"><version number="<span>$Revision: 1.7 $</span>"></p>
<p>The version element provides, in an attribute, the version of this file. The contents of
the element can contain textual notes about the changes between this version and the last. For
example:</p>
<blockquote>
<pre><version number="<span style="color: blue">1.1</span>"><span style="color: blue">Various notes and changes in version 1.1</span></version></pre>
<p><span>This is not to be confused with the version attribute on the ldml element, which tracks
the dtd version.</span></p>
</blockquote>
<p class="element2"><generation date="<span>$Date: 2008/06/03 17:10:08 $</span>" /></p>
<p>The generation element contains the last modified date for the data.<span>
<span class="changedspan">This can be in two formats: ISO 8601 format, or CVS format (</span>illustrated
by the example above<span class="changedspan">)</span>.</span></p>
<p class="element2"><language type="<span style="color: blue">en</span>"/></p>
<p>The language code is the primary part of the specification of the locale id, with values as
described above.</p>
<p class="element2"><script type="<span style="color: blue">Latn</span>" /></p>
<p>The script field may be used in the identification of written languages, with values described
above.</p>
<p class="element2"><territory type="<span style="color: blue">US</span>"/></p>
<p>The territory code is a common part of the specification of the locale id, with values as
described above.</p>
<p class="element2"><variant type="<span class="attributeValue">NYNORSK</span>"/></p>
<p>The variant code is the tertiary part of the specification of the locale id, with values as
described above.</p>
<h3>5.4 <a name="<localeDisplayNames>"><localeDisplayNames></a></h3>
<p><span class="dtd"><!ELEMENT localeDisplayNames (alias | (languages?, scripts?, territories?,
variants?, keys?, types?, <span class="changedspan">measurementSystemNames?, </span>special*)) ></span></p>
<p>Display names for scripts, languages, countries, and variants in this locale are supplied by
this element. These supply localized names for these items for use in user-interfaces for
displaying lists of locales and scripts. Examples are given below. </p>
<p class="note"><span class="changedspan"><b>Note:</b> The "<span style="color: blue">en</span>"
locale may contain translated names for deprecated codes for debugging purposes. Translation of deprecated codes into other languages is discouraged.</span></p>
<p><span>Where present, the display names must be unique; that is, two distinct codes must not get
the same display name. </span><span class="changedspan">(There is one exception to this: in
timezones, where parsing results would give the same GMT offset, the standard and daylight display
names can be the same across different timezone IDs.)</span></p>
<p><span>Any translations should follow customary practice for the locale in question. For more
information, see [<a href="#DataFormats">Data Formats</a>].</span></p>
<p class="element2"><languages></p>
<p>This contains a list of elements that provide the user-translated names for language codes<span class="removedspan"> from
[<a href="#ISO639">ISO639</a>]</span>, as described in <i>
<a href="#Identifiers">Section 3, Identifiers</a></i>.</p>
<blockquote>
<pre><language type="<span style="color: blue">ab</span>"><span style="color: blue">Abkhazian</span></language>
<language type="<span style="color: blue">aa</span>"><span style="color: blue">Afar</span></language>
<language type="<span style="color: blue">af</span>"><span style="color: blue">Afrikaans</span></language>
<language type="<span style="color: blue">sq</span>"><span style="color: blue">Albanian</span></language></pre>
</blockquote>
<p><span>The type can actually be any locale ID as specified above. The set of locale IDs that are
translated is not fixed; it depends on the locale. For example, one language might translate the
following locale IDs directly, while another falls back on the normal composition.</span></p>
<table border="1" cellpadding="4" cellspacing="0">
<tr>
<th width="33%"><span>type</span></th>
<th width="33%"><span>translation</span></th>
<th width="34%"><span>composition</span></th>
</tr>
<tr>
<td width="33%"><span>nl_BE</span></td>
<td width="33%"><span>Flemish</span></td>
<td width="34%"><span>Dutch (Belgium)</span></td>
</tr>
<tr>
<td width="33%"><span>zh_Hans</span></td>
<td width="33%"><span>Simplified Chinese</span></td>
<td width="34%"><span>Chinese (Simplified Han)</span></td>
</tr>
<tr>
<td width="33%"><span>en_GB</span></td>
<td width="33%"><span>British English</span></td>
<td width="34%"><span>English (United Kingdom)</span></td>
</tr>
</table>
<p class="element2"><span>Thus when a complete locale ID is formed by composition, the longest
match in the language type is used, and the remaining fields (if any) added using composition.</span></p>
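<p>The longest-match rule can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, locale IDs are split on underscores only, and the "Name (field, field)" composition pattern is inferred from the table above rather than taken from locale data.</p>

```python
from typing import Dict


def display_name(locale_id: str,
                 language_names: Dict[str, str],
                 field_names: Dict[str, str]) -> str:
    """Compose a display name using the longest matching language type."""
    fields = locale_id.split("_")
    # Try the longest prefix of the locale ID first, then shorter ones.
    for end in range(len(fields), 0, -1):
        prefix = "_".join(fields[:end])
        if prefix in language_names:
            base = language_names[prefix]
            # Remaining fields are added by composition; unknown codes pass through.
            rest = [field_names.get(field, field) for field in fields[end:]]
            return base + " (" + ", ".join(rest) + ")" if rest else base
    return locale_id  # nothing matched; fall back to the raw ID


# Sample data taken from the table above, plus one extra territory name.
language_names = {"nl": "Dutch", "nl_BE": "Flemish", "en": "English",
                  "zh": "Chinese", "zh_Hans": "Simplified Chinese"}
field_names = {"BE": "Belgium", "GB": "United Kingdom",
               "Hans": "Simplified Han", "CN": "China"}
```

<p>With this data, nl_BE yields "Flemish" directly, en_GB falls back to "English (United Kingdom)" because no en_GB entry is present, and zh_Hans_CN yields "Simplified Chinese (China)".</p>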
<p class="element2"><scripts></p>
<p>This element can contain any number of script elements. Each script element provides the
localized name for a script <span class="changed">code</span>, as described in
<i><a href="#Identifiers">Section 3, Identifiers</a></i><span class="removedspan">, given by the value of the type attribute</span>.
<span class="removedspan">The script IDs can be
either the long or short script property values from
PropertyValueAliases.txt in the UCD. </span>(See
<a href="http://unicode.org/reports/tr24/">UAX #24: Script Names</a> [<a href="#Scripts">Scripts</a>]
for more information.) For example, in the language of this locale, the name for the Latin script
might be "Romana", and the name for the Cyrillic script "Kyrillica". That would be expressed with the
following.</p>
<blockquote>
<p><script type="<span style="color: blue">Latn</span>"><span style="color: blue">Romana</span></script><br>
<script type="<span style="color: blue">Cyrl</span>"><span style="color: blue">Kyrillica</span></script></p>
</blockquote>
<p class="element2"><territories></p>
<p>This contains a list of elements that provide the user-translated names for territory codes<span class="removedspan">
from [<a href="#ISO3166">ISO3166</a>]</span>, as described in <i>
<a href="#Identifiers">Section 3, Identifiers</a></i>.</p>
<blockquote>
<p><territory type="<span style="color: blue">AF</span>"><span style="color: blue">Afghanistan</span></territory><br>
<territory type="<span style="color: blue">AL</span>"><span style="color: blue">Albania</span></territory><br>
<territory type="<span style="color: blue">DZ</span>"><span style="color: blue">Algeria</span></territory><br>
<territory type="<span style="color: blue">AD</span>"><span style="color: blue">Andorra</span></territory><br>
<territory type="<span style="color: blue">AO</span>"><span style="color: blue">Angola</span></territory><br>
<territory type="<span style="color: blue">US</span>"><span style="color: blue">United States</span></territory></p>
</blockquote>
<p><span class="removedspan"><span>The territory code can also be any of the numeric UN M.49 region codes, excluding <strong>
Selected economic and other groupings. </strong>The data for the area codes is found at [<a href="#UNM49">UNM49</a>].</span></span></p>
<p class="element2"><variants></p>
<p>This contains a list of elements that provide the user-translated names for the <i>variant_code</i>
values described in <i><a href="#Identifiers">Section 3, Identifiers</a></i>.</p>
<blockquote>
<p><variant type="<span style="color: blue">nynorsk</span>"><span style="color: blue">Nynorsk</span></variant></p>
</blockquote>
<p class="element2"><keys></p>
<p>This contains a list of elements that provide the user-translated names for the <i>key</i>
values described in <i><a href="#Identifiers">Section 3, Identifiers</a></i>.</p>
<blockquote>
<p><key type="<span style="color: blue">collation</span>"><span style="color: blue">Sortierung</span></key></p>
</blockquote>
<p class="element2"><types></p>
<p>This contains a list of elements that provide the user-translated names for the <i>type</i>
values described in <i><a href="#Identifiers">Section 3, Identifiers</a></i>. Since the translation of an option name
may depend on the <i>key</i> it is used with, the latter is optionally supplied.</p>
<blockquote>
<p><type type="<span style="color: blue">phonebook</span>" key="<span style="color: blue">collation</span>"><span style="color: blue">Telefonbuch</span></type></p>
</blockquote>
<p class="element2"><span class="changedspan"><measurementSystemNames></span></p>
<p><span class="changedspan">This contains a list of elements that provide the user-translated
names for systems of measurement. The types currently supported are "<span style="color: blue">US</span>",
"<span style="color: blue">metric</span>", and "<span style="color: blue">UK</span>".</span></p>
<blockquote>
<p><span class="changedspan"><measurementSystemName type="<span style="color: blue">US</span>"><span style="color: blue">U.S.</span></measurementSystemName></span></p>
</blockquote>
<p class="note"><span class="changedspan"><b>Note:</b> In the future, we may need to add display
names for the particular measurement units (millimeter vs millimetre vs whatever the Greek,
Russian, etc are), and a message format for positioning those with respect to numbers. E.g.
"{number} {unitName}" in some languages, but "{unitName} {number}" in others.</span></p>
<h3>5.5 <a name="<layout>"><layout></a></h3>
<p><span class="dtd"><!ELEMENT layout ( alias | (orientation?, inList*, special*) ) ></span></p>
<p>This top-level element specifies general layout features. It currently has two possible
elements, orientation and inList (other than <special>, which is always permitted).</p>
<p class="element2"><orientation lines="<span style="color: blue">top-to-bottom</span>"
characters="<span style="color: blue">left-to-right</span>" /></p>
<p>The lines and characters attributes specify the default general ordering of lines <span>within
a page</span>, and characters within a line. The values are:</p>
<table>
<caption>Orientation Attributes</caption>
<tr>
<td rowspan="2">Vertical</td>
<td>top-to-bottom</td>
</tr>
<tr>
<td>bottom-to-top</td>
</tr>
<tr>
<td rowspan="2">Horizontal</td>
<td>left-to-right</td>
</tr>
<tr>
<td>right-to-left</td>
</tr>
</table>
<p><span>If the lines value is one of the vertical attributes, then the characters value must be
one of the horizontal attributes, and vice versa. For example, for English the lines are
top-to-bottom, and the characters are left-to-right. For Mongolian<span class="changedspan"> (in
the Mongolian Script)</span> the lines are right-to-left, and the characters are top-to-bottom.
</span>This does not override the ordering behavior of bidirectional text; it does, however,
supply the paragraph direction for that text (for more information, see
<a href="http://unicode.org/reports/tr9/">UAX #9: The Bidirectional Algorithm</a> [<a href="#BIDI">BIDI</a>]).</p>
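<p>The constraint between the two attributes can be checked mechanically. The following is a small sketch; the function and constant names are hypothetical.</p>

```python
VERTICAL = {"top-to-bottom", "bottom-to-top"}
HORIZONTAL = {"left-to-right", "right-to-left"}


def is_valid_orientation(lines: str, characters: str) -> bool:
    """True when one attribute is vertical and the other is horizontal."""
    return ((lines in VERTICAL and characters in HORIZONTAL)
            or (lines in HORIZONTAL and characters in VERTICAL))
```

<p>For example, lines="top-to-bottom" with characters="left-to-right" passes the check, while lines="top-to-bottom" with characters="bottom-to-top" does not.</p>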
<p><span><inList></span></p>
<p><span>The following element controls whether display names (language, territory, etc) are
titlecased in GUI menu lists and the like. It is only used in languages where the normal display
is lowercase, but titlecase is used in lists. <span class="changedspan">There
are two options:</span></span></p>
<pre><span><inList casing="titlecase-words"></span></pre>
<pre><span class="changedspan"><span><inList casing="titlecase-</span>firstword<span>"></span></span></pre>
<p><span class="changedspan">In both cases, the titlecase operation is the
default titlecase function defined by Chapter 3 of <i><span>[<a href="#Unicode">Unicode</a>]</span></i>.
In the second case, only the first word (using the word boundaries for that
locale) will be titlecased. </span>The results<span> can be fine-tuned by using alt="list" on any element where titlecasing as defined by
the Unicode Standard will produce the wrong value. For example, suppose that "turc de Crimée" is a
value, and the titlecase should be "Turc de Crimée". Then that can be expressed using the
alt="list" value.</span></p>
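<p>The difference between the two casing options can be sketched as follows. This is an approximation only: it assumes that words are delimited by whitespace (rather than by the word boundaries for the locale) and uses Python's str.capitalize() as a stand-in for the default Unicode titlecase function. The function name is hypothetical.</p>

```python
import re


def apply_in_list_casing(name: str, casing: str) -> str:
    """Approximate the two inList casing options on a display name."""
    def titlecase_word(match):
        # capitalize() titlecases the first character and lowercases the rest.
        return match.group(0).capitalize()

    if casing == "titlecase-words":
        return re.sub(r"\S+", titlecase_word, name)
    if casing == "titlecase-firstword":
        # Only the first word is changed; the rest of the string is left as-is.
        return re.sub(r"\S+", titlecase_word, name, count=1)
    raise ValueError("unknown casing value: " + casing)


all_words = apply_in_list_casing("turc de Crimée", "titlecase-words")
first_word = apply_in_list_casing("turc de Crimée", "titlecase-firstword")
```

<p>Under these assumptions, titlecase-words turns "turc de Crimée" into "Turc De Crimée", which is the kind of result that an alt="list" value would be used to correct.</p>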
<h3>5.6 <a name="<characters>"><characters></a></h3>
<p><span class="dtd"><!ELEMENT characters (alias | (exemplarCharacters*, mapping*, special*)) ></span></p>
<p><span>The <characters> <span>element provides optional information about characters that are in
common use in the locale, and information that can be helpful in picking resources or data
appropriate for the locale, such as when choosing among character encodings that are typically
used to transmit data in the language of the locale. </span></span>It typically only occurs in a
language locale, not in a language/territory locale.</p>
<p class="element2"><exemplarCharacters><span style="color: blue">[a-zåæø]</span></exemplarCharacters></p>
<p><span>The exemplar character set contains the commonly used letters for a given modern form of
a language, which can be used for testing and for determining the appropriate repertoire of letters for
</span><span>charset</span><span> conversion or collation. </span><span>("Letter<span class="changedspan">"</span> is interpreted
broadly, as anything having the property Alphabetic in the [<a href="#UCD">UCD</a>],
which <span class="removedspan">in the Unicode General Category, and </span>also includes syllabaries and ideographs.) It is
not a complete set of letters used for a language, nor should it be considered to apply to
multiple languages in a particular country. Punctuation and other symbols should not be included.</span></p>
<p><span class="changed">There are two sets: the <i>main</i> set should
contain the minimal set required for users of the language, while the <i>
auxiliary</i> exemplar set is designed to encompass additional characters:
those non-native or historical characters that would customarily occur in
common publications, dictionaries, and so on. So, for example, if Irish
newspapers and magazines would commonly have Danish names using å, then
it would be appropriate to include å in the auxiliary exemplar
characters, but not in the main exemplar set. Major style guidelines are
good references for the auxiliary set. Thus for English we have [a-z] in the
main set, and [á à ă â å ä ā æ ç é è ĕ ê ë ē í ì ĭ î ï ī ñ ó ò ŏ ô ö ø ō œ ß
ú ù ŭ û ü ū ÿ] in the auxiliary set.</span></p>
<p><span>In general, the test to see whether or not a letter belongs in the
<span class="changed">main</span> set is based on whether it is
acceptable in that language to always use spellings that avoid that character. For example, the
exemplar character set for en (English) is the set [a-z]. This set does not contain the accented
letters that are sometimes seen in words like "résumé" or "naïve", because it is acceptable in
common practice to spell those words without the accents. The exemplar character set for </span>
<span>fr</span><span> (French), on the other hand, must contain those characters: [a-z </span>
<span>é</span><span> </span><span>è</span><span> ù </span><span>ç</span><span> </span><span>à</span><span>
</span><span>â</span><span> </span><span>ê</span><span> î ô û æ œ </span><span>ë</span><span> ï ÿ].
</span><span class="changed">The main set typically includes those letters
commonly taught in schools as the "alphabet".</span></p>
<p><span>The list of characters is in the </span><a href="#Unicode_Sets"><span>Unicode Set</span></a><span>
format, which allows </span><span>boolean</span><span> combinations of sets of letters, including
those specified by Unicode properties.</span></p>
<p><span>Sequences of characters that act like a single letter in the language — especially in
collation — are included within braces, such as [a-z </span><span>á</span><span> </span><span>é</span><span>
í ó ú ö ü ő ű {</span><span>cs</span><span>} {</span><span>dz</span><span>} {</span><span>dzs</span><span>}
{</span><span>gy</span><span>} ...]. The characters should be in normalized form (NFC). Where
combining marks are used generatively, and apply to a large number of base characters (such as in
Indic scripts), the individual combining marks should be included. Where they are used with only a
few base characters, the specific combinations should be included. Wherever there is not a </span>
<span>precomposed</span><span> character (e.g. single </span><span>codepoint</span><span>) for a
given combination, that must be included within braces. For example, to include sequences from the
</span><a href="http://unicode.org/standard/where/"><span>Where is my Character?</span></a><span>
page on the Unicode site, one would write: [{ch} {tʰ} {x̣} {ƛ̓} {ą́} {i̇́} {ト゚}], but for French
one would just write [a-z </span><span>é</span><span> </span><span>è</span><span> ù ...]. When in
doubt use braces, since it does no harm to include them around single code points: e.g. [a-z {</span><span>é</span><span>}
{</span><span>è</span><span>} {ù} ...].</span></p>
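<p>The handling of braces can be illustrated with a deliberately simplified reader for exemplar sets. It is a sketch, not a Unicode Set parser: it assumes a single bracketed set containing only single characters, simple ranges such as a-z, and braced sequences, and it ignores properties, escapes, and nested sets. The function name is hypothetical.</p>

```python
import unicodedata
from typing import Set


def parse_exemplars(pattern: str) -> Set[str]:
    """Read a simple exemplar set into a set of NFC-normalized strings."""
    text = unicodedata.normalize("NFC", pattern.strip())
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected a set enclosed in [ ]")
    body = text[1:-1]
    items: Set[str] = set()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch.isspace():
            i += 1
        elif ch == "{":
            # A braced sequence counts as one item, whatever its length.
            close = body.index("}", i)
            items.add(body[i + 1:close])
            i = close + 1
        elif i + 2 < len(body) and body[i + 1] == "-":
            # A range such as a-z expands to every code point in between.
            for code_point in range(ord(ch), ord(body[i + 2]) + 1):
                items.add(chr(code_point))
            i += 3
        else:
            items.add(ch)
            i += 1
    return items


sample = parse_exemplars("[a-c é {é} {ch} {dzs}]")
```

<p>Because é and {é} produce the same item, the sample set contains é only once, which reflects the point that braces around a single code point do no harm.</p>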
<p><span class="changed">If the letter 'z' were only ever used in the
combination 'tz', then we might have [a-y {tz}] in the main set. (The
language would probably have plain 'z' in the auxiliary set, for use in
foreign words.) If combining characters can be used productively in
combination with a large number of others (such as say Indic matras), then
they are not listed in all the possible combinations, but separately, such
as: </span></p>
<p><span class="changed">[ ॐ ०-९ ऄ-ऋ ॠ ऌ ॡ ऍ-क क़ ख ख़ ग ग़ घ-ज ज़ झ-ड ड़ ढ ढ़
ण-फ फ़ ब-य य़ र-ह ़ ँ-ः ॑-॔ ऽ ् ॽ ा-ॄ ॢ ॣ ॅ-ौ] </span></p>
<p><span>The exemplar character set for Han characters is composed somewhat differently. It is
even harder to draw a clear line for Han characters, since usage is more like a frequency curve
that slowly trails off to the right in terms of decreasing frequency. So for this case, the
exemplar characters simply contain a set of reasonably frequent characters for the language.</span></p>
<p><span class="removedspan"><span>The letters do not necessarily form a complete set (especially for languages using large
character sets, such as CJK</span></span><span><span class="removedspan">). Nor does the list necessarily include
letters that are used in common foreign words used in that language. </span>The ordering of the
characters in the set is irrelevant<span class="changed">, </span> </span>
<span class="changed">but for readability in the XML file the characters
should be in sorted order according to the locale's conventions</span><span>. The set <span class="changedspan">should only contain lower
case characters (except for the </span>special case of Turkish <span>and similar languages</span>,
where the dotted capital I should be included<span class="changedspan">); the uppercase letters
are to be mechanically added when the set is used</span>. </span><span>For more information, see [<a href="#DataFormats">Data
Formats</a>]<span> and the discussion of Special Casing in the Unicode Character Database</span>.</span></p>
<p><span class="removedspan">There can be more than two exemplarCharacters elements, with the second having the type
"auxiliary". This element can be used for additional characters that are used in common foreign
words, dictionaries, etc. used in the locale.</span></p>
<pre><span class="removedspan"><characters>
<exemplarCharacters>[a-zñç]</exemplarCharacters>
<exemplarCharacters type="auxiliary">[ä ö ü ß]</exemplarCharacters>
</characters></span></pre>
<h3><span class="changed">Restrictions</span></h3>
<ol>
<li><span class="changed">The sets are normally restricted to those
letters with a specific
<a href="http://unicode.org/Public/UNIDATA/Scripts.txt">Script </a>
character property (that is, not the values Common or Inherited) or
required
<a href="http://unicode.org/Public/UNIDATA/DerivedCoreProperties.txt">
Default_Ignorable_Code_Point</a> characters (such as a non-joiner), or
combining marks, or the
<a href="http://www.unicode.org/Public/UNIDATA/auxiliary/WordBreakProperty.txt">
Word_Break</a> properties <a name="Katakana">Katakana</a>,
<a name="ALetter">ALetter</a>, or <a name="MidLetter">MidLetter</a>.</span></li>
<li><span class="changed">The auxiliary set should not overlap with the
main set. There is one exception to this: Hangul Syllables and CJK
Ideographs can overlap between the sets.</span></li>
<li><span class="changed">Any
<a href="http://unicode.org/Public/UNIDATA/DerivedCoreProperties.txt">
Default_Ignorable_Code_Point</a>s should be in the auxiliary set.</span></li>
</ol>
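<p>The second restriction lends itself to a mechanical check. The sketch below assumes that the two sets have already been expanded into Python sets of strings, and it recognizes Hangul Syllables and CJK Ideographs by their character names, which is an approximation chosen for brevity. The function names are hypothetical.</p>

```python
import unicodedata
from typing import Set


def may_overlap(item: str) -> bool:
    """True for single characters that are allowed in both sets."""
    if len(item) != 1:
        return False
    name = unicodedata.name(item, "")
    return name.startswith(("HANGUL SYLLABLE", "CJK UNIFIED IDEOGRAPH"))


def disallowed_overlap(main: Set[str], auxiliary: Set[str]) -> Set[str]:
    """Return the items that appear in both sets without being exempt."""
    return {item for item in main & auxiliary if not may_overlap(item)}


problems = disallowed_overlap({"a", "b", "中", "가"}, {"b", "中", "가", "å"})
```

<p>In this example only "b" is reported; the ideograph and the Hangul syllable are permitted to appear in both sets.</p>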
<p class="element2"><mapping registry="<span style="color: blue">iana</span>" type="<span style="color: blue"><span class="changedspan">iso-2022-jp utf-8</span></span>" <span class="changedspan">alt="<span style="color : blue">email</span>"</span> /></p>
<p><span class="changedspan">The mapping element describes character conversion mapping tables that are commonly used to
encode data in the language of this locale for a particular purpose. Each encoding is identified
by a name from the specified registry. If more than one encoding is used for a particular purpose,
the encodings are listed in the type attribute in order, from most preferred to least. An alt
tag is used to indicate the purpose ("email" or "www" being the most frequent); if it is absent, then the encoding(s)
may be used for all purposes not explicitly specified.</span></p>
<p><span class="changedspan">Each locale may have at most one mapping element tagged with a particular purpose, and at most
one general-purpose mapping element. Inheritance is on an element basis; an element in a sub-locale
overrides an inherited element with the same purpose.</span></p>
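<p>One way to read these rules is sketched below. It assumes that each locale's mapping elements are held in a dictionary keyed by purpose (with None for the general-purpose element), that the chain is listed from the most specific locale up to root, and that a purpose-specific mapping found anywhere in the chain takes precedence over a general-purpose one. The sample data and function names are hypothetical.</p>

```python
from typing import Dict, List, Optional

Mappings = Dict[Optional[str], str]  # purpose (None = general) -> type attribute


def resolve_mappings(locale_chain: List[Mappings]) -> Mappings:
    """Merge mappings so that sub-locales override inherited elements."""
    resolved: Mappings = {}
    for mappings in reversed(locale_chain):  # root first, most specific last
        resolved.update(mappings)
    return resolved


def encodings_for(purpose: str, locale_chain: List[Mappings]) -> List[str]:
    """Return encodings for a purpose, most preferred first."""
    resolved = resolve_mappings(locale_chain)
    type_value = resolved.get(purpose, resolved.get(None, ""))
    return type_value.split()


# Hypothetical chain: ja_JP defines nothing, ja defines an email mapping,
# and root supplies a general-purpose mapping.
chain: List[Mappings] = [{}, {"email": "iso-2022-jp utf-8"}, {None: "utf-8"}]
```

<p>With this chain, a request for "email" yields iso-2022-jp followed by utf-8, while a request for "www" falls back to the general-purpose utf-8.</p>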
<p><span class="removedspan">The registry indicates the source of the encoding. </span>Currently the only registry that can be used
is "iana", which specifies use of an
<a href="http://www.iana.org/assignments/character-sets">IANA name</a>. Note: while IANA
names are not precise for conversion (see <a href="http://unicode.org/reports/tr22/">UTR #22:
Character Mapping Tables</a> [<a href="#CharMapML">CharMapML</a>]), they are sufficient for this
purpose.</p>
<h3>5.7 <a name="<delimiters>"><delimiters></a></h3>
<p><span class="dtd"><!ELEMENT delimiters (alias | (quotationStart*, quotationEnd*,
alternateQuotationStart*, alternateQuotationEnd*, special*)) ></span></p>
<p>The delimiters supply common delimiters for bracketing quotations. The quotation marks are used
with simple quoted text, such as:</p>
<blockquote>
<p>He said, “Don’t be absurd!”</p>
</blockquote>
<p><span class="changedspan">When quotations are nested, the quotation marks and alternate marks
are used in an alternating fashion:</span></p>
<blockquote>
<p>He said, “Remember what the Mad Hatter said: ‘Not the same thing a bit! Why you might just as
well say that “I see what I eat” is the same thing as “I eat what I see”!’”</p>
</blockquote>
<p><code><quotationStart></code><span style="color: blue">“</span><code></quotationStart></code><br>
<code><quotationEnd></code><span style="color: blue">”</span><code></quotationEnd></code><br>
<code><alternateQuotationStart></code><span style="color: blue">‘</span><code></alternateQuotationStart></code><br>
<code><alternateQuotationEnd></code><span style="color: blue">’</span><code></alternateQuotationEnd></code></p>
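<p>The alternation between the two pairs of marks can be sketched as a function of nesting depth. The function name and the dictionary layout are hypothetical; the delimiter values are the ones shown above.</p>

```python
from typing import Dict

DELIMITERS = {
    "quotationStart": "\u201c",           # left double quotation mark
    "quotationEnd": "\u201d",             # right double quotation mark
    "alternateQuotationStart": "\u2018",  # left single quotation mark
    "alternateQuotationEnd": "\u2019",    # right single quotation mark
}


def quote(text: str, depth: int, delimiters: Dict[str, str]) -> str:
    """Wrap text in the marks for its nesting depth (0 = outermost)."""
    if depth % 2 == 0:
        start, end = delimiters["quotationStart"], delimiters["quotationEnd"]
    else:
        start = delimiters["alternateQuotationStart"]
        end = delimiters["alternateQuotationEnd"]
    return start + text + end


inner = quote("I see what I eat", 2, DELIMITERS)
middle = quote("Not the same thing a bit! " + inner, 1, DELIMITERS)
outer = quote("Remember: " + middle, 0, DELIMITERS)
```

<p>Even depths use the primary marks and odd depths use the alternate marks, so the innermost quotation at depth 2 returns to the primary pair.</p>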
<h3>5.8 <a name="<measurement>"><measurement></a></h3>
<p><span class="dtd"><!ELEMENT measurement (alias | (measurementSystem?, paperSize?, special*)) ></span></p>
<p><span class="changedspan">The measurement element is deprecated in the main LDML files, because
the data is more appropriately organized as connected to territories, not to linguistic data.
Instead, the similar element in the
supplemental data file should be used.</span></p>
<pre><span class="removedspan"><measurementSystem type="<span style="color: blue">US</span>"/></span></pre>
<p><span class="removedspan">The measurement system is the normal measurement system in common
everyday use (except for date/time). The values are "metric" (= ISO 1000), "US", or "UK"; others
may be added over time. <span>The "US" value indicates the customary system of measurement with
feet, inches, pints, quarts, etc. as used in the United States. The "UK" value indicates the
customary system of measurement with feet, inches, pints, quarts, etc. as used in the United
Kingdom. It is also called the Imperial system: the pint, quart, etc. are different sizes than in
"US".</span></span></p>
<p class="note"><span class="removedspan"><b>Note:</b> In the future, we may need to add display
names for the particular measurement units (millimeter vs millimetre vs whatever the Greek,
Russian, etc are), and a message format for positioning those with respect to numbers. E.g. "{number}
{unitName}" in some languages, but "{unitName} {number}" in others.</span></p>
<p class="note"><span class="removedspan"><b>Note:</b><i> Numbers indicating measurements should
<b>never</b> be interchanged without known dimensions. You never want the number 3.51 interpreted
as 3.51 feet by one user and 3.51 meters by another. However, this element can be used to convert
dimensioned numbers into the user's desired notation: so the value of 3.51 meters can be formatted
as 11.52 feet on a particular user's system.</i></span></p>
<p><span class="removedspan"><span><paperSize></span></span></p>
<p><span class="removedspan">The paperSize element gives <span>the height and width of paper used
for </span>normal business letters. <span>The units for the numbers are always in millimeters. For
example, the paperSize in the root (the default) is A4:</span></span></p>
<pre><span class="removedspan"><span><paperSize>
<height><span style="color: blue">297</span></height>
<width><span style="color: blue">210</span></width>
</paperSize></span></span></pre>
<p><span class="removedspan"><span>An example of locale data that differs from this would be
en-US:</span></span></p>
<pre><span class="removedspan"><paperSize>
<height><span style="color: blue">279</span></height>
<width><span style="color: blue">216</span></width>
</paperSize></span></pre>
<h3>5.9 <a name="<dates>"><dates></a></h3>
<p><span class="dtd"><!ELEMENT dates (alias | (localizedPatternChars*, calendars?, timeZoneNames?,
special*)) ></span></p>
<p>This top-level element contains information regarding the format and parsing of dates and
times. The <span class="changed">data format</span> is based on the Java/ICU format. Most of these are fairly self-explanatory, except
<span>the</span><i><span> week </span></i><span>elements</span><i><span>,</span></i><span> </span>
<i><span>localizedPatternChars</span></i><span>, and the meaning of the pattern characters</span>.
For information on this, and more information on other elements and attributes, <span>see </span>
<a href="#Date_Format_Patterns"><span>Appendix F: Date Format Patterns</span></a><span>.</span></p>
<h4>5.9.1 <a name="<calendars>"><calendars></a></h4>
<p><span class="dtd"><!ELEMENT calendar (alias | (months?, monthNames?, monthAbbr?, days?,
dayNames?, dayAbbr?, <span class="changedspan">quarters?, </span>week?, am?, pm?, eras?,
dateFormats?, timeFormats?, dateTimeFormats?, fields*, special*))></span></p>
<p>This element contains multiple <calendar> elements, each of which specifies the fields used for
formatting and parsing dates and times according to the given calendar. The month
<span class="changedspan">and quarter </span>names are identified numerically, starting at 1. The
day <span class="changedspan">(of the week)</span> names are identified with short strings, since there is no universally-accepted numeric
designation.</p>
<p>Many calendars will only differ from the Gregorian Calendar in the year and era values. For
example, the Japanese calendar will have many more eras (one for each Emperor), and the years will
be numbered within that era. All calendar <span>data inherits</span> from the Gregorian calendar
in the same locale data <span>(if not present in the chain up to root)</span>, so only the
differing data will be present.<span> See <a href="#Multiple_Inheritance">Multiple Inheritance</a>.</span></p>
<p><span class="dtd"><!ELEMENT months ( alias | (default?, monthContext*, special*)) ><br>
<!ELEMENT monthContext ( alias | (default?, monthWidth*, special*)) ><br>
<!ELEMENT monthWidth ( alias | (month*, special*)) ></span></p>
<p><span class="dtd"><!ELEMENT days ( alias | (default?, dayContext*, special*)) ><br>
<!ELEMENT dayContext ( alias | (default?, dayWidth*, special*)) ><br>
<!ELEMENT dayWidth ( alias | (day*, special*)) ></span></p>
<p><span class="dtd"><span class="changedspan"><!ELEMENT quarters ( alias | (default?,
quarterContext*, special*)) ><br>
<!ELEMENT quarterContext ( alias | (default?, quarterWidth*, special*)) ><br>
<!ELEMENT quarterWidth ( alias | (quarter*, special*)) ></span></span></p>
<p><span class="changedspan">Month, day, and quarter</span> names may vary along two axes: the
width and the context. The context is either <i>format</i> (the default), the form used within a
date format string (such as "Saturday, November 12<sup>th</sup>"), or <i>stand-alone</i>, the form
used independently, such as in Calendar headers. The width can be <i>wide</i> (the default), <i>
abbreviated</i>, or <i>narrow</i>. The format values must be distinct; that is, "S" could not be
used both for Saturday and for Sunday. The same is not true for stand-alone values; they might
only be distinguished by context, especially in the narrow format. That format is typically used
in calendar headers; it must be the shortest possible width, no more than one character<span> (or
grapheme cluster) in stand-alone values,<span class="changedspan"> and the shortest possible
widths (in terms of grapheme clusters) in format values.</span></span></p>
<p>If the stand-alone form does not exist (in the chain up to root), then it inherits from the
format form. See <a href="#Multiple_Inheritance">Multiple Inheritance</a>.<span> If the narrow
format does not exist, it inherits from the abbreviated form; if the abbreviated format does not
exist, it inherits from the wide format.</span></p>
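<p>The fallback between contexts and widths can be sketched as a lookup. This illustration covers a single locale only and ignores the chain up to root. The order in which the two fallbacks are combined (same width in the format context first, then the next wider width) is an assumption, since the text above does not specify it. The function name and data layout are hypothetical.</p>

```python
from typing import Dict, Optional, Tuple

Names = Dict[Tuple[str, str], Dict[str, str]]  # (context, width) -> type -> name

WIDTH_FALLBACK = {"narrow": "abbreviated", "abbreviated": "wide"}


def lookup_name(names: Names, context: str, width: str, key: str) -> Optional[str]:
    """Find a name, falling back from stand-alone to format and to wider widths."""
    contexts = ["format"] if context == "format" else [context, "format"]
    current: Optional[str] = width
    while current is not None:
        for candidate in contexts:
            value = names.get((candidate, current), {}).get(key)
            if value is not None:
                return value
        current = WIDTH_FALLBACK.get(current)  # None once past "wide"
    return None


# Sample month data for type "1", using illustrative names.
month_names: Names = {
    ("format", "wide"): {"1": "January"},
    ("format", "abbreviated"): {"1": "Jan"},
    ("stand-alone", "wide"): {"1": "Januaria"},
    ("stand-alone", "narrow"): {"1": "J"},
}
```

<p>With this data, a stand-alone abbreviated lookup returns "Jan" from the format context, and a format narrow lookup also returns "Jan" because no narrow format data is present.</p>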
<p>The older monthNames, dayNames, and monthAbbr, dayAbbr are maintained for backwards
compatibility. They are equivalent to using the months or days element with the context type="<span style="color: blue">format</span>"
and the width type="<span style="color: blue">wide</span>" (for ...Names) and type="<span style="color: blue">abbreviated</span>"
(for ...Abbr), respectively. <span class="changedspan">The minDays, firstDay,
weekendStart, and weekendEnd elements are also deprecated; there are new
elements in supplemental data for this data.</span></p>
<p class="example">Example:</p>
<pre> <calendar type="<span style="color: blue">gregorian</span>">
<span> <months>
<default type="<span style="color: blue">format</span>"/>
<monthContext type="<span style="color: blue">format</span>">
<default type="<span style="color: blue">wide</span>"/>
<monthWidth type="<span style="color: blue">wide</span>">
<month type="<span style="color: blue">1</span>"><span style="color: blue">January</span></month>
<month type="<span style="color: blue">2</span>"><span style="color: blue">February</span></month>
...
<month type="<span style="color: blue">11</span>"><span style="color: blue">November</span></month>
<month type="<span style="color: blue">12</span>"><span style="color: blue">December</span></month>
</monthWidth>
<monthWidth type="<span style="color: blue">abbreviated</span>">
<month type="<span style="color: blue">1</span>"><span style="color: blue">Jan</span></month>
<month type="<span style="color: blue">2</span>"><span style="color: blue">Feb</span></month>
...
<month type="<span style="color: blue">11</span>"><span style="color: blue">Nov</span></month>
<month type="<span style="color: blue">12</span>"><span style="color: blue">Dec</span></month>
</monthWidth>
</monthContext>
<monthContext type="<span style="color: blue">stand-alone</span>">
<default type="<span style="color: blue">wide</span>"/>
<monthWidth type="<span style="color: blue">wide</span>">
<month type="<span style="color: blue">1</span>"><span style="color: blue">Januaria</span></month>
<month type="<span style="color: blue">2</span>"><span style="color: blue">Februaria</span></month>
...
<month type="<span style="color: blue">11</span>"><span style="color: blue">Novembria</span></month>
<month type="<span style="color: blue">12</span>"><span style="color: blue">Decembria</span></month>
</monthWidth>
<monthWidth type="<span style="color: blue">narrow</span>">
<month type="<span style="color: blue">1</span>"><span style="color: blue">J</span></month>
<month type="<span style="color: blue">2</span>"><span style="color: blue">F</span></month>
...
<month type="<span style="color: blue">11</span>"><span style="color: blue">N</span></month>
<month type="<span style="color: blue">12</span>"><span style="color: blue">D</span></month>
</monthWidth>
</monthContext>
</months>
<days>
<default type="<span style="color: blue">format</span>"/>
<dayContext type="<span style="color: blue">format</span>">
<default type="<span style="color: blue">wide</span>"/>
<dayWidth type="<span style="color: blue">wide</span>">
<day type="<span style="color: blue">sun</span>"><span style="color: blue">Sunday</span></day>
<day type="<span style="color: blue">mon</span>"><span style="color: blue">Monday</span></day>
...
<day type="<span style="color: blue">fri</span>"><span style="color: blue">Friday</span></day>
<day type="<span style="color: blue">sat</span>"><span style="color: blue">Saturday</span></day>
</dayWidth>
<dayWidth type="<span style="color: blue">abbreviated</span>">
<day type="<span style="color: blue">sun</span>"><span style="color: blue">Sun</span></day>
<day type="<span style="color: blue">mon</span>"><span style="color: blue">Mon</span></day>
...
<day type="<span style="color: blue">fri</span>"><span style="color: blue">Fri</span></day>
<day type="<span style="color: blue">sat</span>"><span style="color: blue">Sat</span></day>
</dayWidth>
<dayWidth type="<span style="color: blue">narrow</span>">
<day type="<span style="color: blue">sun</span>"><span style="color: blue">Su</span></day>
<day type="<span style="color: blue">mon</span>"><span style="color: blue">M</span></day>
...
<day type="<span style="color: blue">fri</span>"><span style="color: blue">F</span></day>
<day type="<span style="color: blue">sat</span>"><span style="color: blue">Sa</span></day>
</dayWidth>
</dayContext>
<dayContext type="<span style="color: blue">stand-alone</span>">
<dayWidth type="<span style="color: blue">narrow</span>">
<day type="<span style="color: blue">sun</span>"><span style="color: blue">S</span></day>
<day type="<span style="color: blue">mon</span>"><span style="color: blue">M</span></day>
...
<day type="<span style="color: blue">fri</span>"><span style="color: blue">F</span></day>
<day type="<span style="color: blue">sat</span>"><span style="color: blue">S</span></day>
</dayWidth>
</dayContext></span>
</days>
<span class="changedspan"> <quarters>
<default type="<span style="color: blue">format</span>"/>
<quarterContext type="<span style="color: blue">format</span>">
<default type="<span style="color: blue">abbreviated</span>"/>
<quarterWidth type="<span style="color: blue">abbreviated</span>">
<quarter type="<span style="color: blue">1</span>"><span style="color: blue">Q1</span></quarter>
<quarter type="<span style="color: blue">2</span>"><span style="color: blue">Q2</span></quarter>
<quarter type="<span style="color: blue">3</span>"><span style="color: blue">Q3</span></quarter>
<quarter type="<span style="color: blue">4</span>"><span style="color: blue">Q4</span></quarter>
</quarterWidth>
<quarterWidth type="<span style="color: blue">wide</span>">
<quarter type="<span style="color: blue">1</span>"><span style="color: blue">1st quarter</span></quarter>
<quarter type="<span style="color: blue">2</span>"><span style="color: blue">2nd quarter</span></quarter>
<quarter type="<span style="color: blue">3</span>"><span style="color: blue">3rd quarter</span></quarter>
<quarter type="<span style="color: blue">4</span>"><span style="color: blue">4th quarter</span></quarter>
</quarterWidth>
</quarterContext>
</quarters>
</span>
<span class="removedspan"> <week>
<minDays count="<span style="color: blue">1</span>"/>
<firstDay day="<span style="color: blue">sun</span>"/>
<weekendStart day="<span style="color: blue">fri</span>" time="<span style="color: blue">18:00</span>"/>
<weekendEnd day="<span style="color: blue">sun</span>" time="<span style="color: blue">18:00</span>"/>
</week>
</span>
<am><span style="color: blue">AM</span></am>
<pm><span style="color: blue">PM</span></pm>
<eras>
<eraAbbr>
<era type="<span style="color: blue">0</span>"><span style="color: blue">BC</span></era>
<era type="<span style="color: blue">1</span>"><span style="color: blue">AD</span></era>
</eraAbbr>
<eraName<span class="changedspan">s</span>>
<era type="<span style="color: blue">0</span>"><span style="color: blue">Before Christ</span></era>
<era type="<span style="color: blue">1</span>"><span style="color: blue">Anno Domini</span></era>
</eraName<span class="changedspan">s</span>>
<span class="changedspan"><eraNarrow>
<era type="<span style="color: blue">0</span>"><span style="color: blue">B</span></era>
<era type="<span style="color: blue">1</span>"><span style="color: blue">A</span></era>
</eraNarrow></span>
</eras></pre>
<p><span><a name="dateFormats"><dateFormats></a></span></p>
<p><span class="dtd"><!ELEMENT dateFormats (alias | (default?, dateFormatLength*, special*)) ><br>
<!ELEMENT dateFormatLength (alias | (default?, dateFormat*, special*)) ><br>
<!ELEMENT dateFormat (alias | (pattern*, displayName?, special*)) ></span></p>
<p><span>Date formats have the following form:</span></p>
<pre> <dateFormats>
<default type="<span style="color: blue">medium</span>"/>
<dateFormatLength type="<span style="color: blue">full</span>">
<dateFormat>
<pattern><span style="color: blue">EEEE, MMMM d, yyyy</span></pattern>
</dateFormat>
</dateFormatLength>
<dateFormatLength type="<span style="color: blue">medium</span>">
<default type="<span style="color: blue">DateFormatsKey2</span>"/>
<dateFormat type="<span style="color: blue">DateFormatsKey2</span>">
<pattern><span style="color: blue">MMM d, yyyy</span></pattern>
</dateFormat>
<dateFormat type="<span style="color: blue">DateFormatsKey3</span>">
<pattern><span style="color: blue">MMM dd, yyyy</span></pattern>
</dateFormat>
</dateFormatLength>
</dateFormats></pre>
<p><span><a name="timeFormats"><timeFormats></a></span></p>
<p><span class="dtd"><!ELEMENT timeFormats (alias | (default?, timeFormatLength*, special*)) ><br>
<!ELEMENT timeFormatLength (alias | (default?, timeFormat*, special*)) ><br>
<span style="background-color: #CCCCFF"><!ELEMENT timeFormat (alias | (pattern*, displayName?,
special*)) ></span></span></p>
<p><span>Time formats have the following form:</span></p>
<pre> <timeFormats>
<default type="<span style="color: blue">medium</span>"/>
<timeFormatLength type="<span style="color: blue">full</span>">
<timeFormat>
<displayName><span style="color: blue">DIN 5008 (EN 28601)</span></displayName>
<pattern><span style="color: blue">h:mm:ss a z</span></pattern>
</timeFormat>
</timeFormatLength>
<timeFormatLength type="<span style="color: blue">medium</span>">
<timeFormat>
<pattern><span style="color: blue">h:mm:ss a</span></pattern>
</timeFormat>
</timeFormatLength>
</timeFormats></pre>
<p><span class="changedspan">The preference of 12 hour vs 24 hour for the
locale should be derived from the short timeFormat. If the hour symbol is
"h" or "K" (of various lengths) then the format is 12 hour; otherwise it is
24 hour.</span></p>
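<p>The rule above can be sketched as follows. The function name <code>uses_12_hour_clock</code> is hypothetical, and the handling of quoted literal text is a simplification; this is an illustration rather than a reference implementation.</p>

```python
def uses_12_hour_clock(short_time_pattern):
    """Return True if the pattern's hour symbol is 'h' or 'K'.

    Text inside single quotes is literal in date format patterns, so it
    is skipped. Simplification: a doubled quote ('') is not treated
    specially; it toggles the state twice, which leaves it unchanged.
    """
    in_quotes = False
    for ch in short_time_pattern:
        if ch == "'":
            in_quotes = not in_quotes
        elif not in_quotes and ch in "hK":
            return True
    # Any other hour symbol means the locale prefers the 24 hour clock.
    return False


print(uses_12_hour_clock("h:mm a"))
print(uses_12_hour_clock("HH:mm"))
```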
<p><span class="changedspan">Date/Time formats have the following form:</span></p>
<pre> <dateTimeFormats>
<default type="<span style="color: blue">medium</span>"/>
<dateTimeFormatLength type="<span style="color: blue">full</span>">
<dateTimeFormat>
<pattern><span style="color: blue">{0} {1}</span></pattern>
</dateTimeFormat>
</dateTimeFormatLength>
<span class="changedspan"> <availableFormats>
<dateFormatItem><span style="color: blue">d. MMM yy</span></dateFormatItem>
<dateFormatItem><span style="color: blue">hh:mm:ss a</span></dateFormatItem>
<dateFormatItem><span style="color: blue">MMMM yyyy</span></dateFormatItem>
<dateFormatItem><span style="color: blue">MMM yy</span></dateFormatItem>
. . .
</availableFormats>
<appendItems>
<appendItem request="<span style="color: blue">G</span>"><span style="color: blue">{0} {1}</span></appendItem>
<appendItem request="<span style="color: blue">w</span>"><span style="color: blue">{0} ({2}: {1})</span></appendItem>
. . .
</appendItems></span>
</dateTimeFormats></pre>
<pre><span> </calendar>
<calendar type="<span style="color: blue">buddhist</span>">
<eras>
<span class="changedspan"><eraAbbr></span>
<era type="<span style="color: blue">0</span>"><span style="color: blue">BE</span></era>
<span class="changedspan"></eraAbbr></span>
</eras>
</calendar></span></pre>
<p><span><span><a name="dateTimeFormats"><dateTimeFormats></a></span></span></p>
<p><span class="dtd"><span class="changedspan"><!ELEMENT dateTimeFormats (alias | (default?,
dateTimeFormatLength*, availableFormats*, appendItems*, special*)) ></span><br>
<span class="removedspan"><!ATTLIST dateTimeFormats draft ( true | false ) #IMPLIED ><br>
<!ATTLIST dateTimeFormats validSubLocales CDATA #IMPLIED ></span><br>
<span class="changedspan"><!ELEMENT dateTimeFormatLength (alias | (dateTimeFormat*, special*))><br>
<!ELEMENT dateTimeFormat (alias | (pattern*, special*))><br>
<!ELEMENT availableFormats (alias | (dateFormatItem*, special*))><br>
<!ELEMENT appendItems (alias | (appendItem*, special*))><br>
<!ATTLIST appendItem request CDATA ><br>
</span></span></p>
<p>These formats allow for date and time formats to be composed in various ways. The
<span class="changedspan">dateTimeFormat element</span> works like the dateFormats and timeFormats,
except that the pattern is of the form "{0} {1}", where {0} is replaced by the date format, and
{1} is replaced by the time format. </p>
<p><span class="changedspan">The availableFormats element and its subelements provide a more
flexible formatting mechanism than the predefined list of patterns represented by dateFormatLength,
timeFormatLength, and dateTimeFormatLength. Instead, there is an open-ended list of patterns
(represented by dateFormatItem elements as well as the predefined patterns mentioned above) that
can be matched against a requested set of calendar fields and field lengths. Software can look
through the list and find the pattern that best matches the original request, based on the desired
calendar fields and lengths. For example, the full month and year may be needed for a calendar
application; the request is MMMMyyyy, but the best match may be "yyyy MMMM" or even "G yy MMMM",
depending on the locale and calendar.</span></p>
<p><span class="changedspan">The id attribute is a so-called "skeleton",
containing only field information, and in a canonical order. Examples are "yyyyMMMM"
for year + full month, or "MMMd" for abbreviated month + day.</span></p>
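<p>One possible way to search the list of available patterns is sketched below. The scoring scheme (a cost of 10 for a missing or extra field, 1 per letter of length difference) and the names <code>field_counts</code> and <code>best_match</code> are assumptions made for illustration; this specification does not prescribe a particular distance measure.</p>

```python
import string
from collections import Counter


def field_counts(pattern):
    """Count pattern letters per field, ignoring quoted literals and punctuation."""
    counts = Counter()
    in_quotes = False
    for ch in pattern:
        if ch == "'":
            in_quotes = not in_quotes
        elif not in_quotes and ch in string.ascii_letters:
            counts[ch] += 1
    return counts


def best_match(request, available):
    """Pick the available pattern closest to the requested skeleton.

    Hypothetical scoring: a missing or extra field costs 10, and each
    letter of length difference in a shared field costs 1. Lower is
    better; ties go to the pattern listed first.
    """
    wanted = field_counts(request)

    def distance(pattern):
        have = field_counts(pattern)
        score = 0
        for field in set(wanted) | set(have):
            if field not in wanted or field not in have:
                score += 10
            else:
                score += abs(wanted[field] - have[field])
        return score

    return min(available, key=distance)


available = ["d. MMM yy", "hh:mm:ss a", "MMMM yyyy", "MMM yy"]
print(best_match("MMMMyyyy", available))
print(best_match("yyMMMd", available))
```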
<p><span class="changedspan">In case the best match does not include all the requested calendar
fields, the appendItems element describes how to append needed fields to one of the existing
formats. Each appendItem element covers a single calendar field. In the pattern, {0} represents the format string, {1} the data content of the field, and
{2} the display name of the field (see <a href="#Calendar_Fields">Calendar Fields</a>). </span></p>
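<p>The placeholder substitution for an appendItem pattern can be sketched as below. The function name <code>apply_append_item</code> and the sample arguments are hypothetical; whether the inserted display name needs quoting before the result is used as a date pattern is outside the scope of this sketch.</p>

```python
import re


def apply_append_item(append_pattern, base_format, field_data, field_name):
    """Substitute {0}, {1} and {2} as described in the text.

    {0} is the existing format string, {1} the data content of the
    field, and {2} the display name of the field. A single regex pass
    is used so that substituted text is never re-scanned.
    """
    values = (base_format, field_data, field_name)
    return re.sub(r"\{([012])\}", lambda m: values[int(m.group(1))], append_pattern)


# Patterns taken from the appendItems example; the arguments are made up.
print(apply_append_item("{0} {1}", "MMM d", "G", "Era"))
print(apply_append_item("{0} ({2}: {1})", "MMM d", "w", "Week"))
```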
<p><span><a name="week"><week></a></span></p>
<p><span class="dtd"><!ELEMENT week (alias | (minDays?, firstDay?, weekendStart?, weekendEnd?,
special*))></span></p>
<p class="note"><span class="changedspan">The week element is deprecated in the main LDML files,
because the data is more appropriately organized as connected to territories, not to linguistic
data. Instead, the similar element in the
supplemental data file should be used.</span></p>
<p class="note"><span class="removedspan">The weekendStart time defaults to "00:00:00" (midnight
at the start of the day). The weekendEnd time defaults to "24:00:00" (midnight at the end of the
day). <span>(That is, Friday at 24:00:00 is the same time as Saturday at 00:00:00.) Thus the
following are equivalent:</span></span></p>
<table>
<tr>
<td><span class="removedspan"><span><weekendStart day="<span style="color: blue">sat</span>"/><br>
<weekendEnd day="<span style="color: blue">sun</span>"/></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span><weekendStart day="<span style="color: blue">sat</span>"
time="<span style="color: blue">00:00</span>"/><br>
<weekendEnd day="<span style="color: blue">sun</span>" time="<span style="color: blue">24:00</span>"/></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span><weekendStart day="<span style="color: blue">fri</span>"
time="<span style="color: blue">24:00</span>"/><br>
<weekendEnd day="<span style="color: blue">mon</span>" time="<span style="color: blue">00:00</span>"/></span></span></td>
</tr>
</table>
<p><span class="removedspan"><span>What is meant by the weekend varies from country to country. It
is typically when most non-retail businesses are closed. The time should not be specified unless
it is a well-recognized part of the day.</span></span></p>
<p><span class="removedspan">For information on the other fields, see <span>
<a href="#Date_Format_Patterns">Appendix F: Date Format Patterns</a>.</span></span></p>
<p><br>
<a name="Calendar_Fields"><span>Calendar Fields</span></a></p>
<p><span class="dtd"><!ELEMENT fields ( alias | (field*, special*)) ><br>
<!ELEMENT field ( alias | (displayName?, relative*, special*)) ></span></p>
<p><span>Translations may be supplied for names of calendar fields (elements of a calendar, such
as Day, Month, Year, Hour, etc.), and for relative values for those fields (for example, the day
with relative value -1 is "Yesterday"). Where there is no convenient, customary word or phrase in a
particular language for a relative value, it should be omitted.</span></p>
<p><span>Here are examples for English and German. Notice that the German data has more relative
values than the English data does.</span></p>
<pre><span><calendar>
<fields>
...
<field type='day'>
<displayName>Day</displayName>
<relative type='-1'>Yesterday</relative>
<relative type='0'>Today</relative>
<relative type='1'>Tomorrow</relative>
</field>
...
</fields>
</calendar></span></pre>
<pre><span><calendar>
<fields>
...
<field type='day'>
<displayName>Tag</displayName>
<relative type='-2'>Vorgestern</relative>
<relative type='-1'>Gestern</relative>
<relative type='0'>Heute</relative>
<relative type='1'>Morgen</relative>
<relative type='2'>Übermorgen</relative>
</field>
...
</fields>
</calendar></span></pre>
<h4>5.9.2 <a name="<timeZoneNames>"><timeZoneNames></a></h4>
<p><span class="dtd"><!ELEMENT timeZoneNames (alias | (hourFormat*, hoursFormat*, gmtFormat*,
regionFormat*, fallbackFormat*, abbreviationFallback*, preferenceOrdering*, singleCountries*,
default*, zone*, special*)) ><br>
<!ELEMENT zone (alias | ( long*, short*, exemplarCity*, special*)) ></span></p>
<p>The timezone IDs <span>(tzid) </span>are language-independent, and follow the <i>
<span class="changedspan">TZ</span> timezone database</i> [<a href="#Olson">Olson</a>]. However,
the display names for those IDs can vary by locale. The generic time is so-called <i>wall-time</i>:
what clocks show when they are correctly switched from standard to daylight time at the mandated
time of the year.</p>
<p><span>Unfortunately, the canonical tzids (those in zone.tab) are not stable: they may change in
each release of the </span><i><span class="changedspan">TZ</span></i><span> Timezone database. In
CLDR, however, stability of identifiers is very important. So the canonical IDs in CLDR are kept
stable as described in Appendix L: <a href="#Canonical_Form">Canonical Form</a>.</span></p>
<p><span>The following is an example of timezone data. Although this is an example of possible
data, in most cases only the exemplarCity needs translation. And that does not even need to be
present, if a country only has a single timezone. As always, </span><span>the <i>type</i> field
for each zone is the identification of that zone. It is not to be translated.</span></p>
<pre><zone type="<span style="color: blue">America/Los_Angeles</span>" >
<long>
<generic><span style="color: blue">Pacific Time</span></generic>
<standard><span style="color: blue">Pacific Standard Time</span></standard>
<daylight><span style="color: blue">Pacific Daylight Time</span></daylight>
</long>
<short>
<generic><span style="color: blue">PT</span></generic>
<standard><span style="color: blue">PST</span></standard>
<daylight><span style="color: blue">PDT</span></daylight>
</short>
<exemplarCity><span style="color: blue">San Francisco</span></exemplarCity>
</zone>
<zone type="<span style="color: blue">Europe/London</span>">
<long>
<generic><span style="color: blue">British Time</span></generic>
<standard><span style="color: blue">British Standard Time</span></standard>
<daylight><span style="color: blue">British Daylight Time</span></daylight>
</long>
<exemplarCity><span style="color: blue">York</span></exemplarCity>
</zone></pre>
<p class="note"><b>Note: </b>Transmitting "14:30" with no other context is incomplete unless it
contains information about the time zone. Ideally one would transmit neutral-format date/time
information, commonly in UTC, and localize as close to the user as possible. (For more about UTC,
see [<a href="#UTCInfo">UTCInfo</a>].)</p>
<p class="note">The conversion from local time into UTC depends on the particular time zone rules,
which will vary by location. The standard data used for converting local time (sometimes called <i>
wall time</i>) to UTC and back is the <i><span class="changedspan">TZ</span> Data</i> [<a href="#Olson">Olson</a>],
used by Linux, UNIX, Java, ICU, and others. The data includes rules for matching the laws for time
changes in different countries. For example, for the US it is:</p>
<blockquote>
<p class="note">"During the period commencing at 2 o'clock antemeridian on the first Sunday of
April of each year and ending at 2 o'clock antemeridian on the last Sunday of October of each
year, the standard time of each zone established by sections 261 to 264 of this title, as
modified by section 265 of this title, shall be advanced one hour..." (United States Law - 15
U.S.C. §6(IX)(260-7)).</p>
</blockquote>
<p class="note">Each region that has a different timezone or different daylight savings time rules, either
now or at any time <span>back to 1970</span>, is given a unique internal ID, such as <code>
Europe/Paris</code>. <span>(Some IDs are also distinguished on the basis of differences before
1970.)</span> As with currency codes, these are internal codes. <span>A localized string
associated with these is provided for users</span> (such as in the Windows<i> Control
Panels>Date/Time>Time Zone</i>).</p>
<p class="note">Unfortunately, laws change over time, and will continue to change in the future,
both for the boundaries of timezone regions and the rules for daylight savings. Thus the <i>
<span class="changedspan">TZ</span></i> data is continually being augmented. Any two
implementations using the same version of the <i><span class="changedspan">TZ</span></i> data will
get the same results for the same IDs (assuming a correct implementation). However, if
implementations use different versions of the data they may get different results. So if precise
results are required then both the <i><span class="changedspan">TZ</span></i> ID and the <i>
<span class="changedspan">TZ</span></i> data version must be transmitted between the different
implementations.</p>
<p class="note"><span>For more information, see [<a href="#DataFormats">Data Formats</a>].</span></p>
<p>The following subelements of timeZoneNames are used to control the fallback process described
in <a href="#Time_Zone_Fallback"><span>Appendix J: Time Zone Display Names</span></a>.</p>
<table cellSpacing="0" cellPadding="4" border="1">
<tr>
<th>Element Name</th>
<th>Data Examples</th>
<th>Results/Comment</th>
</tr>
<tr>
<td rowSpan="2"><span>hourFormat</span></td>
<td rowSpan="2">"+HHmm;-HHmm"</td>
<td>"+1200"</td>
</tr>
<tr>
<td>"-1200"</td>
</tr>
<tr>
<td><span>hoursFormat</span></td>
<td>"{0}/{1}"</td>
<td>"-0800/-0700"</td>
</tr>
<tr>
<td rowSpan="2"><span>gmtFormat</span></td>
<td>"GMT{0}"</td>
<td>"GMT-0800"</td>
</tr>
<tr>
<td>"{0}ВпГ"</td>
<td>"-0800ВпГ"</td>
</tr>
<tr>
<td rowSpan="2"><span>regionFormat</span></td>
<td>"{0} Time"</td>
<td>"Japan Time"</td>
</tr>
<tr>
<td>"Tiempo de {0}"</td>
<td>"Tiempo de Japón"</td>
</tr>
<tr>
<td><span>fallbackFormat</span></td>
<td>"Tiempo de «{0}»"</td>
<td>"Tiempo de «Tokyo»"</td>
</tr>
<tr>
<td><span>abbreviationFallback</span></td>
<td><span>type="GMT"</span></td>
<td><span>causes any "long" match to be skipped in Timezone fallbacks</span></td>
</tr>
<tr>
<td><span>preferenceOrdering</span></td>
<td><span>type=</span><span>"America/Mexico_City America/Chihuahua America/New_York"</span></td>
<td><span>a preference ordering among modern zones</span></td>
</tr>
<tr>
<td><span>singleCountries</span></td>
<td><span>list="America/Godthab America/Santiago America/Guayaquil Europe/Madrid
Pacific/Auckland Pacific/Tahiti Europe/Lisbon..."</span></td>
<td><span>uses country name alone</span></td>
</tr>
</table>
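<p>The interaction of hourFormat and gmtFormat in the table above can be sketched as follows. The function name <code>format_gmt_offset</code> is hypothetical, and the sketch assumes that hourFormat holds a positive and a negative subpattern separated by a semicolon and that only the HH and mm fields need substituting; the full fallback process is the one described in Appendix J.</p>

```python
def format_gmt_offset(offset_minutes, hour_format="+HHmm;-HHmm", gmt_format="GMT{0}"):
    """Format a UTC offset using hourFormat and gmtFormat values.

    Assumptions: hourFormat is "<positive>;<negative>", and only the
    HH (hours) and mm (minutes) fields occur in it.
    """
    positive, negative = hour_format.split(";")
    pattern = positive if offset_minutes >= 0 else negative
    hours, minutes = divmod(abs(offset_minutes), 60)
    offset = pattern.replace("HH", f"{hours:02d}").replace("mm", f"{minutes:02d}")
    return gmt_format.replace("{0}", offset)


# Eight hours behind UTC, with the two gmtFormat values from the table.
print(format_gmt_offset(-480))
print(format_gmt_offset(-480, gmt_format="{0}\u0412\u043f\u0413"))
```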
<h3><br>
5.10 <a name="<numbers>"><numbers></a></h3>
<p><span class="dtd"><!ELEMENT numbers (alias | (symbols?, decimalFormats?, scientificFormats?,
percentFormats?, currencyFormats?, currencies?, special*)) ></span></p>
<p>The numbers element supplies information for formatting and parsing numbers and currencies. It
has the following sub-elements: <symbols>, <decimalFormats>, <scientificFormats>, <percentFormats>,
<currencyFormats>, and <currencies>. The currency IDs are from [<a href="#ISO4217">ISO4217</a>]<span>
(plus some additional common-use codes)</span>. For more information, including the pattern
structure<span>, see </span><a href="#Number_Format_Patterns"><span>Appendix G: Number Pattern
Format</span></a><span>.</span></p>
<h4><span class="changedspan">5.10.1 <a name="Number_Symbols">Number Symbols</a></span></h4>
<p><span class="dtd"><!ELEMENT symbols (alias | (decimal?, group?, list?, percentSign?,
nativeZeroDigit?, patternDigit?, plusSign?, minusSign?, exponential?, perMille?, infinity?, nan?,
special*)) ></span></p>
<pre><symbols>
<decimal><span style="color: blue">.</span></decimal>
<group><span style="color: blue">,</span></group>
<list><span style="color: blue">;</span></list>
<percentSign><span style="color: blue">%</span></percentSign>
<nativeZeroDigit><span style="color: blue">0</span></nativeZeroDigit>
<patternDigit><span style="color: blue">#</span></patternDigit>
<plusSign><span style="color: blue">+</span></plusSign>
<minusSign><span style="color: blue">-</span></minusSign>
<exponential><span style="color: blue">E</span></exponential>
<perMille><span style="color: blue">‰</span></perMille>
<infinity><span style="color: blue">∞</span></infinity>
<nan><span style="color: blue">☹</span></nan>
</symbols></pre>
<p><span class="dtd"><!ELEMENT decimalFormats (alias | (default?, decimalFormatLength*,
special*))><br>
<!ELEMENT decimalFormatLength (alias | (default?, decimalFormat*, special*))><br>
<!ELEMENT decimalFormat (alias | (pattern*, special*)) ><br>
</span><span>(scientificFormats, percentFormats, and currencyFormats have the same structure)</span></p>
<pre><decimalFormats>
<decimalFormatLength type="<span style="color: blue">long</span>">
<decimalFormat>
<pattern><span style="color: blue">#,##0.###</span></pattern>
</decimalFormat>
</decimalFormatLength>
</decimalFormats></pre>
<pre><scientificFormats>
<default type="<span style="color: blue">long</span>"/>
<scientificFormatLength type="<span style="color: blue">long</span>">
<scientificFormat>
<pattern><span style="color: blue">0.000###E+00</span></pattern>
</scientificFormat>
</scientificFormatLength>
<scientificFormatLength type="<span style="color: blue">medium</span>">
<scientificFormat>
<pattern><span style="color: blue">0.00##E+00</span></pattern>
</scientificFormat>
</scientificFormatLength>
</scientificFormats></pre>
<pre><percentFormats>
<percentFormatLength type="<span style="color: blue">long</span>">
<percentFormat>
<pattern><span style="color: blue">#,##0%</span></pattern>
</percentFormat>
</percentFormatLength>
</percentFormats></pre>
<pre><currencyFormats>
<currencyFormatLength type="<span style="color: blue">long</span>">
<currencyFormat>
<pattern><span style="color: blue">¤ #,##0.00;(¤ #,##0.00)</span></pattern>
</currencyFormat>
</currencyFormatLength>
</currencyFormats></pre>
<h4><span class="changedspan">5.10.2 <a name="Currencies">Currencies</a></span></h4>
<p><span class="dtd"><!ELEMENT currency (alias | (pattern*, displayName*, symbol*, pattern*,
decimal*, group*, special*)) ></span></p>
<p><span class="changedspan"><b>Note:</b> pattern appears twice in the above. The first is for
consistency with all other cases of pattern + displayName; the second is for backwards
compatibility.</span></p>
<pre><currencies>
<currency type="<span style="color: blue">USD</span>">
<displayName><span style="color: blue">Dollar</span></displayName>
<symbol><span style="color: blue">$</span></symbol>
</currency>
<currency type ="<span style="color: blue">JPY</span>">
<displayName><span style="color: blue">Yen</span></displayName>
<symbol><span style="color: blue">¥</span></symbol>
</currency>
<currency type ="<span style="color: blue">INR</span>">
<displayName><span style="color: blue">Rupee</span></displayName>
<symbol choice="<span style="color: blue">true</span>"><span style="color: blue">0≤Rf|1≤Ru|1&lt;Rf</span></symbol>
</currency>
<currency type="PTE">
<displayName><span style="color: blue">Escudo</span></displayName>
<symbol><span style="color: blue">$</span></symbol>
</currency>
</currencies></pre>
<p>In formatting currencies, the currency number format is used with the appropriate symbol from
<currencies>, according to the currency code. The <currencies> list can contain codes that are no
longer in current use, such as PTE. The choice attribute can be used to indicate that the <span>
value uses a pattern interpreted as in </span><a href="#Choice_Patterns"><span>Appendix H: Choice
Patterns</span></a><span>.</span></p>
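<p>As a rough illustration of how a symbol with choice="true" might be evaluated, the sketch below selects an alternative for a given amount. The semantics are an assumption modelled on Java's ChoiceFormat (ascending limits, last satisfied alternative wins); Appendix H is the authoritative definition, and the function name <code>choose</code> is hypothetical.</p>

```python
import re

# "\u2264" is the less-than-or-equal sign used in the INR example.
_ALTERNATIVE = re.compile(r"\s*([-+]?\d+(?:\.\d+)?)\s*([\u2264<])(.*)", re.DOTALL)


def choose(choice_pattern, number):
    """Select the alternative whose limit applies to `number`.

    Assumed semantics: alternatives are separated by '|' and listed with
    ascending limits; the less-than-or-equal sign means limit <= number,
    and '<' means limit < number. The last alternative whose condition
    holds wins; if none holds, the first alternative is used.
    """
    alternatives = []
    for part in choice_pattern.split("|"):
        match = _ALTERNATIVE.fullmatch(part)
        if match is None:
            raise ValueError(f"malformed alternative: {part!r}")
        alternatives.append((float(match.group(1)), match.group(2), match.group(3)))

    result = alternatives[0][2]
    for limit, op, text in alternatives:
        if (op == "\u2264" and limit <= number) or (op == "<" and limit < number):
            result = text
    return result


inr_symbol = "0\u2264Rf|1\u2264Ru|1<Rf"
print(choose(inr_symbol, 0.5), choose(inr_symbol, 1), choose(inr_symbol, 2))
```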
<p><span>When the currency symbol is substituted into a pattern, there may be some further
modifications, according to the following.</span></p>
<pre><span><currencySpacing>
<beforeCurrency>
<currencyMatch>[:letter:]</currencyMatch>
<surroundingMatch>[:digit:]</surroundingMatch>
<insertBetween><span class="changedspan">&#x00a0;</span></insertBetween>
</beforeCurrency>
<afterCurrency>
<currencyMatch>[:letter:]</currencyMatch>
<surroundingMatch>[:digit:]</surroundingMatch>
<insertBetween><span class="changedspan">&#x00a0;</span></insertBetween>
</afterCurrency>
</currencySpacing></span>
</pre>
<p><span>This element controls whether additional characters are inserted on the boundary between
the symbol and the pattern. For example, in the above, inserting the symbol "US$" into the pattern
"#,##0.00¤" would result in an extra <span class="changedspan">no-break</span> space inserted before the symbol, e.g. "#,##0.00 US$", while
inserting into the pattern "¤#,##0.00" would not, e.g. "US$#,##0.00". That is because the
afterCurrency condition matches and the beforeCurrency condition doesn't. For more information on
the matching used in the currencyMatch and surroundingMatch elements, see Appendix E:
<a href="#Unicode_Sets">Unicode Sets</a>.</span></p>
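<p>The boundary check can be sketched on an already formatted number. Several things here are assumptions: [:letter:] is approximated by Unicode general category L*, [:digit:] by category Nd, and because both rules in the example use the same sets, a single rule is applied at whichever boundary exists. The function name <code>join_with_spacing</code> is hypothetical.</p>

```python
import unicodedata

NBSP = "\u00a0"  # the insertBetween value from the example


def is_letter(ch):
    # Approximation of [:letter:]: any general category starting with "L".
    return unicodedata.category(ch).startswith("L")


def is_digit(ch):
    # Approximation of [:digit:]: general category Nd (decimal digit).
    return unicodedata.category(ch) == "Nd"


def join_with_spacing(number_text, symbol, symbol_first):
    """Join a formatted number and a currency symbol.

    NBSP is inserted when the symbol character at the boundary is a
    letter and the adjacent character of the number is a digit.
    Both arguments are assumed to be non-empty strings.
    """
    if symbol_first:
        gap = NBSP if is_letter(symbol[-1]) and is_digit(number_text[0]) else ""
        return symbol + gap + number_text
    gap = NBSP if is_letter(symbol[0]) and is_digit(number_text[-1]) else ""
    return number_text + gap + symbol


print(repr(join_with_spacing("1,234.57", "US$", symbol_first=False)))
print(repr(join_with_spacing("1,234.57", "US$", symbol_first=True)))
```

<p>With the symbol after the number, the boundary is between a digit and the letter "U", so the no-break space is inserted; with the symbol first, the boundary character "$" is not a letter, so nothing is inserted.</p>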
<p>Currencies can also contain <span>optional</span> grouping, decimal data<span>, and pattern
elements</span>. This data is inherited from the <symbols> in the same locale data<span> (if not
present in the chain up to root)</span>, so only the <i>differing</i> data will be present. <span>
See <a href="#Multiple_Inheritance">Multiple Inheritance</a>.</span></p>
<p class="note"><b>Note: </b><i>Currency values should <b>never</b> be interchanged without a
known currency code. You never want the number 3.5 interpreted as $3.5 by one user and ¥3.5 by
another. </i>Locale data contains localization information for currencies, not a currency value
for a country. A currency amount logically consists of a numeric value, plus an accompanying
currency code (or equivalent). The currency code may be implicit in a protocol, such as where USD
is implicit. But if the raw numeric value is transmitted without any context, then it has no
definitive interpretation.</p>
<p class="note">Notice that the currency code is completely independent of the end-user's language
or locale. For example, RUR is the code for Russian Rubles. A currency amount of <RUR,
1.23457×10³> would be localized for a Russian user into "1 234,57р." (using U+0440 (р)
<span style="FONT-VARIANT: small-caps">cyrillic small letter er</span>). For an English user it
would be localized into the string "Rub 1,234.57". The end-user's language is needed for doing this
last localization step; but that language is completely orthogonal to the currency code needed in
the data. After all, the same English user could be working with dozens of currencies. Notice also
that the currency code is independent of whether currency values are inter-converted, which
requires more interesting financial processing: the rate of conversion may depend on a variety of
factors.</p>
<p class="note">Thus, once a currency amount is entered into a system, it
should logically be accompanied by a currency code in all processing. This currency code is
independent of whatever the user's original locale was. Only in badly-designed software is the
currency code (or equivalent) not present, so that the software has to "guess" at the currency
code based on the user's locale.</p>
<p class="note"><b>Note: </b>The number of decimal places <b>and</b> the rounding for each
currency are not locale-specific data, and are not contained in the Locale Data Markup Language
format. Those values override whatever is given in the currency numberFormat. For more
information, see <a href="#Supplemental_Data">Supplemental Data</a>.</p>
<p>For background information on currency names, see [CurrencyInfo].</p>
<h3>5.11 <a name="<posix>"><posix></a></h3>
<p><span class="dtd"><!ELEMENT posix (alias | (messages*, special*)) ><br>
<!ELEMENT messages (alias | ( yesstr?, nostr?</span><span class="removedspan">, yesexpr?, noexpr?</span><span class="dtd">))
></span></p>
<p>The following are included for compatibility with POSIX.</p>
<p> <posix><br>
<posix:messages><br>
<posix:yesstr><span style="color: #0000FF">ja</span></posix:yesstr><br>
<posix:nostr><span style="color: #0000FF">nein</span></posix:nostr><br>
<span class="removedspan"> <posix:yesexpr><span style="color: blue">^[Yy].*</span></posix:yesexpr><br>
<posix:noexpr><span style="color: blue">^[Nn].*</span></posix:noexpr><br>
</span> </posix:messages><br>
</posix></p>
<ol>
<li><span>The values for yesstr and nostr contain a colon-separated list of strings that would
normally be recognized as "yes" and "no" responses. For cased languages, this shall include only
the lowercase version. POSIX locale generation tools must generate the uppercase equivalents<span class="changedspan">,
and the abbreviated versions, and add the English words wherever they do not conflict. Examples:</span></span><ul>
<li><span><span class="changedspan">ja <font face="Lucida Sans Unicode">→</font>
ja:Ja:j:J:yes:Yes:y:Y</span></span></li>
<li><span><span class="changedspan">ja <font face="Lucida Sans Unicode">→</font>
ja:Ja:j:J:yes:Yes</span></span><span class="changedspan"> // exclude y:Y if it conflicts with
the native "no".</span></li>
</ul>
</li>
<li><span class="removedspan"><span>Values for yesstr and nostr should include the complete word
for "yes" or "no", as well any commonly used abbreviations for same. </span></span></li>
<li><span><span class="changedspan">The older elements yesexpr and noexpr are deprecated.</span></span><span class="changedspan"> </span>
<span><span class="changedspan">They should instead be generated from yesstr and nostr so that
they match all the responses.</span><span class="removedspan"> The values for yesexpr and noexpr
contain a regular expression that matches only those strings that would be recognized as "yes"
and "no" responses, in any case variation. The value of yesexpr and noexpr should match all the
values in yesstr and nostr respectively.</span></span></li>
</ol>
<p><span>So for English, the appropriate strings and expressions would be as follows:</span></p>
<p><span>yesstr "yes:y"<br>
nostr "no:n"</span></p>
<p><span class="changedspan">The generated yesexpr and noexpr would be:</span></p>
<p><span><code>yesexpr "^([yY]([eE][sS])?)" <br>
</code>This would match y,Y,yes,yeS,yEs,yES,Yes,YeS,YEs,YES.<br>
<br>
<code>noexpr "^([nN][oO]?)"</code><br>
This would match n,N,no,nO,No,NO.</span></p>
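<p>A minimal Python sketch of how a generation tool might derive these values is shown here. The function names <code>expand_responses</code> and <code>to_expr</code> are hypothetical, and the generated expression is written as a list of alternatives rather than the nested optional group shown above; it accepts the same case variations but is not necessarily the form an actual tool emits.</p>

```python
import re


def expand_responses(native, english, conflicts=()):
    """Expand a lowercase native word into the colon-separated POSIX list.

    Adds the capitalized form, the one-letter abbreviations, and the English
    word, skipping any form whose lowercase version is listed in `conflicts`.
    """
    forms = []
    for word in (native, english):
        for form in (word, word.capitalize(), word[0], word[0].upper()):
            if form.lower() not in conflicts and form not in forms:
                forms.append(form)
    return ":".join(forms)


def to_expr(responses):
    """Build a case-insensitive expression from a list such as "yes:y"."""
    def classes(word):
        return "".join(
            "[%s%s]" % (c.lower(), c.upper()) if c.lower() != c.upper()
            else re.escape(c)
            for c in word
        )
    # Longest alternative first, so that "yes" is tried before "y".
    words = sorted(responses.split(":"), key=len, reverse=True)
    return "^(" + "|".join(classes(w) for w in words) + ")"


print(expand_responses("ja", "yes"))                    # ja:Ja:j:J:yes:Yes:y:Y
print(expand_responses("ja", "yes", conflicts=("y",)))  # ja:Ja:j:J:yes:Yes
print(to_expr("yes:y"))                                 # ^([yY][eE][sS]|[yY])
```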
<h3><span>5.12 <<a name="references_element">references</a>></span></h3>
<p><span><!ELEMENT references ( reference* ) ><br>
<!ELEMENT reference ( #PCDATA ) ><br>
<!ATTLIST reference type NMTOKEN #REQUIRED><br>
<!ATTLIST reference standard ( true | false ) #IMPLIED ><br>
<!ATTLIST reference uri CDATA #IMPLIED ></span></p>
<p><span>The references section supplies a central location for specifying references and
standards. The uri should be supplied if at all possible. If not online, then an ISBN number should
be supplied, such as in the following example:</span></p>
<p class="example"><span><font size="2"><reference type="R2" uri="http://www.ur.se/nyhetsjournalistik/3lan.html">Landskoder
på Internet</reference><br>
<reference type="R3" uri="URN:ISBN:<span class="removedspan">ISBN </span>
<span class="changedspan">91-</span>47-04974-X">Svenska skrivregler</reference></font></span></p>
<h3>5.1<span>3</span> <a name="<collations>"><collations></a></h3>
<p><span class="dtd"><!ELEMENT collations (alias | (default?, collation*, special*)) ></span></p>
<p>This section contains one or more collation elements, distinguished by type. Each collation
contains rules that specify a certain sort-order, as a tailoring of the UCA table defined in
<a href="http://unicode.org/reports/tr10/">UTS #10: Unicode Collation Algorithm</a> [<a href="#UCA">UCA</a>].
(For a chart view of the UCA, see <a href="http://unicode.org/charts/collation/">Collation Chart</a>
[<a href="#UCAChart">UCAChart</a>].) This syntax is an XMLized version of the Java/ICU syntax.
<span>For illustration, the rules are accompanied by the corresponding <i>basic</i> <i>ICU rule
syntax</i> [<a href="#ICUCollation">ICUCollation</a>] (used in ICU and Java) and/or the ICU
parameterizations, and the basic syntax may be used in examples.</span></p>
<p class="note"><b>Note: </b>ICU provides a concise format for specifying orderings, based on
tailorings to the UCA. For example, to specify that k and q follow 'c', one can use the rule: "& c
< k < q". The rules also allow people to set default general parameter values, such as whether
uppercase is before lowercase or not. (Java contains an earlier version of ICU, and has not been
updated recently. It does not support any of the basic syntax marked with [...], and its default
table is not the UCA.)</p>
<p class="note">However, it is <b>not</b> necessary for ICU to be used in the underlying
implementation. <span>The features are simply related to the ICU capabilities, since that supplies
more detailed examples.</span> <b>Note: </b>there is an on-line demonstration of collation at [<a href="#LocaleExplorer">LocaleExplorer</a>]
(pick the locale and scroll to "Collation Rules").</p>
<h3><a name="Collation_Version">Version</a></h3>
<p>The version attribute is used in case a specific version of the UCA is to be specified. It is
optional, and is specified if the results are to be identical on different systems. If it is not
supplied, then the version is assumed to be the same as the Unicode version for the system as a
whole. <span class="changedspan">In general, tailorings should be defined so as to minimize dependence on the underlying
UCA version, by explicitly specifying the behavior of all characters used to write the
language in question.</span></p>
<blockquote>
<p><i><b>Note: </b>For version 3.1.1 of the UCA, the version of Unicode must also be specified
with any versioning information; an example would be "3.1.1/<span class="changedspan">3.2</span>"
for version 3.1.1 of the UCA, for version 3.2 of Unicode. This has been changed by decision of
the UTC, so that it will no longer be necessary as of UCA 4.0. So for 4.0 and beyond, the
version just has a single number.</i></p>
</blockquote>
<h3>5.13.1 <a name="<collation>"><collation</a>></h3>
<p><span class="dtd"><!ELEMENT collation (alias | (base?, settings?, suppress_contractions?,
optimize?, rules?, special*)) ></span></p>
<p>Like the ICU rules, the tailoring syntax is designed to be independent of the actual weights
used in any particular UCA table. That way the same rules can be applied to UCA versions over
time, even if the underlying weights change. The following describes the overall document
structure of a collation:</p>
<p><code><collation><br>
<settings caseLevel="<span style="color: blue">on</span>"/><br>
<rules><br>
<font color="green"> <!-- rules go here --><br>
</font> </rules><br>
</collation></code></p>
<p><span>The optional base element <code><base><span style="color: blue">...</span></base></code>,
contains an alias element that points to another data source that defines a <i>base </i>collation.
If present, it indicates that the settings and rules in the collation are modifications applied on
<i>top of the</i> respective elements in the base collation. That is, any successive settings,
where present, override what is in the base as described in <a href="#Setting_Options">Setting
Options</a>. Any successive rules are concatenated to the end of the rules in the base. The
results of multiple rules applying to the same characters is covered in <a href="#Orderings">
Orderings</a>.</span></p>
<h3><a name="Setting_Options">Setting Options</a></h3>
<p>In XML, these are attributes of <settings>. For example, <settings strength="secondary"> will
only compare strings based on their primary and secondary weights.</p>
<p>If the attribute is not present, the default (or the value of that attribute in the base
collation, if there is one) is used. The default is listed in italics.</p>
<table>
<caption><a name="Collation_Settings">Collation Settings</a></caption>
<tr>
<th>Attribute</th>
<th>Options</th>
<th>Basic Example </th>
<th>XML Example</th>
<th>Description</th>
</tr>
<tr>
<td><font color="#000000">strength</font></td>
<td>primary (1)<br>
secondary (2)<br>
tertiary (3)<br>
<span class="changedspan">quaternary</span> (4)<br>
identical (5)</td>
<td><code>[strength 1]</code></td>
<td><code>strength = "<span style="color: blue">primary</span>"</code></td>
<td>Sets the default strength for comparison, as described in the UCA.</td>
</tr>
<tr>
<td>alternate</td>
<td><i>non-ignorable</i><br>
shifted</td>
<td><code>[alternate non-ignorable]</code></td>
<td><code>alternate = "<span style="color: blue">non-ignorable</span>"</code></td>
<td>Sets alternate handling for variable weights, as described in UCA</td>
</tr>
<tr>
<td>backwards</td>
<td>on<br>
<i>off</i></td>
<td><code>[backwards 2] </code></td>
<td><code>backwards = "<span style="color: blue">on</span>"</code></td>
<td>Sets the comparison for the second level to be backwards ("French"), as described in UCA</td>
</tr>
<tr>
<td>normalization</td>
<td>on<br>
off</td>
<td><code>[normalization on] </code></td>
<td><code>normalization = "<span style="color: blue">off</span>"</code></td>
<td>If <i>on</i>, then the normal UCA algorithm is used. If <i>off</i>, then all strings that
are in [<a href="#FCD">FCD</a>] will sort correctly, but others won't
<span class="changedspan">necessarily sort correctly</span>. So it should only be set
<i>off</i> if the strings to be compared are in FCD.</td>
</tr>
<tr>
<td>caseLevel</td>
<td>on<br>
off</td>
<td><code>[caseLevel on]</code></td>
<td><code>caseLevel = "<span style="color: blue">off</span>"</code></td>
<td>If set to <i>on,</i> a level consisting only of case characteristics will be inserted in
front of tertiary level. To ignore accents but take cases into account, set strength to
primary and case level to <i>on</i>. </td>
</tr>
<tr>
<td>caseFirst</td>
<td>upper<br>
lower<br>
off</td>
<td><code>[caseFirst off]</code></td>
<td><code>caseFirst = "<span style="color: blue">off</span>"</code></td>
<td>If set to <i>upper</i>, causes upper case to sort before lower case. If set to <i>lower</i>,
lower case will sort before upper case. Useful for locales that have already supported
ordering but require different order of cases. Affects case and tertiary levels.</td>
</tr>
<tr>
<td><span class="changedspan">hiraganaQuaternary</span></td>
<td>on<br>
off</td>
<td><code>[hiraganaQ on]</code></td>
<td><code><span class="changedspan">hiraganaQuaternary</span> = "<span style="color: blue">on</span>"</code></td>
<td>Controls special treatment of Hiragana code points on quaternary level. If turned <i>on</i>,
Hiragana codepoints will get lower values than all the other non-variable code points. The
strength must be quaternary or greater for this attribute to take effect.</td>
</tr>
<tr>
<td>numeric</td>
<td>on<br>
off</td>
<td><code>[numeric on]</code></td>
<td><code>numeric = "<span style="color: blue">on</span>"</code></td>
<td>If set to <i>on</i>, any sequence of Decimal Digits (General_Category = Nd in the [<a href="#UCD">UCD</a>])
is sorted at a primary level with its numeric value. For example, "A-21" < "A-123".</td>
</tr>
<tr>
<td><span class="changedspan">variableTop</span></td>
<td><span class="changedspan"><i>uXXuYYYY</i></span></td>
<td><span class="changedspan"><code>& \u00XX\uYYYY < [variable top]</code></span></td>
<td><span class="changedspan"><code>variableTop = "</code><code>uXXuYYYY</code><code>"</code></span></td>
<td><span class="changedspan">The parameter value is an encoded Unicode
string, with code points in hex, leading zeros removed, and 'u' inserted
between successive elements.</span><p><span class="changedspan">Sets the
default value for the variable top. All the code points with primary
strengths less than variable top will be considered variable, and thus
affected by the alternate handling.</span></td>
</tr>
</table>
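<p>As an illustration of the <i>numeric</i> setting, the following Python sketch builds a sort key in which each run of decimal digits is compared by numeric value. It is not a UCA implementation: non-digit text is compared by code point, digit runs are arbitrarily placed before other text, and the name <code>numeric_key</code> is hypothetical.</p>

```python
import re


def numeric_key(text):
    """Sort key that compares runs of decimal digits by their numeric value."""
    key = []
    # In a str pattern, \d matches the decimal digits (General_Category = Nd).
    for token in re.split(r"(\d+)", text):
        if not token:
            continue
        if token.isdecimal():
            key.append((0, int(token), ""))
        else:
            key.append((1, 0, token))
    return key


items = ["A-123", "A-21", "A-3"]
print(sorted(items))                   # ['A-123', 'A-21', 'A-3']
print(sorted(items, key=numeric_key))  # ['A-3', 'A-21', 'A-123']
```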
<p> </p>
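<p>The <i>variableTop</i> encoding in the table above can be sketched as follows. The helper names are hypothetical, and the sketch assumes that every code point is prefixed with 'u' (following the <code>uXXuYYYY</code> pattern in the table) and that hex digits are written in uppercase.</p>

```python
def encode_variable_top(text):
    """Encode a string as 'u' + hex code point (no leading zeros) per character."""
    return "".join("u%X" % ord(ch) for ch in text)


def decode_variable_top(value):
    """Reverse of encode_variable_top."""
    return "".join(chr(int(part, 16)) for part in value.split("u") if part)


print(encode_variable_top("\u00e6\u0301"))  # uE6u301
print(decode_variable_top("uE6u301") == "\u00e6\u0301")  # True
```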
<h2><a name="Rules">Collation Rule Syntax</a></h2>
<p><span class="dtd"><!ELEMENT rules (alias | ( reset, ( reset | p | pc | s | sc | t | tc | q | qc
| i | ic | x)* )) ></span></p>
<p>The goal for the collation rule syntax is to have clearly expressed rules with a concise
format, that parallels the Basic syntax as much as possible. The rule syntax uses
abbreviated element names for primary (level 1), secondary (level 2), tertiary (level 3),
quaternary (level 4), and identical, to be as short as possible. The reason is that the tailorings
for CJK characters are quite large (tens of thousands of elements), and the extra overhead would
have been considerable. Other elements and attributes do not occur as frequently, and have longer names.</p>
<blockquote>
<p><b><i>Note: </i></b>The rules are stated in terms of actions that cause characters to change
their ordering relative to other characters. This is for stability; assigning characters
specific weights would not work, since the exact weight assignment in UCA (or ISO 14651) is not
required for conformance — only the relative ordering of the weights. In addition, stating rules
in terms of relative order is much less sensitive to changes over time in the UCA itself.</p>
</blockquote>
<h3><a name="Orderings">Orderings</a></h3>
<p>The following are the normal ordering actions used for the bulk of characters. Each rule
contains a string of ordered characters that starts with an anchor point or a reset value. The
reset value is an absolute point in the UCA that determines the order of other characters. For
example, the rule & a < g, places "g" after "a" in a tailored UCA: the "a" does not change place.
Logically, each subsequent rule after a reset indicates a change to the ordering (and comparison
strength) of the characters in the UCA. For example, the UCA has the following sequence
(abbreviated for illustration):</p>
<p>... a <<sub>3</sub> a <<sub>3</sub> ⓐ <<sub>3</sub> A <<sub>3</sub> A <<sub>3</sub> Ⓐ <<sub>3</sub>
ª <<sub>2</sub> á <<sub>3</sub> Á <<sub>1</sub> æ <<sub>3</sub> Æ <<sub>1</sub> ɐ <<sub>1</sub> ɑ
<<sub>1</sub> ɒ <<sub>1</sub> b <<sub>3</sub> b <<sub>3</sub> ⓑ <<sub>3</sub> B <<sub>3</sub> B <<sub>3</sub>
ℬ ...</p>
<p>Whenever a character is inserted into the UCA sequence, it is inserted at the first point where
the strength difference will not disturb the other characters in the UCA. For example, & a < g
puts <i>g</i> in the above sequence with a strength of L1. Thus the <i>g</i> must go in after any
lower strengths, as follows:</p>
<p>... a <<sub>3</sub> a <<sub>3</sub> ⓐ <<sub>3</sub> A <<sub>3</sub> A <<sub>3</sub> Ⓐ <<sub>3</sub>
ª <<sub>2</sub> á <<sub>3</sub> Á <b><font color="red"><<sub>1</sub> g </font></b><<sub>1</sub> æ
<<sub>3</sub> Æ <<sub>1</sub> ɐ <<sub>1</sub> ɑ <<sub>1</sub> ɒ <<sub>1</sub> b <<sub>3</sub> b <<sub>3</sub>
ⓑ <<sub>3</sub> B <<sub>3</sub> B <<sub>3</sub> ℬ ...</p>
<p>The rule & a << g, which uses a level-2 strength, would produce the following sequence:</p>
<p>... a <<sub>3</sub> a <<sub>3</sub> ⓐ <<sub>3</sub> A <<sub>3</sub> A <<sub>3</sub> Ⓐ <<sub>3</sub>
ª <b><font color="red"><<sub>2</sub> g</font></b> <<sub>2</sub> á <<sub>3</sub> Á<b><font color="red">
</font></b><<sub>1</sub> æ <<sub>3</sub> Æ <<sub>1</sub> ɐ <<sub>1</sub> ɑ <<sub>1</sub> ɒ <<sub>1</sub>
b <<sub>3</sub> b <<sub>3</sub> ⓑ <<sub>3</sub> B <<sub>3</sub> B <<sub>3</sub> ℬ ...</p>
<p>And the rule & a <<< g, which uses a level-3 strength, would produce the following sequence:</p>
<p>... a <b><font color="red"><<sub>3</sub> g</font></b> <<sub>3</sub> a <<sub>3</sub> ⓐ <<sub>3</sub>
A <<sub>3</sub> A <<sub>3</sub> Ⓐ <<sub>3</sub> ª <<sub>2</sub> á <<sub>3</sub> Á<b><font color="red">
</font></b><<sub>1</sub> æ <<sub>3</sub> Æ <<sub>1</sub> ɐ <<sub>1</sub> ɑ <<sub>1</sub> ɒ <<sub>1</sub>
b <<sub>3</sub> b <<sub>3</sub> ⓑ <<sub>3</sub> B <<sub>3</sub> B <<sub>3</sub> ℬ ...</p>
<p>Since resets always work on the existing state, the rule entries must be in the proper order. A
character or sequence may occur multiple times; each subsequent occurrence causes a different
change. The following shows the result of serially applying three rules.</p>
<table>
<tr>
<th> </th>
<th>Rules </th>
<th>Result</th>
<th>Comment </th>
</tr>
<tr>
<td>1</td>
<td>& a < g</td>
<td>... a<font color="red"> <<sub>1</sub> g</font> ...</td>
<td>Put g after a.</td>
</tr>
<tr>
<td>2</td>
<td>& a < h < k</td>
<td>... a<font color="red"> <<sub>1</sub> h <<sub>1</sub> k</font> <<sub>1</sub> g ...</td>
<td>Now put h and k after a (inserting before the g).</td>
</tr>
<tr>
<td>3</td>
<td>& h << g</td>
<td>... a <<sub>1</sub> h<font color="red"> <<sub>2</sub> g</font> <<sub>1</sub> k ...</td>
<td>Now put g after h (inserting before k).</td>
</tr>
</table>
<p>Notice that characters can occur multiple times, and thus override previous rules.</p>
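<p>The insertion behavior described in this section can be modeled with a simplified Python sketch. It assumes the tailored order is a plain list of <code>[character, level]</code> pairs, where the level is the strength of the difference from the preceding entry; real implementations work on collation weights, and the name <code>apply_rule</code> is hypothetical.</p>

```python
def apply_rule(order, anchor, level, item):
    """Apply one atomic rule "& anchor <level> item" to the order list in place."""
    # Drop any earlier placement of item; the stronger difference survives
    # on the entry that followed it.
    for i, (char, old_level) in enumerate(order):
        if char == item:
            del order[i]
            if i < len(order):
                order[i][1] = min(order[i][1], old_level)
            break
    # Start right after the anchor and skip entries that differ from their
    # predecessor at a weaker strength (a larger level number).
    pos = [char for char, _ in order].index(anchor) + 1
    while pos < len(order) and order[pos][1] > level:
        pos += 1
    order.insert(pos, [item, level])


order = [["a", 1], ["b", 1]]
apply_rule(order, "a", 1, "g")  # & a < g
apply_rule(order, "a", 1, "h")  # & a < h < k, first atomic rule
apply_rule(order, "h", 1, "k")  # & a < h < k, second atomic rule
apply_rule(order, "h", 2, "g")  # & h << g
print([char for char, _ in order])  # ['a', 'h', 'g', 'k', 'b']
```

<p>Because each call works on the current state of the list, applying the same rules in a different sequence can give a different result.</p>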
<p><span>Except for the case of expansion sequence syntax, every sequence after a reset is
equivalent in action to breaking up the sequence into an <i>atomic</i> rule: a reset + relation
pair. The tailoring is then equivalent to applying each of the atomic rules to the UCA in order,
according to the above description.</span></p>
<p><span><i>Example:</i></span></p>
<table>
<tr>
<th><span>Rules</span></th>
<th><span>Equivalent Atomic Rules</span></th>
</tr>
<tr>
<td><span>& b < q <<< Q<br>
& a < x <<< X << q <<< Q < z</span></td>
<td><span>& b < q<br>
& q <<< Q<br>
& a < x<br>
& x <<< X<br>
& X << q<br>
& q <<< Q<br>
& Q < z</span></td>
</tr>
</table>
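<p>The decomposition into atomic rules can be sketched in Python as follows. The sketch handles only the basic syntax without expansions, quoting, or bracketed options, and the names <code>TOKEN</code> and <code>atomic_rules</code> are hypothetical.</p>

```python
import re

# Relations are matched longest first so that "<<<" is not read as "<<" + "<".
TOKEN = re.compile(r"<<<|<<|<|=|&|[^\s<=&]+")


def atomic_rules(rule):
    """Split "& a < x <<< X ..." into a list of reset + relation pairs."""
    tokens = TOKEN.findall(rule)
    if len(tokens) < 2 or tokens[0] != "&":
        raise ValueError("rule must start with a reset: %r" % rule)
    anchor = tokens[1]
    atoms = []
    for relation, item in zip(tokens[2::2], tokens[3::2]):
        atoms.append("& %s %s %s" % (anchor, relation, item))
        anchor = item  # the next relation is anchored on this item
    return atoms


for atom in atomic_rules("& a < x <<< X << q <<< Q < z"):
    print(atom)
```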
<p><span>In the case of expansion sequence syntax, the equivalent atomic sequence can be derived
by first transforming the expansion sequence syntax into normal expansion syntax. (See
<a href="#Expansions">Expansions</a>.)</span></p>
<p><span class="dtd"><!ELEMENT reset ( #PCDATA | cp | ... )* ><br>
<!ELEMENT p ( #PCDATA | cp | last_variable )* ><br>
</span><span>(Elements pc, s, sc, t, tc, q, qc, i, and ic have the same structure as p.)</span></p>
<table>
<caption>Specifying Collation Ordering</caption>
<tr>
<th>Basic Symbol</th>
<th>Basic Example</th>
<th>XML Symbol</th>
<th>XML Example</th>
<th>Description</th>
</tr>
<tr>
<td><code>& </code></td>
<td><code>& Z </code></td>
<td><code><reset></code></td>
<td><code><reset><span style="color: blue">Z</span></reset></code></td>
<td>Don't change the ordering of Z, but place subsequent characters relative to it.</td>
</tr>
<tr>
<td><code>< </code></td>
<td><code>& a<br>
< b </code></td>
<td><code><p></code></td>
<td><code><reset><span style="color: blue">a</span></reset><br>
<p><span style="color: blue">b</span></p></code></td>
<td>Make 'b' sort after 'a', as a <i>primary</i> (base-character) difference</td>
</tr>
<tr>
<td><code><< </code></td>
<td><code>& a<br>
<< ä </code></td>
<td><code><s></code></td>
<td><code><reset><span style="color: blue">a</span></reset><br>
<s><span style="color: blue">ä</span></s></code></td>
<td>Make 'ä' sort after 'a' as a <i>secondary</i> (accent) difference</td>
</tr>
<tr>
<td><code><<< </code></td>
<td><code>& a<br>
<<< A </code></td>
<td><code><t></code></td>
<td><code><reset><span style="color: blue">a</span></reset><br>
<t><span style="color: blue">A</span></t></code></td>
<td>Make 'A' sort after 'a' as a <i>tertiary</i> (case/variant) difference</td>
</tr>
<tr>
<td><code>= </code></td>
<td><code>& v<br>
= w </code></td>
<td><code><i></code></td>
<td><code><reset><span style="color: blue">v</span></reset><br>
<i><span style="color: blue">w</span></i></code></td>
<td>Make 'w' sort <i>identically</i> to 'v'</td>
</tr>
</table>
<p>Resets only need to be at the start of a sequence, to position the characters relative to a
character that is in the UCA (or has already occurred in the tailoring). For example:
<reset>z</reset><p>a</p><p>b</p><p>c</p><p>d</p>.</p>
<p>Some additional elements are provided to save space with large tailorings. The addition of a
'c' to the element name indicates that each of the characters in the contents of that element are
to be handled as if they were separate elements with the corresponding strength:</p>
<table>
<caption>Abbreviating Ordering Specifications</caption>
<tr>
<th>XML Symbol</th>
<th>XML Example</th>
<th>Equivalent</th>
</tr>
<tr>
<td><code><pc></code></td>
<td><code><pc><span style="color: blue">bcd</span></pc></code></td>
<td><code><p><span style="color: blue">b</span></p><p><span style="color: blue">c</span></p><p><span style="color: blue">d</span></p></code></td>
</tr>
<tr>
<td><code><sc></code></td>
<td><code><sc><span style="color: blue">àáâã</span></sc></code></td>
<td><code><s><span style="color: blue">à</span></s><s><span style="color: blue">á</span></s><s><span style="color: blue">â</span></s><s><span style="color: blue">ã</span></s></code></td>
</tr>
<tr>
<td><code><tc></code></td>
<td><code><tc><span style="color: blue">PpP</span></tc></code></td>
<td><code><t><span style="color: blue">P</span></t><t><span style="color: blue">p</span></t><t><span style="color: blue">P</span></t></code></td>
</tr>
<tr>
<td><code><ic></code></td>
<td><code><ic><span style="color: blue">VwW</span></ic></code></td>
<td><code><i><span style="color: blue">V</span></i><i><span style="color: blue">w</span></i><i><span style="color: blue">W</span></i></code></td>
</tr>
</table>
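<p>The equivalence in the table above can be sketched in Python. This sketch assumes the abbreviated element contains only character data (no <code><cp></code> children) and treats each code point as one character; the names <code>SINGLE</code> and <code>expand_abbreviated</code> are hypothetical.</p>

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Abbreviated element name -> corresponding single-item element name.
SINGLE = {"pc": "p", "sc": "s", "tc": "t", "qc": "q", "ic": "i"}


def expand_abbreviated(xml_text):
    """Rewrite e.g. <pc>bcd</pc> as one element per character."""
    element = ET.fromstring(xml_text)
    tag = SINGLE[element.tag]
    return "".join(
        "<%s>%s</%s>" % (tag, escape(ch), tag) for ch in element.text or ""
    )


print(expand_abbreviated("<pc>bcd</pc>"))  # <p>b</p><p>c</p><p>d</p>
```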
<h3><a name="Contractions">Contractions</a></h3>
<p>To sort a sequence as a single item (contraction), just use the sequence, e.g.</p>
<table>
<caption>Specifying Contractions</caption>
<tr>
<th>BASIC Example</th>
<th>XML Example</th>
<th>Description</th>
</tr>
<tr>
<td><code>& k<br>
< ch</code></td>
<td><code><reset><span style="color: blue">k</span></reset><br>
<p><span style="color: blue">ch</span></p></code></td>
<td>Make the sequence 'ch' sort after 'k', as a primary (base-character) difference</td>
</tr>
</table>
<h3><a name="Expansions">Expansions</a></h3>
<p><span class="dtd"><!ELEMENT x (context?, ( p | pc | s | sc | t | tc | q | qc | i | ic )*,
extend? ) ></span></p>
<p>There are two ways to handle expansions (where a character sorts as a sequence) with both the
basic syntax and the XML syntax. The first method is to reset to the sequence of characters. <span>
This is called <i>sequence expansion syntax</i>. </span>The second is to use the extension
sequence. <span>This is called <i>normal expansion syntax</i>.</span> Both are equivalent in
practice (unless the reset sequence happens to be a contraction).</p>
<table>
<caption>Specifying Expansions</caption>
<tr>
<th>Basic</th>
<th>XML</th>
<th>Description</th>
</tr>
<tr>
<td><code>& c <br>
<<span><</span> k / h</code></td>
<td><code><reset><span style="color: blue">c</span></reset><br>
<x><<span>s</span>><span style="color: blue">k</span></<span>s</span>> <extend><span style="color: blue">h</span></extend></x></code></td>
<td><span><i>normal expansion syntax:<br>
</i></span>Make 'k' sort after the sequence 'ch'; thus 'k' will behave as if it expands to a
character after 'c' followed by an 'h'.</td>
</tr>
<tr>
<td><code>& ch<br>
<<span><</span> k</code></td>
<td><code><reset><span style="color: blue">ch</span></reset><br>
<<span>s</span>><span style="color: blue">k</span></<span>s</span>></code></td>
<td><span><i>sequence expansion syntax:<br>
</i></span>Make 'k' sort after the sequence 'ch'; thus 'k' will behave as if it expands to a
character after 'c' followed by an 'h'.
<p><i>(unless 'ch' is defined beforehand as a contraction).</i></td>
</tr>
</table>
<p>If an <code><extend></code> element is necessary, it requires the rule to be embedded in an <x>
element.</p>
<p><span>The sequence expansion syntax can be quite tricky, so it should be avoided where
possible. In particular:</span></p>
<ul>
<li><span>The expansion is <i>only</i> in effect up to — but not including — the first primary
rule. Thus<br>
<code> <reset><span style="COLOR: blue">ch</span></reset><br>
<s><span style="color: blue">x</span></s><br>
<t><span style="color: blue">y</span></t><br>
<p><span style="color: blue">z</span></p><br>
</code>is the same as<br>
<code> <reset><span style="COLOR: blue">c</span></reset><br>
<x><s><span style="color: blue">x</span></s><extend><span style="COLOR: blue">h</span></extend></x><br>
<x><t><span style="color: blue">y</span></t><extend><span style="COLOR: blue">h</span></extend></x><br>
<p><span style="color: blue">z</span></p></code></span></li>
<li><span>In accordance with the UCA, all strings are interpreted as being in NFD form. In most
rules, this has no effect, but with syntax such as <code><reset></code><b>ä</b><code></reset></code>,
the <b>ä</b> will be treated as two characters <b>a + ¨</b>, <i>unless</i> the <b>ä</b>
has previously been used as a contraction. Thus the <b>¨</b> will be used as an expansion for
following characters (up to the next primary).</span></li>
</ul>
<p>Each extension replaces the one before it; it does not append to it. So </p>
<p>& ab << c<br>
& cd << e</p>
<p>is equivalent to:</p>
<p>& a << c / b << e / d</p>
<p>and produces the following weights (where <i>p(x)</i> is the primary weight of x and <i>s(x)</i> is
the secondary weight of x):</p>
<table border="1" cellpadding="0" cellspacing="0" style="border-collapse: collapse">
<tr>
<td>Character</td>
<td>Weights</td>
</tr>
<tr>
<td>c</td>
<td>p(a), p(b); s(a)+1, s(b); ...</td>
</tr>
<tr>
<td>e</td>
<td>p(a), p(d); s(a)+2, s(d); ...</td>
</tr>
</table>
<p><span>When expressing rules as atomic rules, the sequences must first be transformed into
normal expansion syntax:</span></p>
<table>
<tr>
<th><span>Expansion Sequence</span></th>
<th><span>Normal Expansion</span></th>
<th><span>Equivalent Atomic Rules</span></th>
</tr>
<tr>
<td><span>& a<u>b</u> << q <<< Q<br>
& a<u>d</u> <<< AD < x <<< X</span></td>
<td><span>& a << q <u>/ b</u> <<< Q <u>/ b</u><br>
& a <<< AD <u>/ d</u> < x <<< X</span></td>
<td><span>& a << q <u>/ b</u><br>
& q <<< Q <u>/ b</u><br>
& a <<< AD <u>/ d</u><br>
& AD < x<br>
& x<<< X</span></td>
</tr>
</table>
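<p>The first step of that transformation, from expansion sequence syntax to normal expansion syntax, can be sketched in Python. The sketch assumes that the reset sequence is not a contraction, that its first code point is the anchor and the rest is the extension, and that the extension stays in effect only until the first primary relation. The name <code>to_normal_expansion</code> is hypothetical.</p>

```python
def to_normal_expansion(reset, relations):
    """Rewrite "& ab << q ..." with an explicit "/ b" extension on each item.

    `relations` is a list of (relation, item) pairs in basic syntax.
    """
    anchor, extension = reset[0], reset[1:]
    parts = ["& " + anchor]
    active = bool(extension)
    for relation, item in relations:
        if relation == "<":
            active = False  # a primary relation ends the expansion
        part = "%s %s" % (relation, item)
        if active:
            part += " / " + extension
        parts.append(part)
    return " ".join(parts)


print(to_normal_expansion("ab", [("<<", "q"), ("<<<", "Q")]))
# & a << q / b <<< Q / b
print(to_normal_expansion("ad", [("<<<", "AD"), ("<", "x"), ("<<<", "X")]))
# & a <<< AD / d < x <<< X
```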
<h3><a name="Context_Before">Context Before</a></h3>
<p>The context before a character can affect how it is ordered, such as in
Japanese. This could be expressed with a combination of contractions and
expansions, but is faster using a context. (The actual weights produced are
different, but the resulting string comparisons are the same.) If a context
element occurs, it must be the first item in the rule<span class="changed">,
and requires an <x> element.</span></p>
<p><span class="changed">For example, suppose that "-" is sorted like the
previous vowel. Then one could have rules that take "a-", "e-", and so on.
However, that means that every time a very common character (a, e, ...) is
encountered, a system will slow down as it looks for possible contractions.
An alternative is to indicate that when "-" is encountered,<i> </i>and it
comes after an 'a', it sorts like an 'a', etc. </span></p>
<table>
<caption>Specifying Previous Context</caption>
<tr>
<th>Basic</th>
<th>XML</th>
</tr>
<tr>
<td><span class="changed"><code>& a <<< a | - <br>
& e <<< e | - <br>
...</code></span></td>
<td><span class="changed"><code><reset><span style="color: #0000FF">a</span></reset><x><context><span style="color: #0000FF">a</span></context><t><span style="color: #0000FF">-</span></t></x><br>
<reset><span style="color: #0000FF">e</span></reset><x><context><span style="color: #0000FF">e</span></context><t><span style="color: #0000FF">-</span></t></x><br>
...</code></span></td>
</tr>
</table>
<p><span class="changed">Both the context and extend elements can occur in an
<x> element.</span> <span class="removedspan">If an <code><extend></code> element is necessary, it requires the rule to be embedded in an <x>
element. There can also be a <code><context></code> at the same time.
</span>For example, the following
are allowed:</p>
<ul>
<li><code><x><context><span style="color: blue">abc</span></context><p><span style="color: blue">def</span></p><extend><span style="color: blue">ghi</span></extend></x></code></li>
<li><code><x><p><span style="color: blue">def</span></p><extend><span style="color: blue">ghi</span></extend></x></code></li>
<li><code><x><context><span style="color: blue">abc</span></context><p><span style="color: blue">def</span></p></x></code></li>
</ul>
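<p>A minimal Python sketch of a check for this child order, following the content model given for <x> under Expansions, is shown here. It inspects only element names, not their contents, and the function name <code>is_valid_x</code> is hypothetical.</p>

```python
import xml.etree.ElementTree as ET

RELATIONS = {"p", "pc", "s", "sc", "t", "tc", "q", "qc", "i", "ic"}


def is_valid_x(xml_text):
    """Check the order: optional context, relation elements, optional extend."""
    element = ET.fromstring(xml_text)
    if element.tag != "x":
        return False
    tags = [child.tag for child in element]
    if tags and tags[0] == "context":
        tags = tags[1:]
    if tags and tags[-1] == "extend":
        tags = tags[:-1]
    return all(tag in RELATIONS for tag in tags)


print(is_valid_x("<x><context>abc</context><p>def</p><extend>ghi</extend></x>"))  # True
print(is_valid_x("<x><p>def</p><context>abc</context></x>"))  # False
```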
<h3><a name="Placing_Characters_Before_Others">Placing Characters Before Others</a></h3>
<p>There are certain circumstances where characters need to be placed before a given character,
rather than after. This is the case with Pinyin, for example, where certain accented letters are
positioned before the base letter. That is accomplished with the following syntax.</p>
<table>
<caption>Placing Characters <i>Before</i> Others</caption>
<tr>
<th>Item</th>
<th>Options</th>
<th>Basic Example </th>
<th>XML Example</th>
</tr>
<tr>
<td>before </td>
<td>primary<br>
secondary<br>
tertiary</td>
<td><code>& [before 2] a<br>
<< à</code></td>
<td><code><reset before="<span style="color: blue; background-color: #00FF00">secondary</span>"><span style="color: blue">a</span></reset><br>
<s><span style="color: blue">à</span></s></code></td>
</tr>
</table>
<p><span>It is an error if the strength of the before relation is not identical to the relation
after the reset. Thus the following are errors, since the value of the <i>before</i> attribute
does not agree with the relation <s>.</span></p>
<table>
<tr>
<th><span>Basic Example </span></th>
<th><span>XML Example</span></th>
<th></th>
</tr>
<tr>
<td><code><span>& [before 1] a<br>
<< à</span></code></td>
<td><code><span><reset before="<span style="color: blue; background-color: #00FF00">primary</span>"><span style="color: blue">a</span></reset><br>
<s><span style="color: blue">à</span></s></span></code></td>
<td><code><span>Error</span></code></td>
</tr>
<tr>
<td><code><span>& [before 3] a<br>
<< à</span></code></td>
<td><code><span><reset before="<span style="color: blue; background-color: #00FF00">tertiary</span>"><span style="color: blue">a</span></reset><br>
<s><span style="color: blue">à</span></s></span></code></td>
<td><code><span>Error</span></code></td>
</tr>
</table>
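<p>The agreement check can be sketched in Python as follows. The sketch compares a <reset> element only with the single-item relation elements <p>, <s> and <t>; the names <code>BEFORE_TO_RELATION</code> and <code>before_matches</code> are hypothetical.</p>

```python
import xml.etree.ElementTree as ET

# Value of the before attribute -> relation element of the same strength.
BEFORE_TO_RELATION = {"primary": "p", "secondary": "s", "tertiary": "t"}


def before_matches(reset_xml, relation_xml):
    """Return True if the reset has no before attribute or its strength agrees."""
    reset = ET.fromstring(reset_xml)
    relation = ET.fromstring(relation_xml)
    before = reset.get("before")
    return before is None or BEFORE_TO_RELATION.get(before) == relation.tag


print(before_matches('<reset before="secondary">a</reset>', "<s>\u00e0</s>"))  # True
print(before_matches('<reset before="primary">a</reset>', "<s>\u00e0</s>"))    # False
```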
<h3><a name="Logical_Reset_Positions">Logical Reset Positions</a></h3>
<p><span class="dtd"><!ELEMENT reset ( ... | first_variable| last_variable |
first_tertiary_ignorable | last_tertiary_ignorable | first_secondary_ignorable |
last_secondary_ignorable | first_primary_ignorable | last_primary_ignorable | first_non_ignorable
| last_non_ignorable | first_trailing | last_trailing )* ></span></p>
<p>The UCA has the following overall structure for weights, going from low to high.</p>
<table>
<caption>Specifying Logical Positions</caption>
<tr>
<th>Name</th>
<th>Description</th>
<th>UCA Examples</th>
</tr>
<tr>
<td>first tertiary ignorable<br>
...<br>
last tertiary ignorable</td>
<td>p, s, t = ignore</td>
<td>Control Codes<br>
Format Characters<br>
Hebrew Points<br>
Tibetan Signs<br>
...</td>
</tr>
<tr>
<td>first secondary ignorable<br>
...<br>
last secondary ignorable</td>
<td>p, s = ignore</td>
<td>None in UCA</td>
</tr>
<tr>
<td>first primary ignorable<br>
...<br>
last primary ignorable</td>
<td>p = ignore</td>
<td>Most combining marks</td>
</tr>
<tr>
<td>first variable<br>
...<br>
last variable</td>
<td><i><b>if</b> alternate = non-ignorable<br>
</i>p != ignore,<br>
<i><b>if</b> alternate = shifted</i><br>
p, s, t = ignore</td>
<td>Whitespace,<br>
Punctuation,<br>
Symbols</td>
</tr>
<tr>
<td>first non-ignorable<br>
...<br>
last non-ignorable</td>
<td>p != ignore</td>
<td>Small number of exceptional symbols<br>
[e.g. U+02D0 MODIFIER LETTER TRIANGULAR COLON]<br>
Numbers<br>
Latin<br>
Greek<br>
...</td>
</tr>
<tr>
<td><i>implicits</i></td>
<td>p != ignore, assigned automatically</td>
<td>CJK, CJK compatibility (those that are not decomposed)<br>
CJK Extension A, B<br>
Unassigned</td>
</tr>
<tr>
<td>first trailing<br>
...<br>
last trailing</td>
<td>p != ignore,<br>
used for trailing syllable components</td>
<td>Jamo Trailing<br>
Jamo Leading</td>
</tr>
</table>
<p>Each of the above Names (except <i>implicits</i>) can be used with a reset to position
characters relative to that logical position. That allows characters to be ordered before or after
a <i>logical</i> position rather than a specific character.</p>
<p class="note"><b><i>Note: </i></b>The reason for this is so that tailorings can be more stable.
A future version of the UCA might add characters at any point in the above list. Suppose that you
set character X to be after Y. It could be that you want X to come after Y, no matter what future
characters are added; or it could be that you just want X to come after a given logical position,
e.g. after the last primary ignorable.</p>
<p>Here is an example of the syntax:</p>
<table>
<caption>Sample Logical Position</caption>
<tr>
<th>Basic</th>
<th>XML</th>
</tr>
<tr>
<td><code>& [first tertiary ignorable]<br>
<< à </code></td>
<td><code><reset><first_tertiary_ignorable/></reset><br>
<s><span style="color: blue">à</span></s></code></td>
</tr>
</table>
<p>For example, to make a character be a secondary ignorable, one can make it be immediately after
(at a secondary level) a specific character (like a combining dieresis), or one can make it be
immediately after the last secondary ignorable.</p>
<p>The <i>last-variable</i> element indicates the "highest" character that is treated as
punctuation with alternate handling. Unlike the other logical positions, it can be reset as well
as referenced. For example, it can be reset to be just above spaces if all visible punctuation is
to be treated as having distinct primary values.</p>
<table>
<caption>Specifying Last-Variable</caption>
<tr>
<th>Attribute</th>
<th>Options</th>
<th>Basic Example </th>
<th>XML Example</th>
</tr>
<tr>
<td rowspan="3">variableTop</td>
<td><font color="#000000">at</font></td>
<td><code>& x<br>
= [last variable]</code></td>
<td><code><reset><span style="color: blue">x</span></reset><br>
<i><last_variable/></i></code></td>
</tr>
<tr>
<td><font color="#000000">after</font></td>
<td><code>& x<br>
< [last variable]</code></td>
<td><code><reset><span style="color: blue">x</span></reset><br>
<p><last_variable/></p></code></td>
</tr>
<tr>
<td><font color="#000000">before</font></td>
<td><code>& [before 1] x<br>
< [last variable]</code></td>
<td><code><reset before="<span style="color: blue">primary</span>"><span style="color: blue">x</span></reset><br>
<p><last_variable/></p></code></td>
</tr>
</table>
<p>The default value for <i>variable-top</i> depends on the UCA setting. For example, in 3.1.1,
the value is at:</p>
<blockquote>
<p>U+1D7C3 MATHEMATICAL SANS-SERIF BOLD ITALIC PARTIAL DIFFERENTIAL</p>
</blockquote>
<p>The <code><last_variable/></code> cannot occur inside an <x> element, nor can there be any
element content. Thus there can be no <context> or <extend> or text data in the rule. For example,
the following are all disallowed:</p>
<ul>
<li><code><x><context><span style="color: blue">a</span></context><p><last_variable/></p></x></code></li>
<li><code><x><p><last_variable/></p><extend><span style="color: blue">a</span></extend></x></code></li>
<li><code><p><last_variable/><span style="color: blue">a</span></p></code></li>
<li><code><p><span style="color: blue">a</span><last_variable/></p></code></li>
</ul>
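<p>The restrictions above can be checked mechanically. The following is a minimal sketch in Python, not
part of this specification; the function name <code>last_variable_errors</code> and the wording of the
messages are hypothetical, and the input is assumed to be a well-formed <rules> fragment.</p>

```python
import xml.etree.ElementTree as ET


def last_variable_errors(rules_xml):
    """Return a list of problems with <last_variable/> usage in a <rules> fragment."""
    root = ET.fromstring(rules_xml)
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    errors = []
    for node in root.iter("last_variable"):
        parent = parent_of.get(node)
        if parent is None:
            continue  # a bare <last_variable/> root has nothing to check
        # Walk up the tree looking for an enclosing <x> element.
        ancestor = parent
        while ancestor is not None:
            if ancestor.tag == "x":
                errors.append("last_variable inside <x>")
                break
            ancestor = parent_of.get(ancestor)
        # The enclosing element may hold nothing but the <last_variable/> itself.
        has_text = bool((parent.text or "").strip() or (node.tail or "").strip())
        if has_text or len(parent) != 1:
            errors.append("extra content next to last_variable")
    return errors
```

<p>Applied to the four disallowed forms listed above, the sketch reports one problem each; applied to
<code><reset>x</reset><p><last_variable/></p></code> it reports none.</p>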
<h3><a name="Special_Purpose_Commands">Special-Purpose Commands</a></h3>
<p>The <i>suppress contractions</i> tailoring command turns off any existing contractions that
begin with the characters in its argument. It is typically used to turn off the Cyrillic contractions
in the UCA, since they are not used in many languages and have a considerable performance penalty.
The argument is a <a href="#Unicode_Sets">Unicode Set</a>.</p>
<p>The <i>optimize</i> tailoring command is purely for performance. It indicates that the
characters in its argument are sufficiently common in the target language for the tailoring that
their performance should be enhanced.</p>
<table>
<caption>Special-Purpose Commands</caption>
<tr>
<th>Basic</th>
<th>XML</th>
</tr>
<tr>
<td>[suppress contractions [Љ-ґ]]</td>
<td><code><suppress_contractions></code><span style="color: blue">[Љ-ґ]</span><code></suppress_contractions></code></td>
</tr>
<tr>
<td>[optimize [Ά-ώ]]</td>
<td><code><optimize></code><span style="color: blue">[Ά-ώ]</span><code></optimize></code></td>
</tr>
</table>
<p>The reason that these are not settings is so that their contents can be arbitrary characters.</p>
<hr width="50%">
<p class="example">Example Collation</p>
<p class="example">The following is a simple example that takes portions of the Swedish tailoring
plus part of a Japanese tailoring, for illustration. For more complete examples, see the actual
locale data; the Japanese, Chinese, Swedish, and Traditional German data are particularly illustrative.</p>
<pre><collation version="<span style="color: blue">3.1.1</span>">
<settings caseLevel="<span style="color: blue">on</span>"/>
<rules>
<reset><span style="color: blue">Z</span></reset>
<p><span style="color: blue">æ</span></p>
<t><span style="color: blue">Æ</span></t>
<p><span style="color: blue">å</span></p>
<t><span style="color: blue">Å</span></t>
<t><span style="color: blue">aa</span></t>
<t><span style="color: blue">aA</span></t>
<t><span style="color: blue">Aa</span></t>
<t><span style="color: blue">AA</span></t>
<p><span style="color: blue">ä</span></p>
<t><span style="color: blue">Ä</span></t>
<p><span style="color: blue">ö</span></p>
<t><span style="color: blue">Ö</span></t>
<s><span style="color: blue">ű</span></s>
<t><span style="color: blue">Ű</span></t>
<p><span style="color: blue">ő</span></p>
<t><span style="color: blue">Ő</span></t>
<s><span style="color: blue">ø</span></s>
<t><span style="color: blue">Ø</span></t>
<reset><span style="color: blue">V</span></reset>
<tc><span style="color: blue">wW</span></tc>
<reset><span style="color: blue">Y</span></reset>
<tc><span style="color: blue">üÜ</span></tc>
<reset><last_non_ignorable/></reset>
<span style="color:green"> <!-- following is equivalent to <p>亜</p><p>唖</p><p>娃</p>... -->
</span> <pc><span style="color: blue">亜唖娃阿哀愛挨姶逢葵茜穐悪握渥旭葦芦</span></pc>
<pc><span style="color: blue">鯵梓圧斡扱</span></pc>
</rules>
</collation></pre>
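<p>The comment in the example notes that a <pc> element is equivalent to a sequence of <p> elements.
A minimal Python sketch of that expansion follows. The function name <code>expand_compact</code> is
hypothetical, only the <pc>, <sc> and <tc> forms are handled, and the sketch assumes that each code
point in the content becomes one rule.</p>

```python
def expand_compact(tag, text):
    """Expand the content of <pc>, <sc> or <tc> into one (tag, character) pair per code point."""
    if tag not in ("pc", "sc", "tc"):
        raise ValueError("not a compact relation: " + tag)
    # Dropping the trailing 'c' gives the tag of the equivalent single-character rule.
    return [(tag[0], ch) for ch in text]
```

<p>For instance, <code>expand_compact("tc", "wW")</code> yields the pairs for
<code><t>w</t><t>W</t></code>.</p>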
<h3><span class="changedspan">5.14 <span style="background-color: #FFFF00">
<a name="Segmentations">Segmentations</a></span></span></h3>
<p><span class="changedspan">The segmentations element provides for segmentation of text into
words, lines, or other segments. The structure is based on [<a href="#UAX29">UAX29</a>] notation,
but adapted to be machine-readable. It uses a list of variables (representing character classes)
and a list of rules. Each must have an id attribute.</span></p>
<p><span class="changedspan">The rules in <i>root</i> implement the segmentations found in [<a href="#UAX29">UAX29</a>]
and [<a href="#UAX14">UAX14</a>], for grapheme clusters, words, sentences, and lines. They can be
overridden by rules in child locales. </span></p>
<p><span class="changedspan">Here is an example:</span></p>
<pre><span class="changedspan"><segmentations>
<segmentation type="GraphemeClusterBreak">
<variables>
<variable id="$CR">\p{Grapheme_Cluster_Break=CR}</variable>
<variable id="$LF">\p{Grapheme_Cluster_Break=LF}</variable>
<variable id="$Control">\p{Grapheme_Cluster_Break=Control}</variable>
<variable id="$Extend">\p{Grapheme_Cluster_Break=Extend}</variable>
<variable id="$L">\p{Grapheme_Cluster_Break=L}</variable>
<variable id="$V">\p{Grapheme_Cluster_Break=V}</variable>
<variable id="$T">\p{Grapheme_Cluster_Break=T}</variable>
<variable id="$LV">\p{Grapheme_Cluster_Break=LV}</variable>
<variable id="$LVT">\p{Grapheme_Cluster_Break=LVT}</variable>
</variables>
<segmentRules>
<rule id="3"> $CR × $LF </rule>
<rule id="4"> ( $Control | $CR | $LF ) ÷ </rule>
<rule id="5"> ÷ ( $Control | $CR | $LF ) </rule>
<rule id="6"> $L × ( $L | $V | $LV | $LVT ) </rule>
<rule id="7"> ( $LV | $V ) × ( $V | $T ) </rule>
<rule id="8"> ( $LVT | $T) × $T </rule>
<rule id="9"> × $Extend </rule>
</segmentRules>
</segmentation>
...</span></pre>
<p><span class="changedspan"><b>Variables: </b>All variable ids must start with a $, and otherwise
be valid identifiers according to the Unicode definitions in [<a href="#UAX31">UAX31</a>]. The
contents of a variable is a regular expression using variables and <a href="#Unicode_Sets">
UnicodeSet</a>s. The ordering of variables is important; they are evaluated in order from first to
last (see Inheritance, below). It is an error to use a variable before it is defined.</span></p>
<p><span class="changedspan"><b>Rules: </b>The contents of a rule use the syntax of [<a href="#UAX29">UAX29</a>].
The rules are evaluated in numeric id order (which may not be the order in which they appear in the
file). The first rule that matches determines the status of a boundary position, that is, whether
it breaks or not. Thus ÷ means a break is allowed; × means a break is forbidden. It is an error if
the rule does not contain exactly one of these characters (except where a rule has no contents at
all), or if the rule uses a variable that has not been defined.</span></p>
<p><span class="changedspan">There are some implicit rules: </span></p>
<ul>
<li><span class="changedspan">The implicit initial rules are always "start-of-text ÷" and "÷
end-of-text"; these are not to be included explicitly.</span></li>
<li><span class="changedspan">The implicit final rule is always "Any ÷ Any". This is not to be
included explicitly.</span></li>
</ul>
<blockquote>
<p><span class="changedspan"><b>Note: </b>A rule like X Format* -> X in [<a href="#UAX29">UAX29</a>]
and [<a href="#UAX14">UAX14</a>] is not supported. Instead, this needs to be expressed as normal
regular expressions. The normal way to support this is to modify the variables, such as in the
following example:</span></p>
<pre id="line870"><span class="changedspan"><variable id="$Format">\p{Word_Break=Format}</variable>
<variable id="$Katakana">\p{Word_Break=Katakana}</variable>
...
<!-- In place of rule 3, add format and extend to everything -->
<variable id="$X">[$Format $Extend]*</variable>
<variable id="$Katakana">($Katakana $X)</variable>
<variable id="$ALetter">($ALetter $X)</variable>
...</span></pre>
</blockquote>
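<p>The requirement that a rule contain exactly one of ÷ or × can be illustrated with a short Python
sketch. The function name <code>parse_rule</code> and the shape of its return value are hypothetical;
the sketch only splits a rule into its two sides and does not interpret the expressions.</p>

```python
BREAK = "\u00F7"     # the division sign: a break is allowed
NO_BREAK = "\u00D7"  # the multiplication sign: a break is forbidden


def parse_rule(contents):
    """Split rule contents into (before, break_allowed, after); return None for an empty rule."""
    text = contents.strip()
    if not text:
        return None  # a rule with no contents at all is permitted
    if text.count(BREAK) + text.count(NO_BREAK) != 1:
        raise ValueError("rule must contain exactly one of the two break characters")
    operator = BREAK if BREAK in text else NO_BREAK
    before, after = text.split(operator)
    return before.strip(), operator == BREAK, after.strip()
```

<p>With rule 3 of the example above, <code>parse_rule(" $CR × $LF ")</code> gives the left side
<code>$CR</code>, a break status of "forbidden", and the right side <code>$LF</code>.</p>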
<h4><span class="changedspan">5.14.1 Inheritance</span></h4>
<p><span class="changedspan">Variables and rules both inherit from the parent. </span></p>
<p><span class="changedspan"><b>Variables: </b>The child's variable list is logically appended to
the parent's, and evaluated in that order. For example:</span></p>
<p><span class="changedspan"><font color="#0000FF">// in parent</font><br>
<variable id="$AL">[:linebreak=AL:]</variable><br>
<variable id="$YY">[[:linebreak=XX:]$AL]</variable> <font color="#0000FF">// adds $AL</font></span></p>
<p><span class="changedspan"><font color="#0000FF">// in child</font><br>
<variable id="$AL">[$AL && [^a-z]]</variable> <font color="#0000FF">// changes $AL, doesn't affect
$YY</font><br>
<variable id="$ABC">[abc]</variable> <font color="#0000FF">// adds new variable</font></span></p>
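<p>The in-order evaluation of variables can be sketched in Python as plain textual substitution. This
is an illustration under stated assumptions, not a definition: the function name
<code>expand_variables</code> is hypothetical, and the pattern <code>\$\w+</code> only approximates
the [<a href="#UAX31">UAX31</a>] identifier syntax.</p>

```python
import re

VAR_REF = re.compile(r"\$\w+")  # rough approximation of "$" followed by an identifier


def expand_variables(definitions):
    """Expand ordered (id, expression) pairs: the parent's list first, then the child's."""
    values = {}
    for var_id, expression in definitions:
        def substitute(match):
            name = match.group(0)
            if name not in values:
                raise ValueError(name + " used before it is defined")
            return values[name]
        # Each reference is replaced by the value current at this point in the list,
        # so a later redefinition does not affect variables expanded earlier.
        values[var_id] = VAR_REF.sub(substitute, expression)
    return values
```

<p>Run over the parent and child lists above, the sketch leaves <code>$YY</code> built from the
parent's <code>$AL</code>, while the final <code>$AL</code> is the child's narrowed version.</p>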
<p><span class="changedspan"><b>Rules:</b> The rules are also logically appended to the parent's.
Because rules are evaluated in numeric id order, to insert a rule in between others just requires
using an intermediate number. For example, to insert a rule after id="10.1" and before id="10.2",
just use id="10.15". To delete a rule, use empty contents, such as:</span></p>
<p><span class="changedspan"><rule id="3"/><font color="#0000FF"> // deletes rule 3</font></span></p>
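<p>Rule inheritance can be sketched in Python as follows. The function name <code>merge_rules</code>
is hypothetical, and the sketch assumes that a child rule with the same id as a parent rule replaces
it. Ids are compared as decimal numbers, so "10.15" sorts between "10.1" and "10.2".</p>

```python
from decimal import Decimal


def merge_rules(parent, child):
    """Merge {id: contents} mappings and return (id, contents) pairs in numeric id order."""
    merged = {Decimal(rule_id): contents for rule_id, contents in parent.items()}
    for rule_id, contents in child.items():
        key = Decimal(rule_id)
        if contents.strip():
            merged[key] = contents   # add a new rule, or replace the parent's (assumption)
        else:
            merged.pop(key, None)    # empty contents delete the rule
    return [(str(key), merged[key]) for key in sorted(merged)]
```

<p>A child that supplies id "10.15" and an empty id "3" therefore ends up with the new rule placed
between the parent's "10.1" and "10.2", and with rule 3 removed.</p>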
<h3><span class="changedspan">5.15 <a name="Transforms">Transforms</a></span></h3>
<p><span class="changedspan">Transforms provide a set of rules for transforming text via a
specialized set of context-sensitive matching rules. They are commonly used for transliterations
or transcriptions, but also other transformations such as full-width to half-width (for <i>
katakana</i> characters). The rules can be simple one-to-one relationships
between characters, or involve more complicated mappings. Here is an example:</span></p>
<pre><span class="changedspan"><span style="background-color: #FFFF00"><transform source="Greek" target="Latin" variant="UNGEGN" direction="both">
...
<comment>Useful variables</comment>
<tRule>$gammaLike = [ΓΚΞΧγκξχϰ] ;</tRule>
<tRule>$egammaLike = [GKXCgkxc] ;</tRule>
...
<comment>Rules are predicated on running NFD first, and NFC afterwards</comment>
<tRule>::NFD (NFC) ;</tRule>
...
<tRule>λ ↔ l ;</tRule>
<tRule>Λ ↔ L ;</tRule>
...
<tRule>γ } $gammaLike ↔ n } $egammaLike ;</tRule>
<tRule>γ ↔ g ;</tRule>
...
<tRule>::NFC (NFD) ;</tRule>
...</span>
</transform></span></pre>
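<p>To give a feel for how such rules behave in the forward (Greek to Latin) direction, here is a
deliberately small Python sketch. It is not the transform engine: the names <code>RULES</code> and
<code>greek_to_latin</code> are hypothetical, only four of the rules above are modelled, and the
sketch assumes that the text after <code>}</code> is a required following context and that the first
matching rule wins. The actual rule semantics are those given in
<a href="#Transform_Rules">Transform Rules</a>.</p>

```python
import unicodedata

GAMMA_LIKE = set("ΓΚΞΧγκξχϰ")  # the $gammaLike variable from the example

# (source character, required following characters or None, replacement)
RULES = [
    ("λ", None, "l"),
    ("Λ", None, "L"),
    ("γ", GAMMA_LIKE, "n"),  # γ } $gammaLike → n
    ("γ", None, "g"),
]


def greek_to_latin(text):
    """Apply the sample rules left to right, normalizing to NFD before and NFC after."""
    text = unicodedata.normalize("NFD", text)
    out = []
    for i, ch in enumerate(text):
        following = text[i + 1] if i + 1 < len(text) else ""
        for source, lookahead, target in RULES:
            if ch == source and (lookahead is None or following in lookahead):
                out.append(target)
                break
        else:
            out.append(ch)  # no rule applies: copy the character unchanged
    return unicodedata.normalize("NFC", "".join(out))
```

<p>Because the contextual rule is listed before the plain one, a γ followed by another γ becomes "n",
while a γ in any other position becomes "g".</p>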
<p><span class="changedspan">The source and target values are valid locale identifiers, where
'und' means an unspecified language, plus some additional extensions.</span></p>
<ul>
<li><span class="changedspan">The long names of a script according to [<a href="#Scripts">Scripts</a>] can also be used instead of the und_script codes.</span></li>
<li><span class="changed">Other identifiers may be used for special
purposes. In CLDR, these include: Accents, Digit, Fullwidth, Halfwidth,
Jamo, NumericPinyin, Pinyin, Publishing, Tone. (Other than these values,
valid private use locale identifiers should be used, such as
"x-Special".)</span></li>
</ul>
<p><span class="changedspan">There is currently one variant used in C</span><span class="changed">LDR:
UNGEGN. </span><span class="changedspan">There is an additional attribute <code>private="true" </code>which is used to
indicate that the transform is used in other transforms, but should not be
listed when presented to users.</span></p>
<p><span class="changedspan">There are many different systems of transliteration. The goals for the
"unqualified" script transliterations are:</span></p>
<ol>
<li><span class="changedspan">to be lossless when going to Latin and back</span></li>
<li><span class="changedspan">to be as lossless as possible when going to other scripts</span></li>
<li><span class="changedspan">to abide by a common standard as much as possible (possibly
supplemented to meet goals 1 and 2).</span></li>
</ol>
<p><span class="changedspan">Additional transliterations may also be defined, such as customized
language-specific transliterations (such as between Russian and French), or those that match a
particular transliteration standard, such as </span></p>
<ul>
<li><span class="changedspan">UNGEGN - UNITED NATIONS GROUP OF EXPERTS ON GEOGRAPHICAL NAMES</span></li>
<li><span class="changedspan">ISO9 - ISO/IEC 9</span></li>
<li><span class="changedspan">ISO15915 - ISO/IEC 15915</span></li>
<li><span class="changedspan">ISCII91 - ISCII 91</span></li>
<li><span class="changedspan">KMOCT - South Korean Ministry of Culture & Tourism</span></li>
<li><span class="changedspan">USLC - US Library of Congress</span></li>
<li><span class="changedspan">UKPCGN - Permanent Committee on Geographical Names for British
Official Use</span></li>
<li><span class="changedspan">RUGOST - Russian Main Administration of Geodesy and Cartography</span></li>
</ul>
<p><span class="changedspan">The rules for transforms are described in </span>
<span style="background-color: #FFFF00"><span class="changedspan">Appendix N:
<a href="#Transform_Rules">Transform Rules</a>.</span></span></p>
<hr>
<h2>Appendix A: <a name="Sample_Special_Elements">Sample Special Elements</a></h2>
<p>The elements in this section are <i><b>not</b></i> part of the Locale Data Markup Language 1.0
specification. Instead, they are special elements used for application-specific data to be stored
in the Common Locale Repository. <span>They may change or be removed in future versions of this
document, and are presented here more as examples of how to extend the format.</span> (Some of these
items may move into a future version of the Locale Data Markup Language specification.)</p>
<ul>
<li><a href="http://unicode.org/cldr/dtd/1.1/ldmlICU.dtd">
http://unicode.org/cldr/dtd/1.1/ldmlICU.dtd</a></li>
<li><a href="http://unicode.org/cldr/dtd/1.1/ldmlOpenOffice.dtd">
http://unicode.org/cldr/dtd/1.1/ldmlOpenOffice.dtd</a></li>
</ul>
<p><span class="changedspan">The above examples are old versions: consult the documentation for
the specific application to see which should be used.</span></p>
<p>These DTDs use namespaces and the special element. To include one or more, use the following
pattern to import the special DTDs that are used in the file:</p>
<pre><?xml version="<span style="color: blue">1.0</span>" encoding="<span style="color: blue">UTF-8</span>" ?>
<!DOCTYPE ldml SYSTEM "<span style="color: blue">http://unicode.org/cldr/dtd/1.1/ldml.dtd</span>" [
<!ENTITY % <span style="color: blue">icu</span> SYSTEM "<span style="color: blue">http://unicode.org/cldr/dtd/1.1/ldmlICU.dtd</span>">
<!ENTITY % <span style="color: blue">openOffice</span> SYSTEM "<span style="color: blue">http://unicode.org/cldr/dtd/1.1/ldmlOpenOffice.dtd</span>">
<span style="color: blue">%icu;
%openOffice;
</span>]></pre>
<p>Thus to include just the ICU DTD, one uses:</p>
<pre><?xml version="<span style="color: blue">1.0</span>" encoding="<span style="color: blue">UTF-8</span>" ?>
<!DOCTYPE ldml SYSTEM "<span style="color: blue">http://unicode.org/cldr/dtd/1.1/ldml.dtd</span>" [
<!ENTITY % icu SYSTEM "<span style="color: blue">http://unicode.org/cldr/dtd/1.1/ldmlICU.dtd</span>">
<span style="color: blue">%icu;
</span>]></pre>
<blockquote>
<p><span><b>Note: </b>A previous version of this document contained a special element for
<a href="http://anubis.dkuug.dk/jtc1/sc22/wg20/docs/n897-14652w25.pdf">ISO TR 14652</a>
compatibility data. That element has been withdrawn, pending further investigation,
since</span> 14652 is a Type 1 TR: "when the required support cannot be obtained for the
publication of an International Standard, despite repeated effort". See the ballot comments on
<a href="http://anubis.dkuug.dk/jtc1/sc22/wg20/docs/n948-J1N6769-14652.pdf">14652 Comments</a>
for details on the 14652 defects. For example, most of these patterns make little provision for
substantial changes in format when elements are empty, so are not particularly useful in
practice. Compare, for example, the mail-merge capabilities of production software such as
Microsoft Word or OpenOffice.</p>
<p><span><b>Note: </b>While the CLDR specification guarantees backwards compatibility, the
definition of specials is up to other organizations. Any assurance of backwards compatibility is
up to those organizations.</span></p>
</blockquote>
<h3>A.1 <a name="ICU">ICU</a></h3>
<p>There is one main area where ICU has capabilities that go beyond
what is shown above.</p>
<h4>A.1.1 <a name="<ruleBasedNumberFormat>"><icu:ruleBasedNumberFormat></a></h4>
<p>The rule-based number format (RBNF) encapsulates a set of rules for
mapping binary numbers to and from a readable representation. It is typically used for spelling
out numbers, but can also be used for other number systems like roman numerals, or for ordinal
numbers (1<sup>st</sup>, 2<sup>nd</sup>, 3<sup>rd</sup>,...). The rules are fairly sophisticated;
for details see <i>Rule-Based Number Formatter</i> [<a href="#RBNF">RBNF</a>].</p>
<p class="example">Example:</p>
<pre> <special xmlns:icu="<span style="color: blue">http://ibm.com/software/globalization/icu/</span>">
<icu:ruleBasedNumberFormats>
<icu:ruleBasedNumberFormat type="<span style="color: blue">spellout</span>">
<span style="color: blue"> %%and:
and =%default=;
100: =%default=;
%%commas:
' and =%default=;
100: , =%default=;
1000: ,
</span> </icu:ruleBasedNumberFormat>
<icu:ruleBasedNumberFormat type="<span style="color: blue">ordinal</span>">
<span style="color: blue"> %main:
=#,##0==%%abbrev=;
%%abbrev:
th; st; nd; rd; th;
20: &gt;&gt;;
100: &gt;&gt;;
</span> </icu:ruleBasedNumberFormat>
<icu:ruleBasedNumberFormat type="<span style="color: blue">duration</span>">
<span style="color: blue"> %with-words:
0 seconds; 1 second; =0= seconds;
60/60:
</span> </icu:ruleBasedNumberFormat>
</icu:ruleBasedNumberFormats></pre>
<h4><span class="removedspan">A.1.2 <a name="<boundaries>"><icu:boundaries></a></span></h4>
<p><span class="removedspan">Boundaries provide rules for grapheme-cluster ("user-character"),
word, line, and sentence breaks. This format is the Java/ICU syntax, at the top level. For a
description of that, see <i>Rule-Based Break Iterator</i> [<a href="#RBBI">RBBI</a>]. The
enclosing special element is a sub-element of <ldml>.</span></p>
<pre><span class="removedspan"> <special xmlns:icu="<span style="color: blue">http://ibm.com/software/globalization/icu/</span>">
<icu:boundaries>
<span style="color:green"><!-- Boundary rules.
Selected samples are given with no attempt to make them work.
This format is the Java/ICU syntax, at the top level.
For real data, see http://oss.software.ibm.com/developerworks/opensource/cvs/icu4j
in BreakIteratorRules.java
displayName attributes removed for now
--></span>
<icu:grapheme type="RuleBased" append="<span style="color: blue">true</span>">
<span style="color:green"><!-- in addition to the normal rules, treat CH and RR as graphemes. --></span>
<span style="color: blue"> [cC][hH];[rR][rR]
</span> </icu:grapheme>
<icu:word type="<span style="color: blue">Dictionary</span>" import="<span style="color: blue">thaiDict.dat</span>" >
<span style="color:green"><!-- When doing Thai word break, check the normal word break rules first. --></span>
<span style="color: blue"> digit=[[:Nd:][:No:]];
$digit [[:Pd:]&#xAD;&#x2027;&apos;.]
</span> </icu:word>
</icu:boundaries>
</special></span></pre>
<h4><span class="removedspan">A.1.3 <a name="<transforms>"><icu:transforms></a></span></h4>
<p><span class="removedspan">There may be language-specific transformations, typically used in
locale data for transliterations. Such transformations require far more than a simple list of
matching characters, since the matches are highly context-sensitive. Each such transform is
supplied in a <transform> element. The contents of the transform element is a list of rules, as
described in the ICU documentation for [<a href="#ICUTransforms">ICUTransforms</a>]. The enclosing
special element is a sub-element of <ldml>. The type value is either a script (long or short name)
or a locale id, or a pair separated by "-".</span></p>
<p class="note"><span class="removedspan">Note: there is an on-line demonstration of transforms at
[<a href="#ICUTransforms">ICUTransforms</a>].</span></p>
<p class="example"><span class="removedspan">Example: The following is an abbreviated example for
Greek to Latin and back, in a Greek locale. The target value can be a script ID or a locale ID.</span></p>
<pre><span class="removedspan"><ldml>
...
<special xmlns:icu="<span style="color: blue">http://ibm.com/software/globalization/icu/</span>">
<icu:transforms>
<icu:transform type="<span style="color: blue">Latin</span>">
<span style="color:green"># variables
</span><span style="color: blue"> $gammaLike = [ΓΚΞΧγκξχϰ] ;
</span> <span style="color: green">...</span>
<span style="color: blue">::NFD (NFC) ;</span> <span style="color:green"># convert everything to decomposed for simplicity</span>
<span style="color: green">...</span>
<span style="color: blue">α ↔ a ; Α ↔ A ;
β ↔ v ; Β ↔ V ;
γ } $gammaLike ↔ n } $egammaLike ;</span> <span style="color:green"># contextual transform</span>
<span style="color: blue">Γ } $gammaLike ↔ N } $egammaLike ;</span> <span style="color:green"># contextual transform</span>
<span style="color: blue">γ ↔ g ; Γ ↔ G ;
δ ↔ d ; Δ ↔ D ;
ε ↔ e ; Ε ↔ E ;
ζ ↔ z ; Ζ ↔ Z ;
Θ } $beforeLower ↔ Th ;</span> <span style="color:green"># contextual transform</span>
<span style="color: blue">θ ↔ th ; Θ ↔ TH ;
ι ↔ i ; Ι ↔ I ;
κ ↔ k ; Κ ↔ K ;
λ ↔ l ; Λ ↔ L ;
μ ↔ m ; Μ ↔ M ;
ν } $gammaLike → n\' ;</span> <span style="color:green"># contextual transform</span>
<span style="color: blue">Ν } $gammaLike ↔ N\' ;</span> <span style="color:green"># contextual transform</span>
<span style="color: blue">ν ↔ n ; Ν ↔ N ;</span>
<span style="color: green">...</span>
<span style="color: blue">::NFC (NFD) ;</span> <span style="color:green"># convert back to composed</span>
</icu:transform>
</icu:transforms>
</special></span></pre>
<h3>A.2 <a name="OpenOffice">openoffice.org</a></h3>
<p>A number of the elements above can have extra information for openoffice.org, such as the
following example:</p>
<pre> <special xmlns:openOffice="<span style="color: blue">http://www.openoffice.org</span>">
<openOffice:search>
<openOffice:searchOptions>
<openOffice:transliterationModules><span style="color: blue">IGNORE_CASE</span></openOffice:transliterationModules>
</openOffice:searchOptions>
</openOffice:search>
</special>
</pre>
<h2>Appendix B: <a name="Transmitting_Locale_Information">Transmitting Locale Information</a></h2>
<p>In a world of on-demand software components, with arbitrary connections between those
components, it is important to get a sense of where localization should be done, and how to
transmit enough information so that it can be done at that appropriate place. End-users need to
get messages localized to their languages, messages that not only contain a translation of text,
but also contain variables such as date, time, number formats, and currencies formatted according
to the users' conventions. The strategy for doing the so-called <i>JIT localization </i>is made up
of two parts:</p>
<ol>
<li>Store and transmit <i>neutral-format</i> data wherever possible.
<ul>
<li>Neutral-format data is data that is kept in a standard format, no matter what the local
user's environment is. Neutral-format is also (loosely) called <i>binary data</i>, even though
it actually could be represented in many different ways, including a textual representation
such as in XML. </li>
<li>Such data should use accepted standards where possible, such as for currency codes. </li>
<li>Textual data should also be in a uniform character set (Unicode/10646) to avoid possible
data corruption problems when converting between encodings.</li>
</ul>
</li>
<li>Localize that data as "<i>close</i>" to the end-user as possible.</li>
</ol>
<p>There are a number of advantages to this strategy. The longer the data is kept in a neutral
format, the more flexible the entire system is. On a practical level, if transmitted data is
neutral-format, then it is much easier to manipulate the data, debug the processing of the data,
and maintain the software connections between components.</p>
<p>Once data has been localized into a given language, it can be quite difficult to
programmatically convert that data into another format, if required. This is especially true if
the data contains a mixture of translated text and formatted variables. Once information has been
localized into, say, Romanian, it is much more difficult to localize that data into, say, French.
Parsing is more difficult than formatting, and may run up against different ambiguities in
interpreting text that has been localized, even if the original translated message text is
available (which it may not be).</p>
<p>Moreover, the closer we are to the end-user, the more we know about that user's preferred formats.
If we format dates, for example, at the user's machine, then it can easily take into account any
customizations that the user has specified. If the formatting is done elsewhere, either we have to
transmit whatever user customizations are in play, or we only transmit the user's locale code,
which may only approximate the desired format. Thus the closer the localization is to the end
user, the less we need to ship all of the user's preferences around to all the places that
localization could possibly need to be done.</p>
<p>Even though localization should be done as close to the end-user as possible, there will be
cases where different components need to be aware of whatever settings are appropriate for doing
the localization. Thus information such as a locale code or timezone needs to be communicated
between different components.</p>
<h3><a name="Message_Formatting_and_Exceptions">Message Formatting and Exceptions</a></h3>
<p>Windows (<a href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/wcesdkr/htm/_wcesdk_win32_FormatMessage.asp">FormatMessage</a>,
<a href="http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfsystemstringclassformattopic1.asp">
String.Format</a>), Java (<a href="http://java.sun.com/j2se/1.4.2/docs/api/java/text/MessageFormat.html">MessageFormat</a>)
and ICU (<a href="http://icu.sourceforge.net/apiref/icu4c/classMessageFormat.html">MessageFormat</a>,
<a href="http://icu.sourceforge.net/apiref/icu4c/umsg_8h.html">umsg</a>) all provide methods of
formatting variables (dates, times, etc) and inserting them at arbitrary positions in a string.
This avoids the manual string concatenation that causes severe problems for localization. The
question is, where to do this? It is especially important since the original code site that
originates a particular message may be far down in the bowels of a component, and passed up to the
top of the component with an exception. So we will take that case as representative of this class
of issues.</p>
<p>There are circumstances where the message can be communicated with a language-neutral code,
such as a numeric error code or mnemonic string key, that is understood outside of the component.
If there are arguments that need to accompany that message, such as a number of files or a
datetime, those need to accompany the numeric code so that when the localization is finally done at
some point, the full information can be presented to the end-user. This is the best case for
localization.</p>
<p>More often, the exact messages that could originate from within the component are not known
outside of the component itself; or at least they may not be known by the component that is
finally displaying text to the user. In such a case, the information as to the user's locale needs
to be communicated in some way to the component that is doing the localization. That locale
information does not necessarily need to be communicated deep within the component; ideally, any
exceptions should bundle up some language-neutral message ID, plus the arguments needed to format
the message (e.g. datetime), but not do the localization at the throw site. This approach has the
advantages noted above for JIT localization.</p>
<p>In addition, exceptions are often caught at a higher level; they don't end up being displayed
to any end-user at all. Avoiding the localization at the throw site avoids the cost of doing
formatting when that formatting is not really necessary. In fact, in many running programs most
of the exceptions that are thrown at a low level never end up being presented to an end-user, so
this can have considerable performance benefits.</p>
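<p>The approach described in this section can be sketched in Python. Everything named here is
hypothetical (the <code>LocalizableError</code> class, the <code>CATALOG</code> table, the message id
<code>files.missing</code>), and <code>str.format</code> merely stands in for a real locale-aware
message formatter such as those listed above.</p>

```python
class LocalizableError(Exception):
    """Carries a language-neutral message id plus the arguments needed to format it."""

    def __init__(self, message_id, **arguments):
        super().__init__(message_id)
        self.message_id = message_id
        self.arguments = arguments


# Hypothetical message catalog, keyed by locale and then by message id.
CATALOG = {
    "en": {"files.missing": "{count} files could not be found."},
    "fr": {"files.missing": "{count} fichiers sont introuvables."},
}


def copy_files(paths):
    # Deep inside a component: no localization here, only neutral data is bundled up.
    raise LocalizableError("files.missing", count=len(paths))


def display_message(error, locale):
    # Close to the end-user: look up the pattern for the user's locale and format it.
    pattern = CATALOG[locale][error.message_id]
    return pattern.format(**error.arguments)
```

<p>If the exception is caught and handled at a higher level without being shown to anyone,
<code>display_message</code> is never called and no formatting cost is incurred.</p>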
<h2>Appendix C: <a name="Supplemental_Data">Supplemental Data</a></h2>
<p>The following represents the format for supplemental information. This is information that is
important for proper formatting, but is not contained in the locale hierarchy. It is not
localizable, nor is it overridden by locale data. It uses the following format, where the data
here is solely for illustration:</p>
<pre><supplementalData>
<currencyData>
<fractions>
...
<info iso4217="CHF" digits="2" rounding="5"/>
...
<info iso4217="<span style="color: blue">ITL</span>" digits="<span style="color: blue">0</span>"/>
...
</fractions>
...
<region iso3166="IT">
<currency iso4217="EUR" from="1999-01-01"/>
<currency iso4217="ITL" from="1862-08-24" to="2002-02-28"/>
</region>
...
<region iso3166="CS">
<currency iso4217="EUR" from="2003-02-04"/>
<currency iso4217="CSD" from="2002-05-15"/>
<currency iso4217="YUM" from="1994-01-24" to="2002-05-15"/>
</region>
...
</currencyData>
</supplementalData></pre>
<p><span class="removedspan">The only data currently represented is currency data. </span>Each
currencyData element contains one fractions element followed by one or more region elements. The
fractions element contains any number of info elements, with the following attributes:</p>
<ul>
<li><b>iso4217: </b>the ISO 4217 code for the currency in question. If a particular currency
does not occur in the fractions list, then it is given the defaults listed for the next two
attributes.</li>
<li><b>digits: </b>the number of decimal digits normally formatted. The default is 2.</li>
<li><b>rounding: </b>the rounding increment, in units of 10<sup>-digits</sup>. The default is 1.
Thus with fraction digits of 2 and rounding increment of 5, numeric values are rounded to the
nearest 0.05 units in formatting. With fraction digits of 0 and rounding increment of 50,
numeric values are rounded to the nearest 50.</li>
</ul>
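<p>The interaction of <i>digits</i> and <i>rounding</i> can be sketched as follows. This is only an illustration: the function name <code>round_currency</code> is hypothetical, and the half-even rounding mode is an assumption, since the text above does not name one.</p>

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_currency(amount, digits=2, rounding=1):
    """Round amount to the nearest multiple of rounding * 10**-digits.

    The defaults mirror the ones listed above (digits=2, rounding=1).
    ROUND_HALF_EVEN is an assumption; the specification text does not
    say how ties are resolved.
    """
    unit = Decimal(1).scaleb(-digits)   # 10**-digits, e.g. 0.01 for digits=2
    step = unit * rounding              # the rounding increment, e.g. 0.05
    steps = (Decimal(str(amount)) / step).quantize(
        Decimal(1), rounding=ROUND_HALF_EVEN)
    return (steps * step).quantize(unit)
```

<p>With these assumptions, <code>round_currency("1.23", 2, 5)</code> gives 1.25 and <code>round_currency("1234", 0, 50)</code> gives 1250, matching the two cases described above.</p>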
<p>Each region element contains one attribute:</p>
<ul>
<li><b>iso3166:</b> the ISO 3166 code for the region in question. The special value <i>XXX</i>
can be used to indicate that the region has no valid currency or that the circumstances are
unknown (usually used in conjunction with <i>to</i>, as described below).</li>
</ul>
<p>Each region element can also contain any number of ordered currency subelements. The example below is followed by the attributes of the currency element.</p>
<pre><span> <region iso3166="IT"> <!-- Italy -->
<currency iso4217="EUR" from="2002-01-01"/>
<currency iso4217="ITL" to="2001-12-31"/>
</region></span></pre>
<ul>
<li><b>iso4217: </b>the ISO 4217 code for the currency in question</li>
<li><b>from: </b>the currency was valid from the datetime indicated by the value. The
datetime format is either <span class="attributeValue">1999-05-31</span>, or
<span class="attributeValue">1999-05-31T13:20:00</span> if the time is
necessary.</li>
<li><b>to: </b>the currency was valid up to the datetime indicated by the value.
The datetime format is the same as in the <span class="attribute">from</span>
attribute.</li>
</ul>
<p><span>That is, each currency element lists an interval in which it was valid. Intervals may
overlap; the <i>ordering</i> of the elements in the list tells us which was the primary currency
during any period in time. Here is an example of such an overlap:</span></p>
<pre><span><currency iso4217="CSD" to="<span class="changedspan">2002-05-15</span>"/>
<currency iso4217="YUD" from="1994-01-24" to="2002-05-15"/>
<currency iso4217="YUN" from="1994-01-01" to="1994-07-22"/></span></pre>
<p><span>If the <i>from</i> attribute is missing, the currency is assumed to have been valid as far
backwards in time as we have data for; if the <i>to</i> attribute is missing, then the currency is
valid from the <i>from</i> date onwards. The <i>from</i> attribute is also limited by the fact that
ISO 4217 does not go very far back in time, so there may be no ISO code for the previous
currency.</span></p>
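<p>A minimal sketch of how the validity intervals might be consulted, using the Italy example above. The helper names <code>parse_day</code> and <code>currencies_on</code> are hypothetical, the <i>to</i> date is treated as inclusive (an assumption), and any time-of-day part is ignored for simplicity.</p>

```python
import xml.etree.ElementTree as ET
from datetime import date

# Region data copied from the Italy example above.
REGION_XML = """
<region iso3166="IT">
  <currency iso4217="EUR" from="2002-01-01"/>
  <currency iso4217="ITL" to="2001-12-31"/>
</region>
"""

def parse_day(value):
    # Only the date part is used; a trailing "T13:20:00" is dropped here.
    year, month, day = value.split("T")[0].split("-")
    return date(int(year), int(month), int(day))

def currencies_on(region, when):
    """Return the ISO 4217 codes valid on `when`, in document order.

    A missing `from` is treated as "as far back as possible" and a missing
    `to` as "still valid". Treating `to` as inclusive is an assumption.
    """
    valid = []
    for cur in region.findall("currency"):
        start = parse_day(cur.get("from")) if cur.get("from") else date.min
        end = parse_day(cur.get("to")) if cur.get("to") else date.max
        if start <= when <= end:
            valid.append(cur.get("iso4217"))
    return valid

region = ET.fromstring(REGION_XML)
```

<p>Because the result preserves document order, the first code returned would be the primary currency for that date when intervals overlap.</p>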
<p><span><languageData></span></p>
<p><span>The following is used for consistency checking and testing. The coverage will improve
over time. At this point, the territories and scripts are limited to those that are official
languages of the region as a whole, or are major commercial languages.</span></p>
<pre><span> <languageData>
<language type="af" scripts="Latn" territories="ZA"/>
<language type="am" scripts="Ethi" territories="ET"/>
<language type="ar" scripts="Arab" territories="AE BH DZ EG IN IQ JO KW LB
LY MA OM PS QA SA SD SY TN YE"/></span>
<span> ...</span></pre>
<p><span class="changedspan">This element can also be used to indicate secondary languages and/or scripts used in a territory.</span></p>
<pre>
<span class="changedspan"><language type="fr" scripts="Latn" territories="IT US" alt="secondary" /></span>
<span class="changedspan">...</span></pre>
<p><span><timezoneData></span></p>
<p><span>The following is data that can be used to get a single timezone id from a set of modern
equivalents.</span></p>
<pre><span><timezoneData>
<size ordering="America/New_York America/Detroit America/Louisville
America/Kentucky/Monticello">
...</span></pre>
<p><span class="changedspan">The following subelement of <span><timezoneData> </span>supplies
information used by Appendix J: <a href="#Time_Zone_Fallback"><span>Time Zone Display Names</span></a>.</span></p>
<pre><span class="changedspan"><zoneFormatting multizone="001 AQ AR AU BR CA CD CL CN EC ES FM GB GL ID KI KZ MH ML MN MX MY NZ PF PT RU SJ UA UM US UZ">
<zoneItem type="Africa/Abidjan" territory="CI"/>
<zoneItem type="Africa/Accra" territory="GH"/>
<zoneItem type="America/Adak" territory="US" aliases="America/Atka US/Aleutian"/>
<zoneItem type="Africa/Addis_Ababa" territory="ET"/>
<zoneItem type="Australia/Adelaide" territory="AU" aliases="Australia/South"/></span></pre>
<p><span class="changedspan">The multizone attribute lists the territories that have multiple
TZIDs, which is used in step #5 of Appendix J: <a href="#Time_Zone_Fallback"><span>Time Zone
Display Names</span></a>. The zoneItem type is the canonical ID for CLDR. The aliases map to that
canonical ID; this is used in step #1 in Appendix J: <a href="#Time_Zone_Fallback"><span>Time Zone
Display Names</span></a>. The territory is also used in step #5.</span></p>
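<p>As an illustration only, the zoneItem data shown above could be loaded into lookup tables like this. The names <code>canonical_id</code> and <code>has_multiple_zones</code> are hypothetical, and returning unknown IDs unchanged is an assumption rather than something this section specifies.</p>

```python
# (canonical ID, territory, aliases) copied from the zoneItem examples above.
ZONE_ITEMS = [
    ("Africa/Abidjan", "CI", []),
    ("Africa/Accra", "GH", []),
    ("America/Adak", "US", ["America/Atka", "US/Aleutian"]),
    ("Africa/Addis_Ababa", "ET", []),
    ("Australia/Adelaide", "AU", ["Australia/South"]),
]

# Value of the multizone attribute in the example above.
MULTIZONE = set(
    "001 AQ AR AU BR CA CD CL CN EC ES FM GB GL ID KI KZ MH ML MN MX MY "
    "NZ PF PT RU SJ UA UM US UZ".split()
)

CANONICAL = {}   # any known ID (canonical or alias) -> canonical ID
TERRITORY = {}   # canonical ID -> territory code
for zone, territory, aliases in ZONE_ITEMS:
    TERRITORY[zone] = territory
    for name in [zone] + aliases:
        CANONICAL[name] = zone

def canonical_id(tzid):
    # Unknown IDs are returned unchanged (an assumption).
    return CANONICAL.get(tzid, tzid)

def has_multiple_zones(tzid):
    # True when the zone's territory is listed in the multizone attribute.
    return TERRITORY.get(canonical_id(tzid)) in MULTIZONE
```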
<h4>T<span>erritory Containment</span></h4>
<p><span>The following data provides information that allows GUIs to break up a very long list of
country names into a progressive list. The data is based on the information found at [<a href="#UNM49">UNM49</a>].
There is one special code, QO, which is used for outlying areas that are typically uninhabited.</span></p>
<pre><span><territoryContainment></span></pre>
<blockquote>
<blockquote>
<pre><span><group type="001" contains="002 009 019 142 150"/> <!--World -->
<group type="011" contains="BF BJ CI CV GH GM GN GW LR ML MR NE NG SH SL SN TG"/> <!--Western Africa -->
<group type="013" contains="BZ CR GT HN MX NI PA SV"/> <!--Central America -->
<group type="014" contains="BI DJ ER ET KE KM MG MU MW MZ RE RW SC SO TZ UG YT ZM ZW"/> <!--Eastern Africa -->
<group type="142" contains="030 035 062 145"/> <!--Asia -->
<group type="145" contains="AE AM AZ BH CY GE IL IQ JO KW LB OM PS QA SA SY TR YE"/> <!--Western Asia -->
<group type="015" contains="DZ EG EH LY MA SD TN"/> <!--Northern Africa -->
...</span></pre>
</blockquote>
</blockquote>
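<p>A progressive list can be built by expanding groups on demand. The sketch below uses a subset of the sample data above; the function name <code>expand</code> is hypothetical, and codes that have no group entry in this sample (such as 030) are simply treated as leaves.</p>

```python
# A subset of the <group> elements shown above.
CONTAINS = {
    "001": "002 009 019 142 150".split(),
    "142": "030 035 062 145".split(),
    "145": "AE AM AZ BH CY GE IL IQ JO KW LB OM PS QA SA SY TR YE".split(),
    "015": "DZ EG EH LY MA SD TN".split(),
}

def expand(code, contains=CONTAINS):
    """Depth-first expansion of a group into the codes that have no
    entry of their own in `contains`."""
    children = contains.get(code)
    if children is None:
        return [code]
    result = []
    for child in children:
        result.extend(expand(child, contains))
    return result
```

<p>A GUI would more likely show one level at a time (<code>CONTAINS["142"]</code>, then <code>CONTAINS["145"]</code>) rather than flattening the whole tree; the recursive form is shown only because it is compact.</p>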
<p><span><mapTimezones></span></p>
<p><span>The following data can be used to provide mappings between </span><i>
<span class="changedspan">TZ</span></i><span> IDs and other platforms. The purpose is to assist
with migration and vetting.</span></p>
<pre><span><mapTimezones type="windows">
<mapZone other="Dateline" type="Etc/GMT+12">
<mapZone other="Samoa" type="Pacific/Midway">
<mapZone other="Hawaiian" type="Pacific/Honolulu"></span></pre>
<pre><span>...</span></pre>
<p><span><alias></span></p>
<p><span>This element provides information as to parts of locale IDs that should be substituted
when accessing CLDR data. This logical substitution should be done to both the locale id, and to
any lookup for display names of languages, territories, etc. As with the display names, the
language type and replacement may be any prefix of a valid locale id, such as "<span class="changedspan">no_NO</span>".</span></p>
<pre><span><alias>
<language type="in" replacement="id">
<language type="sh" replacement="sr">
<language type="sh_YU" replacement="sr_Latn_YU">
...
<territory type="BU" replacement="MM">
...
</alias></span></pre>
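<p>The substitution could be applied along the following lines. This is a sketch under stated assumptions: the function name <code>apply_aliases</code> is hypothetical, the longest matching prefix is tried first so that <i>sh_YU</i> wins over <i>sh</i>, and territory aliases are applied to any later field that matches exactly.</p>

```python
# Alias data copied from the example above.
LANGUAGE_ALIASES = {"in": "id", "sh": "sr", "sh_YU": "sr_Latn_YU"}
TERRITORY_ALIASES = {"BU": "MM"}

def apply_aliases(locale_id):
    parts = locale_id.replace("-", "_").split("_")
    # Try the longest prefix first (an assumption about match order).
    for n in range(len(parts), 0, -1):
        prefix = "_".join(parts[:n])
        if prefix in LANGUAGE_ALIASES:
            parts = LANGUAGE_ALIASES[prefix].split("_") + parts[n:]
            break
    # Replace deprecated territory codes in any field after the language.
    parts = [p if i == 0 else TERRITORY_ALIASES.get(p, p)
             for i, p in enumerate(parts)]
    return "_".join(parts)
```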
<pre><span class="dtd"><!ELEMENT deprecated ( deprecatedItems* ) >
<!ATTLIST deprecated draft ( true | false ) #IMPLIED >
<!ELEMENT deprecatedItems EMPTY >
<!ATTLIST deprecatedItems draft ( true | false ) #IMPLIED >
<!ATTLIST deprecatedItems type ( standard | supplemental ) #IMPLIED >
<!ATTLIST deprecatedItems elements NMTOKENS #IMPLIED >
<!ATTLIST deprecatedItems attributes NMTOKENS #IMPLIED >
<!ATTLIST deprecatedItems values CDATA #IMPLIED ></span></pre>
<p><span>The deprecated items can be used to indicate elements, attributes, and attribute values
<font face="Lucida Sans Unicode">that are deprecated. This means that the items are valid, but
that their usage is strongly discouraged. When the same deprecatedItems element contains
combinations of elements, attributes, and values, then the "least significant" items are only
deprecated if they occur with the "more significant" items. For example:</font></span></p>
<table border="1" cellpadding="0" cellspacing="1">
<caption><span>Deprecated Items</span></caption>
<tr>
<td width="50%"><span><span><font face="Lucida Sans Unicode"><code><deprecatedItems
elements="A B"></code></font></span></span></td>
<td width="50%"><span><span><font face="Lucida Sans Unicode">A and B are deprecated</font></span></span></td>
</tr>
<tr>
<td width="50%"><span><span><font face="Lucida Sans Unicode"><code><deprecatedItems
attributes="C D"></code></font></span></span></td>
<td width="50%"><span><span><font face="Lucida Sans Unicode">C and D are deprecated on all
elements</font></span></span></td>
</tr>
<tr>
<td width="50%"><span><span><font face="Lucida Sans Unicode"><code><deprecatedItems
elements="A B" attributes="C D"></code></font></span></span></td>
<td width="50%"><span><span><font face="Lucida Sans Unicode">C and D are deprecated, but only
if they occur on elements A or B.</font></span></span></td>
</tr>
<tr>
<td width="50%"><span><span><font face="Lucida Sans Unicode"><code><deprecatedItems
elements="A B" attributes="C D" values="E"></code></font></span></span></td>
<td width="50%"><span><span><font face="Lucida Sans Unicode">E is deprecated, but only if it
is a value of C or D in an element A or B</font></span></span></td>
</tr>
</table>
<p><span>In each case, multiple items are space-delimited.</span></p>
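<p>One possible reading of the table above, expressed as a check. The function name <code>matches</code> and the dictionary representation of a deprecatedItems element are hypothetical; the logic assumes that every list present on the element must match for the usage to count as deprecated.</p>

```python
def matches(item, element, attribute=None, value=None):
    """Return True if the usage (element, attribute, value) is deprecated
    by `item`, a dict with optional space-delimited 'elements',
    'attributes', and 'values' strings."""
    elements = item.get("elements", "").split()
    attributes = item.get("attributes", "").split()
    values = item.get("values", "").split()
    if elements and element not in elements:
        return False
    if attributes and attribute not in attributes:
        return False
    if values and value not in values:
        return False
    # An item with no lists at all deprecates nothing.
    return bool(elements or attributes or values)
```

<p>For example, with <code>{"elements": "A B", "attributes": "C D"}</code> the usage of attribute C on element A matches, while element A used without C or D does not.</p>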
<h5><span><characters></span></h5>
<p><span>The characters element provides a way for non-Unicode systems, or systems that only
support a subset of Unicode characters, to transform CLDR data. It gives a list of characters with
alternative values that can be used if the main value is not available. For example:</span></p>
<pre><span><characters>
<<span class="changedspan">character-</span>fallback>
<character value = "ß">
<<span class="changedspan">substitute</span>>ss</<span class="changedspan">substitute</span>>
</character>
<character value = "Ø">
<<span class="changedspan">substitute</span>>Ö</<span class="changedspan">substitute</span>>
<<span class="changedspan">substitute</span>>O</<span class="changedspan">substitute</span>>
</character>
<character value = "<span style="font-size:150%">₧</span>">
<<span class="changedspan">substitute</span>>Pts</<span class="changedspan">substitute</span>>
</character>
<character value = "<span style="font-size:150%">₣</span>">
<<span class="changedspan">substitute</span>>Fr.</<span class="changedspan">substitute</span>>
</character>
</<span class="changedspan">character-</span>fallback>
</characters></span></pre>
<p><span>The ordering of the <span class="changedspan">substitute</span> elements indicates the preference among them.</span></p>
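<p>A sketch of how a system with a limited character repertoire might use this data. The function name <code>to_supported</code> is hypothetical, Python codec names stand in for the target repertoire, and the "?" placeholder for characters without a usable substitute is an arbitrary choice.</p>

```python
# Fallback data copied from the example above, in preference order.
FALLBACKS = {"ß": ["ss"], "Ø": ["Ö", "O"], "₧": ["Pts"], "₣": ["Fr."]}

def to_supported(text, encoding):
    """Replace characters the target encoding cannot represent with the
    first substitute that it can represent."""
    out = []
    for ch in text:
        for candidate in [ch] + FALLBACKS.get(ch, []):
            try:
                candidate.encode(encoding)
            except UnicodeEncodeError:
                continue
            out.append(candidate)
            break
        else:
            out.append("?")  # no representable substitute was found
    return "".join(out)
```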
<h5><span class="changedspan">Calendar Data</span></h5>
<pre><span class="changed"><calendarData>
<!-- gregorian is assumed, so these are all in addition -->
<calendar type="japanese" territories="JP"/>
<calendar type="islamic-civil" territories="AE BH DJ DZ EG EH ER IL IQ JO KM KW
LB LY MA MR OM PS QA SA SD SY TD TN YE AF IR"/>
...</span></pre>
<p><span class="changedspan">The common values provide a list of the calendars that are in common
use, and thus should be shown in UIs that provide choice of calendars. (An 'Other...' button could
give access to the other available calendars.)</span></p>
<pre><span class="changed"><weekData>
<minDays count="1" territories="001"/>
<minDays count="4" territories="AT BE CA CH DE DK FI FR IT LI LT LU MC MT NL NO SE SK"/>
<minDays count="4" territories="CD" draft="true"/>
<firstDay day="mon" territories="001"/>
...</span></pre>
<p class="note"><span class="changedspan">These values provide information
on how a calendar is used in a particular territory. It may also be used in computing week
boundaries for other purposes. The default is provided by the element with
territories="001". </span></p>
<p class="note"><span class="changedspan">The minDays element indicates the minimum
number of days a week must contain to count as the first week (of a month or year). The firstDay element gives the first day of the week, which is typically used for calendar
presentation. </span></p>
<p><span class="changedspan"><span>What is meant by the weekend varies from country to country. It
is typically when most non-retail businesses are closed. The time should not be specified unless
it is a well-recognized part of the day.</span></span></p>
<p class="note"><span class="changedspan">The weekendStart day defaults to "sat", and weekendEnd
day defaults to "sun".</span></p>
<p class="note"><span class="changedspan">The weekendStart time defaults to "00:00:00" (midnight
at the start of the day). The weekendEnd time defaults to "24:00:00" (midnight at the end of the
day). <span>(That is, Friday at 24:00:00 is the same time as Saturday at 00:00:00.) Thus the
following are equivalent:</span></span></p>
<table style="margin-top: 0.5li; margin-bottom: 0.5li">
<tr>
<td><span class="changedspan"><span><weekendStart day="<span style="color: blue">sat</span>"/><br>
<weekendEnd day="<span style="color: blue">sun</span>"/></span></span></td>
</tr>
<tr>
<td><span class="changedspan"><span><weekendStart day="<span style="color: blue">sat</span>"
time="<span style="color: blue">00:00</span>"/><br>
<weekendEnd day="<span style="color: blue">sun</span>" time="<span style="color: blue">24:00</span>"/></span></span></td>
</tr>
<tr>
<td><span class="changedspan"><span><weekendStart day="<span style="color: blue">fri</span>"
time="<span style="color: blue">24:00</span>"/><br>
<weekendEnd day="<span style="color: blue">mon</span>" time="<span style="color: blue">00:00</span>"/></span></span></td>
</tr>
</table>
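<p>The equivalence of the three forms above can be checked by converting each day and time to minutes from the start of the week. The names <code>week_minute</code> and <code>weekend</code> are hypothetical, and starting the week at Sunday is an arbitrary choice that does not affect the comparison.</p>

```python
DAYS = ["sun", "mon", "tue", "wed", "thu", "fri", "sat"]
MINUTES_PER_WEEK = 7 * 24 * 60

def week_minute(day, time):
    # Accepts "HH:MM" or "HH:MM:SS"; seconds are ignored in this sketch.
    hours, minutes = (int(x) for x in time.split(":")[:2])
    total = DAYS.index(day) * 24 * 60 + hours * 60 + minutes
    return total % MINUTES_PER_WEEK   # "sat 24:00" wraps to "sun 00:00"

def weekend(start_day="sat", start_time="00:00",
            end_day="sun", end_time="24:00"):
    """Return (start, end) in minutes, using the defaults described above."""
    return week_minute(start_day, start_time), week_minute(end_day, end_time)
```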
<p><span class="changedspan">The week information was formerly in the main LDML file.</span></p>
<h3><span class="changedspan">Measurement System</span></h3>
<pre><span class="changed"><measurementData>
<measurementSystem type="metric" territories="001"/>
<measurementSystem type="US" territories="US"/>
<paperSize type="A4" territories="001"/>
<paperSize type="US-Letter" territories="US"/>
</measurementData></span></pre>
<p><span class="changedspan">The measurement system is the normal measurement system in common
everyday use (except for date/time). The values are "metric" (= ISO 1000), "US", or "UK"; others
may be added over time. <span>The "US" value indicates the customary system of measurement with
feet, inches, pints, quarts, etc. as used in the United States. The "UK" value indicates the
customary system of measurement with feet, inches, pints, quarts, etc. as used in the United
Kingdom. It is also called the Imperial system: the pint, quart, etc. are different sizes than in
"US".</span></span></p>
<p><span class="changedspan">The paperSize element gives <span>the height and width of paper
used for </span>normal business letters. The values are <span>A4 and US-Letter.</span></span></p>
<p><span class="changedspan">The measurement information was formerly in the main LDML file,
and had a somewhat different format.<br>
</span></p>
<h2>Appendix D: <a name="Language_and_Locale_IDs">Language and Locale IDs</a></h2>
<p>People have very slippery notions of what distinguishes a language code vs. a locale code. The
problem is that both are somewhat nebulous concepts.</p>
<p>In practice, many people use [<a href="#RFC3066">RFC3066</a>] codes to mean locale codes
instead of strictly language codes. It is easy to see why this came about; because [<a href="#RFC3066">RFC3066</a>]
includes an explicit region (territory) code, for most people it was sufficient for use as a
locale code as well. For example, when typical web software receives an [<a href="#RFC3066">RFC3066</a>]
code, it will use it as a locale code. Other typical software will do the same: in practice,
language codes and locale codes are treated interchangeably. Some people recommend distinguishing
on the basis of "-" vs "_" (e.g. <i>zh-TW</i> for language code, <i>zh_TW</i> for locale code),
but in practice that does not work because of the free variation out in the world in the use of
these separators. Notice that Windows, for example, uses "-" as a separator in its locale codes.
So pragmatically one is forced to treat "-" and "_" as equivalent when interpreting either one on
input.</p>
<p>Another reason for the conflation of these codes is that <i>very</i> little data in most
systems is distinguished by region alone; currency codes and measurement systems being some of the
few. Sometimes date or number formats are mentioned as regional, but that really doesn't make much
sense. If people see the sentence "You will have to adjust the value to १,२३४.५६७ from ૭૧,૨૩૪.૫૬"
(using Indic digits), they would say that sentence is simply not English. Number format is far
more closely associated with language than it is with region. The same is true for date formats:
people would never expect to see intermixed a date in the format "2003年4月1日" (using Kanji) in text
purporting to be purely English. There are regional differences in date and number format —
differences which can be important — but those are different in kind than other language
differences between regions.</p>
<p>As far as we are concerned — <i>as a completely practical matter</i> — two languages are
different if they require substantially different localized resources. Distinctions according to
spoken form are important in some contexts, but the written form is by far and away the most
important issue for data interchange. Unfortunately, this is not the principle used in [<a href="#ISO639">ISO639</a>],
which has the fairly unproductive notion (for data interchange) that only spoken language matters
(it is also not completely consistent about this, however).</p>
<p>[<a href="#RFC3066">RFC3066</a>] <i><b>can</b></i> express a difference if the use of written
languages happens to correspond to region boundaries expressed as [<a href="#ISO3166">ISO3166</a>]
region codes, and has recently added codes that allow it to express some important cases that are
not distinguished by [<a href="#ISO3166">ISO3166</a>] codes. These written
languages include simplified and
traditional Chinese (both used in Hong Kong S.A.R.); Serbian in Latin script; <span class="changedspan">Azerbaijani</span>
in Arabic script<span class="changed">, and so on.</span></p>
<p>Notice also that <i>currency codes</i> are different than <i>currency localizations</i>. The
currency localizations should <span class="changedspan">largely</span> be in the language-based resource bundles, not in the
territory-based resource bundles. Thus, the resource bundle <i>en</i> contains the localized
mappings in English for a range of different currency codes: USD → <span class="changedspan">US$</span>, RUR → Rub<span class="changedspan">, AUD → $A</span> etc. <span class="changedspan">Of course, some currency symbols are used for more than one currency, and in such cases specializations appear in the territory-based bundles. Continuing the example, <i>en_US</i> would have USD → $, while <i>en_AU</i> would have AUD → $.</span> (In
protocols, the currency codes should always accompany any currency amounts; otherwise the data is
ambiguous, and software is forced to use the user's territory to guess at the currency. For some
informal discussion of this, see
<a href="http://dev.icu-project.org/cgi-bin/viewcvs.cgi/*checkout*/icuhtml/design/jit_localization.html?rev=HEAD&content-type=text/html">JIT
Localization</a>.)</p>
<h3><a name="Written_Language">Written Language</a></h3>
<p>Criteria for what makes a written language should be purely pragmatic; <i>what would
copy-editors say? </i>If one gave them text like the following, they would respond that it is far
from acceptable English for publication, and ask for it to be redone:</p>
<ol>
<li type="A">"Theatre Center News: The date of the last version of this document was 2003年3月20日.
A copy can be obtained for $50,0 or 1.234,57 грн. We would like to acknowledge contributions by
the following authors (in alphabetical order): Alaa Ghoneim, Behdad Esfahbod, Ahmed Talaat, Eric
Mader, Asmus Freytag, Avery Bishop, and Doug Felt."</li>
</ol>
<p>So one would change it to either B or C below, depending on which orthographic variant of
English was the target for the publication:</p>
<ol type="A" start="2">
<li>"Theater Center News: The date of the last version of this document was 3/20/2003. A copy
can be obtained for $50.00 or 1,234.57 Ukrainian Hryvni. We would like to acknowledge
contributions by the following authors (in alphabetical order): Alaa Ghoneim, Ahmed Talaat,
Asmus Freytag, Avery Bishop, Behdad Esfahbod, Doug Felt, Eric Mader."</li>
<li>"Theatre Centre News: The date of the last version of this document was 20/3/2003. A copy
can be obtained for $50.00 or 1,234.57 Ukrainian Hryvni. We would like to acknowledge
contributions by the following authors (in alphabetical order): Alaa Ghoneim, Ahmed Talaat,
Asmus Freytag, Avery Bishop, Behdad Esfahbod, Doug Felt, Eric Mader."</li>
</ol>
<p>Clearly there are many acceptable variations on this text. For example, copy editors might
still quibble with the use of first vs. last name sorting in the list, but clearly the first list
was <i>not</i> acceptable English alphabetical order. And in quoting a name, like "Theatre Centre
News", one may leave it in the source orthography even if it differs from the publication target
orthography. And so on. However, just as clearly, there are limits on what is acceptable English, and
"2003年3月20日", for example, is <i>not</i>.</p>
<p><span class="changedspan">Note that the language of locale data may differ from the language of localized software or web sites, when those latter are not localized into the user's preferred language. In such cases, the kind of incongruous juxtapositions described above may well appear, but this situation is usually preferable to forcing unfamiliar date or number formats on the user as well.</span></p>
<h2>Appendix E: <a name="Unicode_Sets">Unicode Sets</a></h2>
<p>A UnicodeSet is a set of Unicode characters (and possibly strings) determined by a pattern,
following <i>UTS #18: Unicode Regular Expressions</i> [<a href="#URegex">URegex</a>]. For an
example of a concrete implementation of this, see [<a href="#ICUUnicodeSet">ICUUnicodeSet</a>].</p>
<p>Patterns are a series of characters bounded by square brackets that contain lists of characters
and Unicode property sets. Lists are a sequence of characters that may have ranges indicated by a
'-' between two characters, as in "a-z". The sequence specifies the range of all characters from
the left to the right, in Unicode order. For example, <b>[a c d-f m]</b> is equivalent to <b>[a c
d e f m]</b>. Whitespace can be freely used for clarity, as <b>[a c d-f m]</b> means the same as
<b>[acd-fm]</b>.</p>
<p>Unicode property sets are specified by any Unicode property, such as [:Letter:], using the
PropertyAlias file and the PropertyValueAlias file. The syntax for specifying the property names
is an extension of either POSIX or Perl syntax with the addition of "=value". For example, you can
match letters by using the POSIX syntax <b>[:Letter:]</b>, or by using the Perl-style syntax <b>\p{Letter}</b>.
The type can be omitted for the Category and Script properties, but is required for other
properties.</p>
<p>The table below shows the two kinds of syntax: POSIX and Perl style. Also, the table shows the
"Negative", which is a property that excludes all characters of a given kind. For example, <b>
[:^Letter:]</b> matches all characters that are not <b>[:Letter:]</b>.</p>
<table>
<tr>
<th> </th>
<th>Positive </th>
<th>Negative </th>
</tr>
<tr>
<td>POSIX-style Syntax </td>
<td>[:type=value:] </td>
<td>[:^type=value:] </td>
</tr>
<tr>
<td>Perl-style Syntax </td>
<td>\p{type=value} </td>
<td>\P{type=value} </td>
</tr>
</table>
<p>The low-level lists or properties can then be freely combined with the normal set
operations (union, inverse, difference, and intersection):</p>
<ul>
<li>To union two sets, simply concatenate them. For example, <b>[[:letter:] [:number:]]</b> </li>
<li>To intersect two sets, use the '&' operator. For example, <b>[[:letter:] & [a-z]] </b></li>
<li>To take the set-difference of two sets, use the '-' operator. For example, <b>[[:letter:] -
[a-z]]</b> </li>
<li>To invert a set, place a '^' immediately after the opening '['. For example, <b>[^a-z]</b>.
In any other location, the '^' does not have a special meaning.</li>
</ul>
<p>The binary operators '&' and '-' have equal precedence and bind left-to-right. Thus <b>
[[:letter:]-[a-z]-[\u0100-\u01FF]]</b> is equivalent to <b>[[[:letter:]-[a-z]]-[\u0100-\u01FF]]</b>.
Another example is the set <b>[[ace][bdf] - [abc][def]]</b>, which is not the empty set, but
instead the set <b>[def]</b>.</p>
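<p>The left-to-right evaluation can be mimicked with ordinary Python sets. This is not a UnicodeSet parser: the function name <code>fold</code> is hypothetical, the operands are given as plain strings of characters, and union by juxtaposition is written as an explicit "|" step that is folded in the same left-to-right order, which is what the <b>[def]</b> result above implies.</p>

```python
def fold(first, *steps):
    """Apply (operator, operand) steps strictly left to right.

    Operators: '|' for union (juxtaposition in a pattern),
    '&' for intersection, '-' for set difference.
    """
    result = set(first)
    for op, operand in steps:
        if op == "|":
            result |= set(operand)
        elif op == "&":
            result &= set(operand)
        elif op == "-":
            result -= set(operand)
        else:
            raise ValueError("unknown operator: " + op)
    return result

# [[ace][bdf] - [abc][def]] evaluated step by step.
example = fold("ace", ("|", "bdf"), ("-", "abc"), ("|", "def"))
```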
<p>Another caveat with the '&' and '-' operators is that they operate between sets. That is, they
must be immediately preceded and immediately followed by a set. For example, the pattern <b>
[[:Lu:]-A]</b> is illegal, since it is interpreted as the set <b>[:Lu:]</b> followed by the
incomplete range <b>-A</b>. To specify the set of uppercase letters except for 'A', enclose the
'A' in a set: <b>[[:Lu:]-[A]]</b>. A multicharacter string can be in a Unicode set, to represent a
tailored grapheme cluster for a particular language. The syntax uses curly braces for that case.</p>
<table>
<tr>
<td>[a] </td>
<td>The set containing 'a' </td>
</tr>
<tr>
<td>[a-z] </td>
<td>The set containing 'a' through 'z' and all letters in between, in Unicode order </td>
</tr>
<tr>
<td>[^a-z] </td>
<td>The set containing all characters but 'a' through 'z', that is, U+0000 through 'a'-1 and
'z'+1 through U+10FFFF </td>
</tr>
<tr>
<td>[[pat1][pat2]] </td>
<td>The union of sets specified by pat1 and pat2 </td>
</tr>
<tr>
<td>[[pat1]&[pat2]] </td>
<td>The intersection of sets specified by pat1 and pat2 </td>
</tr>
<tr>
<td>[[pat1]-[pat2]] </td>
<td>The asymmetric difference of sets specified by pat1 and pat2 </td>
</tr>
<tr>
<td>[a{ab}{ac}]</td>
<td>The character 'a' and the multicharacter strings "ab" and "ac"</td>
</tr>
<tr>
<td>[:Lu:] </td>
<td>The set of characters belonging to the given Unicode category, as defined by
Character.getType(); in this case, Unicode uppercase letters. The long form for this is <b>[:UppercaseLetter:]</b>. </td>
</tr>
<tr>
<td>[:L:] </td>
<td>The set of characters belonging to all Unicode categories starting with 'L', that is, <b>
[[:Lu:][:Ll:][:Lt:][:Lm:][:Lo:]]</b>. The long form for this is <b>[:Letter:]</b>. </td>
</tr>
</table>
<p>In Unicode Sets, there are two ways to quote syntax characters and whitespace:</p>
<h5>Single Quote</h5>
<p>Two single quotes represent a single quote, either inside or outside single quotes. Text
within single quotes is not interpreted in any way (except for two adjacent single quotes). It is
taken as literal text (special characters become non-special).</p>
<h5>Backslash Escapes</h5>
<p>Outside of single quotes, certain backslashed characters have special meaning:</p>
<table>
<tr>
<td>\uhhhh </td>
<td>Exactly 4 hex digits; h in [0-9A-Fa-f] </td>
</tr>
<tr>
<td>\Uhhhhhhhh </td>
<td>Exactly 8 hex digits </td>
</tr>
<tr>
<td>\xhh </td>
<td>1-2 hex digits </td>
</tr>
<tr>
<td>\ooo </td>
<td>1-3 octal digits; o in [0-7] </td>
</tr>
<tr>
<td>\a </td>
<td>U+0007 (BELL) </td>
</tr>
<tr>
<td>\b </td>
<td>U+0008 (BACKSPACE) </td>
</tr>
<tr>
<td>\t </td>
<td>U+0009 (HORIZONTAL TAB) </td>
</tr>
<tr>
<td>\n </td>
<td>U+000A (LINE FEED) </td>
</tr>
<tr>
<td>\v </td>
<td>U+000B (VERTICAL TAB) </td>
</tr>
<tr>
<td>\f </td>
<td>U+000C (FORM FEED) </td>
</tr>
<tr>
<td>\r </td>
<td>U+000D (CARRIAGE RETURN) </td>
</tr>
<tr>
<td>\\ </td>
<td>U+005C (BACKSLASH) </td>
</tr>
<tr>
<td>\N{name}</td>
<td>The Unicode character named "name".</td>
</tr>
</table>
<p>Anything else following a backslash is mapped to itself, except in an environment where it is
defined to have some special meaning. For example, <b>\p{uppercase}</b> is the set of uppercase
letters in Unicode.</p>
<p>Any character formed as the result of a backslash escape loses any special meaning and is
treated as a literal. In particular, note that \u and \U escapes create literal characters. (In
contrast, Java treats Unicode escapes as just a way to represent arbitrary characters in an ASCII
source file, and any resulting characters are <i><b>not</b></i> tagged as literals.)</p>
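<p>The escape table above can be approximated with a small decoder. This is a sketch only: the name <code>unescape</code> is hypothetical, single-quote handling is left out, and environment-specific escapes such as \p{...} are not recognized (the backslash is simply dropped and the following character kept).</p>

```python
import re
import unicodedata

# Single-character escapes from the table above.
SIMPLE = {"a": "\a", "b": "\b", "t": "\t", "n": "\n",
          "v": "\v", "f": "\f", "r": "\r", "\\": "\\"}

ESCAPE = re.compile(
    r"\\(?:u([0-9A-Fa-f]{4})"      # \uhhhh: exactly 4 hex digits
    r"|U([0-9A-Fa-f]{8})"          # \Uhhhhhhhh: exactly 8 hex digits
    r"|x([0-9A-Fa-f]{1,2})"        # \xhh: 1-2 hex digits
    r"|([0-7]{1,3})"               # \ooo: 1-3 octal digits
    r"|N\{([^}]+)\}"               # \N{name}
    r"|(.))",                      # anything else maps to itself
    re.DOTALL,
)

def unescape(text):
    def replace(match):
        u4, u8, x, octal, name, other = match.groups()
        if u4 or u8 or x:
            return chr(int(u4 or u8 or x, 16))
        if octal:
            return chr(int(octal, 8))
        if name:
            return unicodedata.lookup(name)
        return SIMPLE.get(other, other)
    return ESCAPE.sub(replace, text)
```

<p>A real implementation would also need to record that each decoded character is a literal, as described above, so that a decoded '-' or '[' is not reinterpreted as syntax.</p>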
<h2><span>Appendix F: <a name="Date_Format_Patterns">Date Format Patterns</a></span></h2>
<p><span>A date pattern is a string of characters, where specific strings of characters are
replaced with date and time data from a calendar when formatting or used to generate data for a
calendar when parsing. The Date Field Symbol Table below lists the characters used in patterns to
show the appropriate formats for a given locale. The following are examples:</span></p>
<table border="1" cellpadding="0" cellspacing="0" style="border-collapse: collapse">
<tr>
<th width="50%"><span>Pattern</span></th>
<th width="50%"><span>Result (in a particular locale)</span></th>
</tr>
<tr>
<td width="50%"><span>yyyy.MM.dd G 'at' HH:mm:ss zzz</span></td>
<td width="50%"><span>1996.07.10 AD at 15:08:56 PDT</span></td>
</tr>
<tr>
<td width="50%"><span>EEE, MMM d, ''yy</span></td>
<td width="50%"><span>Wed, Jul 10, '96</span></td>
</tr>
<tr>
<td width="50%"><span>h:mm a</span></td>
<td width="50%"><span>12:08 PM</span></td>
</tr>
<tr>
<td width="50%"><span>hh 'o''clock' a, zzzz</span></td>
<td width="50%"><span>12 o'clock PM, Pacific Daylight Time</span></td>
</tr>
<tr>
<td width="50%"><span>K:mm a, z</span></td>
<td width="50%"><span>0:00 PM, PST</span></td>
</tr>
<tr>
<td width="50%"><span>yyyyy.MMMM.dd GGG hh:mm aaa</span></td>
<td width="50%"><span>01996.July.10 AD 12:08 PM</span></td>
</tr>
</table>
<p><span>Characters may be used multiple times. For example, if y is used for the year, '</span><span>yy</span><span>'
might produce '99', whereas '</span><span>yyyy</span><span>' produces '1999'. For most numerical
fields, the number of characters specifies the field width. For example, if h is the hour, 'h'
might produce '5', but '</span><span>hh</span><span>' produces '05'. For some characters, the
count specifies whether an abbreviated or full form should be used, but may have other choices, as
given below.</span></p>
<blockquote>
<p><span>Note: the counter-intuitive use of 5 letters for the narrow form of weekdays and months
is forced by backwards compatibility.</span></p>
</blockquote>
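<p>For the year field, the width rule can be sketched as follows. The function name <code>format_year</code> is hypothetical; the sketch assumes a positive year and follows the rule given in the Date Field Symbol Table that the count gives the padding, except that two letters also truncate to the last two digits.</p>

```python
def format_year(year, count):
    """Format a positive year for a pattern of `count` 'y' letters."""
    digits = str(year)
    if count == 2:
        # 'yy' both pads and truncates to exactly two digits.
        return digits[-2:].zfill(2)
    # Any other count is a minimum width with zero padding.
    return digits.zfill(count)
```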
<table cellSpacing="0" cellPadding="2" border="1">
<caption><a name="Date_Field_Symbol_Table">Date Field Symbol Table</a></caption>
<tr>
<th>Field</th>
<th style="text-align: center">Sym.</th>
<th style="text-align: center"><span>No.</span></th>
<th><span>Example</span></th>
<th><span>Description</span></th>
</tr>
<tr>
<th rowspan="3" style="vertical-align: top; text-align: left">era</th>
<td style="text-align: center; vertical-align: top" rowspan="3">G</td>
<td style="text-align: center; vertical-align: top">1<span>..3</span></td>
<td style="vertical-align: top; text-align: left">AD</td>
<td rowspan="3" style="vertical-align: top; text-align: left"><span>Era - Replaced with the
Era string for the current date. <span class="changed">One to three letters for the
abbreviated form, four letters for the long form, five for the narrow form.</span></span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span class="changed">4</span></td>
<td style="vertical-align: top; text-align: left"><span class="changed">Anno Domini</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span class="changed">5</span></td>
<td style="vertical-align: top; text-align: left"><span class="changed">A</span></td>
</tr>
<tr>
<th rowspan="3">year</th>
<td style="text-align: center">y</td>
<td style="text-align: center">1..n</td>
<td>1996</td>
<td>Year. <span>Normally the length specifies the padding, but for two letters it also
specifies the maximum length. Example:<br>
</span><div align="center">
<center>
<table border="0" cellpadding="2" cellspacing="0">
<tr>
<th>Year</th>
<th style="text-align: right">y</th>
<th style="text-align: right">yy</th>
<th style="text-align: right">yyy</th>
<th style="text-align: right">yyyy</th>
<th style="text-align: right">yyyyy</th>
</tr>
<tr>
<td>AD 1</td>
<td style="text-align: right">1</td>
<td style="text-align: right">01</td>
<td style="text-align: right">001</td>
<td style="text-align: right">0001</td>
<td style="text-align: right">00001</td>
</tr>
<tr>
<td>AD 12</td>
<td style="text-align: right">12</td>
<td style="text-align: right">12</td>
<td style="text-align: right">012</td>
<td style="text-align: right">0012</td>
<td style="text-align: right">00012</td>
</tr>
<tr>
<td>AD 123</td>
<td style="text-align: right">123</td>
<td style="text-align: right">23</td>
<td style="text-align: right">123</td>
<td style="text-align: right">0123</td>
<td style="text-align: right">00123</td>
</tr>
<tr>
<td>AD 1234</td>
<td style="text-align: right">1234</td>
<td style="text-align: right">34</td>
<td style="text-align: right">1234</td>
<td style="text-align: right">1234</td>
<td style="text-align: right">01234</td>
</tr>
<tr>
<td>AD 12345</td>
<td style="text-align: right">12345</td>
<td style="text-align: right">45</td>
<td style="text-align: right">12345</td>
<td style="text-align: right">12345</td>
<td style="text-align: right">12345</td>
</tr>
</table>
</center>
</div>
</td>
</tr>
<tr>
<td style="text-align: center">Y</td>
<td style="text-align: center">1..n</td>
<td>1997</td>
<td><span>Year (of "Week of Year"), used in ISO year-week calendar. May differ from calendar
year.</span></td>
</tr>
<tr>
<td style="text-align: center">u</td>
<td style="text-align: center">1..n</td>
<td>4601</td>
<td>Extended year. This is a single number designating the year of this calendar system,
encompassing all supra-year fields. For example, for the Julian calendar system, year numbers
are positive, with an era of BCE or CE. An extended year value for the Julian calendar system
assigns positive values to CE years and negative values to BCE years, with 1 BCE being year 0.</td>
</tr>
<tr>
<th rowspan="6" style="vertical-align: top; text-align: left"><span class="changedspan">
quarter</span></th>
<td style="text-align: center; vertical-align: top" rowspan="3"><span class="changedspan">Q</span></td>
<td style="text-align: center; vertical-align: top"><span class="changedspan">1..2</span></td>
<td style="vertical-align: top; text-align: left"><span class="changedspan">02</span></td>
<td rowspan="3" style="vertical-align: top; text-align: left"><span class="changedspan">
Quarter - Use one or two for the numerical quarter, three for the abbreviation, or four for
the full name.</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span class="changedspan">3</span></td>
<td style="vertical-align: top; text-align: left"><span class="changedspan">Q2</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span class="changedspan">4</span></td>
<td style="vertical-align: top; text-align: left"><span class="changedspan">2nd quarter</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top" rowspan="3"><span class="changedspan">q</span></td>
<td style="text-align: center; vertical-align: top"><span class="changedspan">1..2</span></td>
<td style="vertical-align: top; text-align: left"><span class="changedspan">02</span></td>
<td rowspan="3" style="vertical-align: top; text-align: left"><span class="changedspan"><b>
Stand-Alone</b> Quarter - Use one or two for the numerical quarter, three for the
abbreviation, or four for the full name.</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span class="changedspan">3</span></td>
<td style="vertical-align: top; text-align: left"><span class="changedspan">Q2</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span class="changedspan">4</span></td>
<td style="vertical-align: top; text-align: left"><span class="changedspan">2nd quarter</span></td>
</tr>
<tr>
<th rowspan="8" style="vertical-align: top; text-align: left">month</th>
<td rowspan="4" style="text-align: center; vertical-align: top">M</td>
<td style="text-align: center; vertical-align: top">1..2</td>
<td style="vertical-align: top; text-align: left">09</td>
<td rowspan="4" style="vertical-align: top; text-align: left"><span>Month - Use one or two for
the numerical month, three for the abbreviation, four for the full name, or five for the
narrow name.</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top">3</td>
<td style="vertical-align: top; text-align: left">Sept</td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top">4</td>
<td style="vertical-align: top; text-align: left">September</td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top">5</td>
<td style="vertical-align: top; text-align: left">S</td>
</tr>
<tr>
<td rowspan="4" style="text-align: center; vertical-align: top"><span class="changed">L</span></td>
<td style="text-align: center; vertical-align: top"><span class="changed">1..2</span></td>
<td style="vertical-align: top; text-align: left"><span class="changed">09</span></td>
<td rowspan="4" style="vertical-align: top; text-align: left"><span class="changed"><span><b>
Stand-Alone</b> Month - Use one or two for the numerical month, three for the abbreviation,
four for the full name, or five for the narrow name.</span></span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span class="changed">3</span></td>
<td style="vertical-align: top; text-align: left"><span class="changed">Sept</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span class="changed">4</span></td>
<td style="vertical-align: top; text-align: left"><span class="changed">September</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span class="changed">5</span></td>
<td style="vertical-align: top; text-align: left"><span class="changed">S</span></td>
</tr>
<tr>
<th rowspan="2">week</th>
<td style="text-align: center">w</td>
<td style="text-align: center"><span>1..2</span></td>
<td><span>27</span></td>
<td><span>Week of Year.</span></td>
</tr>
<tr>
<td style="text-align: center">W</td>
<td style="text-align: center"><span>1</span></td>
<td><span>3</span></td>
<td><span>Week of Month</span></td>
</tr>
<tr>
<th rowspan="4">day</th>
<td style="text-align: center">d</td>
<td style="text-align: center"><span>1..2</span></td>
<td><span>1</span></td>
<td><span>Date - Day of the month</span></td>
</tr>
<tr>
<td style="text-align: center">D</td>
<td style="text-align: center"><span>1..3</span></td>
<td><span>345</span></td>
<td><span>Day of year</span></td>
</tr>
<tr>
<td style="text-align: center">F</td>
<td style="text-align: center">1</td>
<td>2<br>
</td>
<td><span>Day of Week in Month. The example is for the 2nd Wed in July</span></td>
</tr>
<tr>
<td style="text-align: center">g</td>
<td style="text-align: center">1..n</td>
<td>2451334</td>
<td>Modified Julian day. This is different from the conventional Julian day number in two
regards. First, it demarcates days at local zone midnight, rather than noon GMT. Second, it is
a local number; that is, it depends on the local time zone. It can be thought of as a single
number that encompasses all the date-related fields.</td>
</tr>
<tr>
<th rowspan="11" style="vertical-align: top; text-align: left">week<br>
day</th>
<td rowspan="3" style="text-align: center; vertical-align: top">E</td>
<td style="text-align: center; vertical-align: top"><span><span class="changed">1..</span>3</span></td>
<td style="vertical-align: top; text-align: left"><span>Tues</span></td>
<td rowspan="3" style="vertical-align: top; text-align: left"><span>Day of week - Use one
through three letters for the short day, four for the full name, or five for the narrow
name.</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span>4</span></td>
<td style="vertical-align: top; text-align: left"><span>Tuesday</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span>5</span></td>
<td style="vertical-align: top; text-align: left"><span>T</span></td>
</tr>
<tr>
<td rowspan="4" style="text-align: center; vertical-align: top">e</td>
<td style="text-align: center; vertical-align: top"><span>1..2</span></td>
<td style="vertical-align: top; text-align: left">2</td>
<td rowspan="4" style="vertical-align: top; text-align: left">Local day of week. Same as E
except <span class="changed">adds a</span> numeric value that will depend on the local
starting day of the week, using one or two letters. For this example, Monday is the first day
of the week.</td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span>3</span></td>
<td style="vertical-align: top; text-align: left"><span>Tues</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span>4</span></td>
<td style="vertical-align: top; text-align: left"><span>Tuesday</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span>5</span></td>
<td style="vertical-align: top; text-align: left"><span>T</span></td>
</tr>
<tr>
<td rowspan="4" style="text-align: center; vertical-align: top"><span class="changed">c</span></td>
<td style="text-align: center; vertical-align: top"><span class="changed"><span>1</span></span></td>
<td style="vertical-align: top; text-align: left"><span class="changed">2</span></td>
<td rowspan="4" style="vertical-align: top; text-align: left"><span class="changed"><span><b>
Stand-Alone</b> local day of week - Use one letter for the local numeric value (same as 'e'),
three for the short day, four for the full name, or five for the narrow name. </span>
</span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span class="changed"><span>3</span></span></td>
<td style="vertical-align: top; text-align: left"><span class="changed"><span>Tues</span></span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span class="changed"><span>4</span></span></td>
<td style="vertical-align: top; text-align: left"><span class="changed"><span>Tuesday</span></span></td>
</tr>
<tr>
<td style="text-align: center; vertical-align: top"><span class="changed"><span>5</span></span></td>
<td style="vertical-align: top; text-align: left"><span class="changed"><span>T</span></span></td>
</tr>
<tr>
<th>period</th>
<td style="text-align: center">a</td>
<td style="text-align: center"><span>1</span></td>
<td><span>AM</span></td>
<td><span>AM or PM</span></td>
</tr>
<tr>
<th rowspan="4">hour</th>
<td style="text-align: center">h</td>
<td style="text-align: center"><span>1..2</span></td>
<td><span>11</span></td>
<td><span>Hour [1-12]. </span></td>
</tr>
<tr>
<td style="text-align: center">H</td>
<td style="text-align: center"><span>1..2</span></td>
<td><span>13</span></td>
<td><span>Hour [0-23].</span></td>
</tr>
<tr>
<td style="text-align: center">K</td>
<td style="text-align: center"><span>1..2</span></td>
<td><span>0</span></td>
<td><span>Hour [0-11].</span></td>
</tr>
<tr>
<td style="text-align: center">k</td>
<td style="text-align: center"><span>1..2</span></td>
<td><span>24</span></td>
<td><span>Hour [1-24].</span></td>
</tr>
<tr>
<th>minute</th>
<td style="text-align: center">m</td>
<td style="text-align: center"><span>1..2</span></td>
<td><span>59</span></td>
<td><span>Minute. Use one or two for zero padding.</span></td>
</tr>
<tr>
<th rowspan="3">second</th>
<td style="text-align: center">s</td>
<td style="text-align: center"><span>1..2</span></td>
<td><span>12</span></td>
<td><span>Second. Use one or two for zero padding.</span></td>
</tr>
<tr>
<td style="text-align: center">S</td>
<td style="text-align: center"><span>1..n</span></td>
<td><span>345<span class="changedspan">7</span></span></td>
<td><span>Fractional Second - rounds to the count of letters. (example is for 12.34567)</span></td>
</tr>
<tr>
<td style="text-align: center">A</td>
<td style="text-align: center"><span>1..n</span></td>
<td>69540000</td>
<td>Milliseconds in day. This field behaves <i>exactly</i> like a composite of all
time-related fields, not including the zone fields. As such, it also reflects discontinuities
of those fields on DST transition days. On a day of DST onset, it will jump forward. On a day
of DST cessation, it will jump backward. This reflects the fact that it must be combined with
the offset field to obtain a unique local time value.</td>
</tr>
<tr>
<th rowspan="6" style="vertical-align: top; text-align: left">zone</th>
<td rowspan="2" style="vertical-align: top; text-align: left">z</td>
<td style="vertical-align: top; text-align: left"><span class="changed">1..</span>3</td>
<td style="vertical-align: top; text-align: left"><span class="changed">PDT</span></td>
<td rowspan="2" style="vertical-align: top; text-align: left"><span class="changed"><span>
Timezone - Use one to three letters for the short timezone or four for the full name. For more
information, see </span>Appendix J: <a href="#Time_Zone_Fallback"><span>Time Zone Display
Names</span></a></span></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left">4</td>
<td style="vertical-align: top; text-align: left">Pacific Daylight Time</td>
</tr>
<tr>
<td rowspan="2" style="vertical-align: top; text-align: left">Z</td>
<td style="vertical-align: top; text-align: left"><span class="changed">1..3</span></td>
<td style="vertical-align: top; text-align: left">-0800</td>
<td rowspan="2" style="vertical-align: top; text-align: left"><span>Use <span class="changed">
one to three letters</span> for RFC 822, <span class="changed">four letters</span> for GMT
format.</span></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left"><span class="changed">4</span></td>
<td style="vertical-align: top; text-align: left">GMT-08:00</td>
</tr>
<tr>
<td rowspan="2" style="vertical-align: top; text-align: left"><span class="changed">v</span></td>
<td style="vertical-align: top; text-align: left"><span class="changed">1</span></td>
<td style="vertical-align: top; text-align: left"><span class="changed">PT</span></td>
<td rowspan="2" style="vertical-align: top; text-align: left"><span class="changed"><span>Use
one letter for short wall (generic) time, four for long wall time. For more information, see
</span>Appendix J: <a href="#Time_Zone_Fallback"><span>Time Zone Display Names</span></a></span></td>
</tr>
<tr>
<td style="vertical-align: top; text-align: left"><span class="changed">4</span></td>
<td style="vertical-align: top; text-align: left"><span class="changed">Pacific Time</span></td>
</tr>
</table>
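<p>The padding and truncation behavior shown in the year rows of the table above can be sketched as follows. This is a minimal illustration, not part of the specification; the function name <code>format_year</code> is hypothetical and the sketch assumes a positive year number.</p>

```python
def format_year(year, count):
    """Format a positive year for a run of `count` 'y' pattern letters.

    Assumption for illustration: `year` is a positive integer.
    """
    digits = str(year)
    if count == 2:
        # Two letters also set the maximum length: keep the last two digits.
        return digits[-2:].rjust(2, "0")
    # Otherwise the letter count only specifies zero padding.
    return digits.rjust(count, "0")


for letters in range(1, 6):
    print("y" * letters, format_year(123, letters))
```

<p>For the year 123 this prints 123, 23, 123, 0123, and 00123, matching the AD 123 row of the table.</p>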
<p><span>All non-letter characters represent themselves in a pattern, except for the single quote.
It is used to 'escape' letters. Two single quotes in a row, whether inside or outside a quoted
sequence, represent a 'real' single quote.</span></p>
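<p>The quoting rule can be illustrated with a small tokenizer. This is a sketch under stated assumptions rather than a reference implementation: the name <code>tokenize_date_pattern</code> and the token labels are hypothetical, and only ASCII letters are treated as pattern letters.</p>

```python
def tokenize_date_pattern(pattern):
    """Split a date pattern into ("field", letters) and ("literal", text) tokens."""
    tokens = []
    literal = []        # characters of the literal run collected so far
    in_quote = False
    i, n = 0, len(pattern)

    def flush_literal():
        if literal:
            tokens.append(("literal", "".join(literal)))
            literal.clear()

    while i < n:
        ch = pattern[i]
        if ch == "'":
            if i + 1 < n and pattern[i + 1] == "'":
                literal.append("'")      # two quotes in a row: a real single quote
                i += 2
            else:
                in_quote = not in_quote  # a lone quote starts or ends an escaped run
                i += 1
        elif not in_quote and ch.isascii() and ch.isalpha():
            flush_literal()
            j = i
            while j < n and pattern[j] == ch:
                j += 1                   # a run of the same letter is one field
            tokens.append(("field", pattern[i:j]))
            i = j
        else:
            literal.append(ch)           # non-letters and quoted letters are literal
            i += 1
    flush_literal()
    return tokens


print(tokenize_date_pattern("hh 'o''clock' a"))
```

<p>Here the quoted text and the doubled quote collapse into a single literal token between the hour field and the period field.</p>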
<h3><a name="<localizedPatternChars>"><span>localizedPatternChars</span></a></h3>
<p><span>These are characters that can be used when displaying a date pattern to an end user. This
can occur, for example, when a spreadsheet allows users to specify date patterns. Whatever is in
the string is substituted one-for-one with the characters "</span><span class="tx"><span>GyMdkHmsSEDFwWahKzYe",
with the above meanings</span></span><span>. Thus, for example, if "J" is to be used instead of
"Y" to mean Year, then the string would be: "</span><span class="tx"><span>GyMdkHmsSEDFwWahKz<u>J</u>e".</span></span></p>
<p><span>This element is deprecated. It is recommended instead that a more sophisticated UI be
used for localization, such as using icons to represent the different formats (and lengths) in the
<a href="#Date_Field_Symbol_Table">Date Field Symbol Table</a>.</span></p>
<h3><span>AM / PM</span></h3>
<p><span>Even for countries where the customary date format uses only a 24-hour clock, both the am
and pm localized strings must be present and must be distinct from one another. As long as the
24-hour format is used, these strings will normally not be displayed, but they must be present for
testing and unusual circumstances.</span></p>
<h3><span>Eras</span></h3>
<p><span>There are only two values for an era in a Gregorian calendar, "BC" and "AD". These values
can be translated into other languages, like "</span><span>a.C</span><span>." and "</span><span>d.C</span><span>."
for Spanish, but there are no other eras in the Gregorian calendar. Other calendars have a
different number of eras. Care should be taken when translating the era names for a specific
calendar.</span></p>
<h3><span>Week of Year</span></h3>
<p><span>Values calculated for the Week of Year field range from 1 to 53. Week 1 for a year is the
first week that contains at least the specified minimum number of days from that year. Weeks
between week 1 of one year and week 1 of the following year are numbered sequentially from 2 to 52
or 53 (if needed). For example, January 1, 1998 was a Thursday. If the first day of the week is
MONDAY and the minimum days in a week is 4 (these are the values reflecting ISO 8601 and many
national standards), then week 1 of 1998 starts on December 29, 1997, and ends on January 4, 1998.
However, if the first day of the week is SUNDAY, then week 1 of 1998 starts on January 4, 1998,
and ends on January 10, 1998. The first three days of 1998 are then part of week 53 of 1997.</span></p>
<p><span>Values are similarly calculated for the Week of Month.</span></p>
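<p>The Week of Year rule can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the first day of the week is passed using Python's <code>weekday()</code> numbering (Monday is 0, Sunday is 6), which is an assumption of this example and not a CLDR convention.</p>

```python
from datetime import date, timedelta


def week1_start(year, first_day, min_days):
    """Return the first day of week 1 of `year`."""
    jan1 = date(year, 1, 1)
    # Start of the week that contains January 1.
    start = jan1 - timedelta(days=(jan1.weekday() - first_day) % 7)
    # Number of days of that week that fall in `year`.
    days_in_year = 7 - (jan1 - start).days
    if days_in_year < min_days:
        start += timedelta(days=7)   # too few days: week 1 is the following week
    return start


def week_of_year(d, first_day=0, min_days=4):
    """Return (week_year, week_number) for the date `d`."""
    year = d.year
    if d >= week1_start(year + 1, first_day, min_days):
        year += 1                    # already in week 1 of the next year
    elif d < week1_start(year, first_day, min_days):
        year -= 1                    # still in the last week of the previous year
    start = week1_start(year, first_day, min_days)
    return year, (d - start).days // 7 + 1


print(week_of_year(date(1997, 12, 29), first_day=0, min_days=4))
print(week_of_year(date(1998, 1, 1), first_day=6, min_days=4))
```

<p>With Monday as the first day, December 29, 1997 falls in week 1 of 1998; with Sunday as the first day, January 1, 1998 falls in week 53 of 1997, as described in the text above.</p>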
<h3><span>Week Elements</span></h3>
<dl>
<dt><b><span>firstDay</span></b><span> </span></dt>
<dd><span>A number indicating which day of the week is considered the 'first' day, for calendar
purposes. Because the ordering of days may vary between calendars, keywords are used for this
value, such as sun, </span><span>mon</span><span>,... These values will be replaced by the
localized name when they are actually used. </span></dd>
<dt><b><span>minDays (Minimal Days in First Week)</span></b><span> </span></dt>
<dd><span>Minimal days required in the first week of a month or year. For example, if the first
week is defined as one that contains at least one day, this value will be 1. If it must contain
a full seven days before it counts as the first week, then the value would be 7. </span></dd>
<dt><b><span>weekendStart, weekendEnd</span></b><span> </span></dt>
<dd><span>Indicates the day and time that the weekend starts or ends. As with </span><span>
firstDay</span><span>, keywords are used instead of numbers.</span></dd>
</dl>
<h2><span>Appendix G: <a name="Number_Format_Patterns">Number Format Patterns</a></span></h2>
<h3><span>Number Patterns</span></h3>
<p><span>The </span><a href="#NumberElements"><span>NumberElements</span></a><span> resource
affects how these patterns are interpreted in a localized context. The "." shows where the decimal
point should go. The "," shows where the thousands separator should go. A "0" indicates
zero-padding: if the number is too short, a zero (in the locale's numeric set) will go there. A
"#" indicates no padding: if the number is too short, nothing goes there. A "¤" shows where the
currency sign will go. The following table illustrates the effects of different patterns for the
French locale, with the number "1234.567". Notice how the pattern characters ',' and '.' are
replaced by the characters appropriate for the locale.</span></p>
<blockquote>
<table cellSpacing="0" cellPadding="4" border="1">
<tr>
<th width="17%">Pattern</th>
<th width="16%">Currency</th>
<th width="33%">Text</th>
</tr>
<tr>
<td width="17%">#,##0.##</td>
<td width="16%"><i>n/a</i></td>
<td width="33%">1 234,57</td>
</tr>
<tr>
<td width="17%">#,##0.###</td>
<td width="16%"><i>n/a</i></td>
<td width="33%">1 234,567</td>
</tr>
<tr>
<td width="17%">###0.#####</td>
<td width="16%"><i>n/a</i></td>
<td width="33%">1234,567</td>
</tr>
<tr>
<td width="17%">###0.0000#</td>
<td width="16%"><i>n/a</i></td>
<td width="33%">1234,5670</td>
</tr>
<tr>
<td width="17%">00000.0000</td>
<td width="16%"><i>n/a</i></td>
<td width="33%">01234,5670</td>
</tr>
<tr>
<td width="17%" rowspan="2">#,##0.00 ¤</td>
<td width="16%">EUR</td>
<td width="33%">1 234,57 €</td>
</tr>
<tr>
<td width="16%"><span>JPY</span></td>
<td width="33%">1 235 ¥</td>
</tr>
</table>
</blockquote>
<p>The number of # placeholder characters before the decimal does not matter, since no limit is
placed on the maximum number of digits. There should, however, be at least one zero someplace in
the pattern. In currency formats, the number of digits after the decimal also does not matter, since
the information in the <a href="#Supplemental_Data">Appendix C: Supplemental Data</a> is used to
override the number of decimal places — and the rounding — according to the currency that is being
formatted. That can be seen in the above chart, with the difference between Yen and Euro
formatting.</p>
<h3><span>Special Pattern Characters</span></h3>
<p><span>Many characters in a pattern are taken literally; they are matched during parsing and
output unchanged during formatting. Special characters, on the other hand, stand for other
characters, strings, or classes of characters. For example, the '#' character is replaced by a
localized digit. Often the replacement character is the same as the pattern character; in the U.S.
locale, the ',' grouping character is replaced by ','. However, the replacement is still
happening, and if the symbols are modified, the grouping character changes. Some special
characters affect the behavior of the formatter by their presence; for example, if the percent
character is seen, then the value is multiplied by 100 before being displayed. </span></p>
<p><span>To insert a special character in a pattern as a literal, that is, without any special
meaning, the character must be quoted. There are some exceptions to this which are noted below.
</span></p>
<blockquote>
<table cellSpacing="3" cellPadding="0" summary="Chart showing symbol,
location, localized, and meaning." border="0">
<tr bgColor="#ccccff">
<th align="left">Symbol </th>
<th align="left"><span>Location </span></th>
<th align="left"><span>Localized? </span></th>
<th align="left"><span>Meaning </span></th>
</tr>
<tr vAlign="top">
<td>0 </td>
<td><span>Number </span></td>
<td><span>Yes </span></td>
<td><span>Digit </span></td>
</tr>
<tr vAlign="top" bgColor="#eeeeff">
<td>1-9 </td>
<td><span>Number </span></td>
<td><span>Yes </span></td>
<td><span>'1' through '9' indicate rounding. </span></td>
</tr>
<tr vAlign="top">
<td>@ </td>
<td><span>Number </span></td>
<td><span>No </span></td>
<td><span>Significant digit </span></td>
</tr>
<tr vAlign="top" bgColor="#eeeeff">
<td># </td>
<td><span>Number </span></td>
<td><span>Yes </span></td>
<td><span>Digit, zero shows as absent </span></td>
</tr>
<tr vAlign="top">
<td>. </td>
<td><span>Number </span></td>
<td><span>Yes </span></td>
<td><span>Decimal separator or monetary decimal separator </span></td>
</tr>
<tr vAlign="top" bgColor="#eeeeff">
<td>- </td>
<td><span>Number </span></td>
<td><span>Yes </span></td>
<td><span>Minus sign </span></td>
</tr>
<tr vAlign="top">
<td>, </td>
<td><span>Number </span></td>
<td><span>Yes </span></td>
<td><span>Grouping separator </span></td>
</tr>
<tr vAlign="top" bgColor="#eeeeff">
<td>E </td>
<td><span>Number </span></td>
<td><span>Yes </span></td>
<td><span>Separates mantissa and exponent in scientific notation. </span><em><span>Need not
be quoted in prefix or suffix.</span></em><span> </span></td>
</tr>
<tr vAlign="top">
<td>+ </td>
<td><span>Exponent </span></td>
<td><span>Yes </span></td>
<td><span>Prefix positive exponents with localized plus sign. </span><em><span>Need not be
quoted in prefix or suffix.</span></em><span> </span></td>
</tr>
<tr vAlign="top" bgColor="#eeeeff">
<td>; </td>
<td><span>Subpattern boundary </span></td>
<td><span>Yes </span></td>
<td><span>Separates positive and negative subpatterns </span></td>
</tr>
<tr vAlign="top">
<td>% </td>
<td><span>Prefix or suffix </span></td>
<td><span>Yes </span></td>
<td><span>Multiply by 100 and show as percentage </span></td>
</tr>
<tr vAlign="top" bgColor="#eeeeff">
<td>‰<br>
(\u2030)</td>
<td><span>Prefix or suffix </span></td>
<td><span>Yes </span></td>
<td><span>Multiply by 1000 and show as per mille </span></td>
</tr>
<tr vAlign="top">
<td>¤ (\u00A4) </td>
<td><span>Prefix or suffix </span></td>
<td><span>No </span></td>
<td><span>Currency sign, replaced by currency symbol. If doubled, replaced by international
currency symbol. <span>If tripled, uses the long form of the decimal symbol. </span>If
present in a pattern, the monetary decimal separator and grouping separators <span>(if
available) </span>are used instead of the numeric ones.</span></td>
</tr>
<tr vAlign="top" bgColor="#eeeeff">
<td>' </td>
<td><span>Prefix or suffix </span></td>
<td><span>No </span></td>
<td><span>Used to quote special characters in a prefix or suffix, for example, </span><code>
<span>"'#'#"</span></code><span> formats 123 to </span><code><span>"#123"</span></code><span>.
To create a single quote itself, use two in a row: </span><code><span>"# o''clock"</span></code><span>.
</span></td>
</tr>
<tr vAlign="top">
<td>* </td>
<td><span>Prefix or suffix boundary </span></td>
<td><span>Yes </span></td>
<td><span>Pad escape, precedes pad character </span></td>
</tr>
</table>
</blockquote>
<p><span>A pattern contains a </span><span>positive</span><span> and <span>may contain a </span>
negative </span><span>subpattern</span><span>, for example, "#,##0.00;(#,##0.00)". Each </span>
<span>subpattern</span><span> has a prefix, a numeric part, and a suffix. If there is no explicit
negative </span><span>subpattern</span><span>, the negative </span><span>subpattern</span><span>
is the localized minus sign prefixed to the positive </span><span>subpattern</span><span>. That
is, "0.00" alone is equivalent to "0.00;-0.00". If there is an explicit negative </span><span>
subpattern</span><span>, it serves only to specify the negative prefix and suffix; the number of
digits, minimal digits, and other characteristics are ignored in the negative </span><span>
subpattern</span><span>. That means that "#,##0.0#;(#)" has precisely the same result as
"#,##0.0#;(#,##0.0#)".</span></p>
<blockquote>
<p><span><b>Note: </b>The thousands separator and decimal separator in this pattern are always
',' and '.'. They are substituted by the code with the correct local values according to other
fields in CLDR.</span></p>
</blockquote>
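<p>The subpattern rules described above can be sketched as follows. This is a simplified illustration, not a full pattern parser: the function names are hypothetical, and the numeric part is assumed to consist only of the characters #, digits, @, comma, and period (scientific notation and padding are not handled, and an unquoted period in a suffix would be misread).</p>

```python
NUMBER_CHARS = set("#0123456789@,.")   # simplifying assumption for this sketch


def split_subpatterns(pattern):
    """Return (positive, negative); the negative defaults to '-' plus the positive."""
    in_quote = False
    for i, ch in enumerate(pattern):
        if ch == "'":
            in_quote = not in_quote     # a doubled quote toggles twice, so state is unchanged
        elif ch == ";" and not in_quote:
            return pattern[:i], pattern[i + 1:]
    return pattern, "-" + pattern


def affixes(subpattern):
    """Return the (prefix, suffix) surrounding the numeric part of a subpattern."""
    in_quote = False
    first = last = None
    for i, ch in enumerate(subpattern):
        if ch == "'":
            in_quote = not in_quote
        elif not in_quote and ch in NUMBER_CHARS:
            if first is None:
                first = i
            last = i
    if first is None:
        return subpattern, ""
    return subpattern[:first], subpattern[last + 1:]


def effective_negative(pattern):
    """Combine the negative prefix and suffix with the positive numeric part."""
    positive, negative = split_subpatterns(pattern)
    neg_prefix, neg_suffix = affixes(negative)
    pos_prefix, pos_suffix = affixes(positive)
    numeric = positive[len(pos_prefix):len(positive) - len(pos_suffix)]
    return neg_prefix + numeric + neg_suffix


print(effective_negative("#,##0.0#;(#)"))
print(effective_negative("0.00"))
```

<p>The first call shows that only the parentheses are taken from the explicit negative subpattern; the second shows the implicit negative subpattern formed with the minus sign.</p>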
<p><span>The prefixes, suffixes, and various symbols used for infinity, digits, thousands
separators, decimal separators, etc. may be set to arbitrary values, and they will appear properly
during formatting. </span><i><span>However, care must be taken that the symbols and strings do not
conflict, or parsing will be unreliable. </span></i><span>For example, either the positive and
negative prefixes or the suffixes must be distinct for any parser using this data to be able to
distinguish positive from negative values. Another example is that the decimal separator and
thousands separator should be distinct characters, or parsing will be impossible. </span></p>
<p><span>The </span><em><span>grouping separator</span></em><span> is a character that separates
clusters of integer digits to make large numbers more legible. It is commonly used for thousands, but
in some locales it separates ten-thousands. The </span><em><span>grouping size</span></em><span>
is the number of digits between the grouping separators, such as 3 for "100,000,000" or 4 for "1
0000 0000". There are actually two different grouping sizes: One used for the least significant
integer digits, the </span><em><span>primary grouping size</span></em><span>, and one used for all
others, the </span><em><span>secondary grouping size</span></em><span>. In most locales these are
the same, but sometimes they are different. For example, if the primary grouping interval is 3,
and the secondary is 2, then this corresponds to the pattern "#,##,##0", and the number 123456789
is formatted as "12,34,56,789". If a pattern contains multiple grouping separators, the interval
between the last one and the end of the integer defines the primary grouping size, and the
interval between the last two defines the secondary grouping size. All others are ignored, so
"#,##,###,####" == "###,###,####" == "##,#,###,####".</span></p>
<p><span>When parsing using a number format, a more lenient parse should be used where possible.
In particular, it should implement at least the following rules.</span></p>
<blockquote>
<ul type="disc">
<li><span><span class="changedspan">I</span>f the separator is a NO-BREAK SPACE, all input characters <span class="changedspan">in general category Zs</span> are treated as
matching<span class="changedspan">.</span></span></li>
<li><span><span class="changedspan">I</span>f the separator is an apostrophe (curly or straight), all input apostrophes
are treated as matching<span class="changedspan">. The &lt;character-fallback&gt; supplemental data can be used to implement this.</span></span></li>
<li><span><span class="changedspan">I</span>f the separator has a fullwidth or halfwidth equivalent, that is treated as
matching.</span></li>
</ul>
</blockquote>
<p><span class="changedspan">For more on parsing, see <a href="#Lenient_Parsing">Lenient Parsing</a>.</span></p>
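<p>The three lenient matching rules listed above could be approximated with Python's standard <code>unicodedata</code> module. This is a sketch under stated assumptions: the function names are hypothetical, separators are assumed to be single characters, and the apostrophe set is limited to U+0027 and U+2019 rather than being taken from the character-fallback data.</p>

```python
import unicodedata

APOSTROPHES = {"'", "\u2019"}   # straight and curly; an assumption for this sketch


def width_base(ch):
    """Map a fullwidth or halfwidth form to its base character; others are unchanged."""
    decomp = unicodedata.decomposition(ch)
    if decomp.startswith(("<wide>", "<narrow>")):
        return chr(int(decomp.split()[1], 16))
    return ch


def separator_matches(expected, actual):
    """Return True if the input character `actual` may stand for `expected`."""
    if expected == actual:
        return True
    if expected == "\u00A0":
        # NO-BREAK SPACE: accept any character in general category Zs.
        return unicodedata.category(actual) == "Zs"
    if expected in APOSTROPHES:
        return actual in APOSTROPHES
    # Fullwidth or halfwidth equivalents compare equal after mapping to the base form.
    return width_base(expected) == width_base(actual)


print(separator_matches("\u00A0", " "))
print(separator_matches(",", "\uFF0C"))
```

<p>Both calls are expected to report a match: an ordinary space is accepted for a NO-BREAK SPACE separator, and a fullwidth comma is accepted for a comma.</p>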
<p><span>For consistency in the CLDR data, the following conventions should be observed so as to
have a canonical representation:</span></p>
<ul>
<li><span>All number patterns should be minimal: there should be no leading # marks except to
specify the position of the grouping separators (e.g. avoid ##,##0.###).</span></li>
<li><span>All formats should have one 0 before the decimal point (e.g. avoid #,###.##) </span>
</li>
<li><span>Decimal formats should have three hash marks in the fractional position (e.g.
#,##0.###).</span></li>
<li><span>Currency formats should have two zeros in the fractional position (e.g. ¤ #,##0.00).</span><ul>
<li><span><span>The exact number of decimals is overridden with the decimal count in
supplementary data.</span></span></li>
</ul>
</li>
<li><span><span>The only time two thousands separators need to be used is when the number of
digits varies, such as for Hindi: #,##,##0.</span></span></li>
</ul>
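<p>As an illustration of primary and secondary grouping sizes, such as those in the Hindi pattern above, the following sketch derives both sizes from a pattern and applies them to a digit string. The function names are hypothetical, and the pattern is assumed to contain only a numeric part (no prefix, suffix, or quoted text).</p>

```python
def grouping_sizes(pattern):
    """Return (primary, secondary) grouping sizes; (0, 0) means no grouping."""
    integer_part = pattern.split(".", 1)[0]
    groups = integer_part.split(",")
    if len(groups) == 1:
        return 0, 0
    # Interval between the last separator and the end of the integer.
    primary = len(groups[-1])
    # Interval between the last two separators; defaults to the primary size.
    secondary = len(groups[-2]) if len(groups) > 2 else primary
    return primary, secondary


def group_digits(digits, primary, secondary, separator=","):
    """Insert `separator` into a string of integer digits."""
    if primary <= 0 or len(digits) <= primary:
        return digits
    head, tail = digits[:-primary], digits[-primary:]
    parts = [tail]
    while len(head) > secondary:
        parts.append(head[-secondary:])
        head = head[:-secondary]
    parts.append(head)
    return separator.join(reversed(parts))


primary, secondary = grouping_sizes("#,##,##0")
print(group_digits("123456789", primary, secondary))
```

<p>For the pattern "#,##,##0" the sizes are 3 and 2, so 123456789 is grouped as 12,34,56,789. Separators before the last two are ignored, so "#,##,###,####" and "###,###,####" yield the same sizes.</p>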
<h4><span>Formatting</span></h4>
<p><span>Formatting is guided by several parameters, all of which can be specified either using a
pattern or using the API. The following description applies to formats that do not use </span>
<a href="#sci"><span>scientific notation</span></a><span> or </span><a href="#sigdig"><span>
significant digits</span></a><span>. </span></p>
<ul>
<li><span>If the number of actual integer digits exceeds the </span><em><span>maximum integer
digits</span></em><span>, then only the least significant digits are shown. For example, 1997 is
formatted as "97" if the maximum integer digits is set to 2. </span></li>
<li><span>If the number of actual integer digits is less than the </span><em><span>minimum
integer digits</span></em><span>, then leading zeros are added. For example, 1997 is formatted
as "01997" if the minimum integer digits is set to 5. </span></li>
<li><span>If the number of actual fraction digits exceeds the </span><em><span>maximum fraction
digits</span></em><span>, then half-even rounding is performed to the maximum fraction digits.
For example, 0.125 is formatted as "0.12" if the maximum fraction digits is 2. This behavior can
be changed by specifying a rounding increment and a rounding mode. </span></li>
<li><span>If the number of actual fraction digits is less than the </span><em><span>minimum
fraction digits</span></em><span>, then trailing zeros are added. For example, 0.125 is
formatted as "0.1250" if the </span>minimum<span> fraction digits is set to 4. </span></li>
<li><span>Trailing fractional zeros are not displayed if they occur </span><em><span>j</span></em><span>
positions after the decimal, where </span><em><span>j</span></em><span> is less than the maximum
fraction digits. For example, 0.10004 is formatted as "0.1" if the maximum fraction digits is
four or less. </span></li>
</ul>
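<p>The digit-count rules in the list above can be sketched with Python's <code>decimal</code> module. This is an illustration under stated assumptions, not a complete formatter: the function name <code>format_fixed</code> and its parameter names are hypothetical, only non-negative values are handled, and no grouping or localized symbols are applied.</p>

```python
from decimal import Decimal, ROUND_HALF_EVEN


def format_fixed(value, min_int=1, max_int=None, min_frac=0, max_frac=3):
    """Format a non-negative number (string or Decimal) under the given digit limits."""
    number = Decimal(value)
    # Half-even rounding to the maximum number of fraction digits.
    quantum = Decimal(1).scaleb(-max_frac)
    rounded = number.quantize(quantum, rounding=ROUND_HALF_EVEN)
    integer, _, fraction = format(rounded, "f").partition(".")
    # Trailing fractional zeros beyond the minimum fraction digits are not displayed.
    while len(fraction) > min_frac and fraction.endswith("0"):
        fraction = fraction[:-1]
    # Trailing zeros are added up to the minimum fraction digits.
    fraction = fraction.ljust(min_frac, "0")
    # Only the least significant integer digits are kept if there is a maximum.
    if max_int is not None:
        integer = integer[-max_int:] if max_int > 0 else ""
    # Leading zeros are added up to the minimum integer digits.
    integer = integer.rjust(min_int, "0")
    return integer + ("." + fraction if fraction else "")


print(format_fixed("1997", max_int=2))
print(format_fixed("0.125", max_frac=2))
```

<p>Passing values as strings avoids binary floating-point artifacts, so that 0.125 is treated as an exact tie and rounds half-even to 0.12.</p>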
<p><strong><span>Special Values</span></strong><span> </span></p>
<p><code><span>NaN</span></code><span> is represented as a single character, typically </span>
<font SIZE="3"><span>&#xFFFD; </span></font><code><span>(\uFFFD)</span></code><span>. This character is
determined by the localized number symbols. This is the only value for which the prefixes and
suffixes are not used. </span></p>
<p><span>Infinity is represented as a single character, typically </span><font SIZE="3"><span>∞
</span></font><code><span>(\u221E)</span></code><span>, with the positive or negative prefixes and
suffixes applied. The infinity character is determined by the localized number symbols. </span>
</p>
<h4><span><a name="sci">Scientific Notation</a></span></h4>
<p><span>Numbers in scientific notation are expressed as the product of a mantissa and a power of
ten, for example, 1234 can be expressed as 1.234 x 10</span><sup><span>3</span></sup><span>. The
mantissa is typically in the half-open interval [1.0, 10.0) or sometimes [0.0, 1.0), but it need
not be. In a pattern, the exponent character immediately followed by one or more digit characters
indicates scientific notation. Example: "0.###E0" formats the number 1234 as "1.234E3". </span>
</p>
<ul>
<li><span>The number of digit characters after the exponent character gives the minimum exponent
digit count. There is no maximum. Negative exponents are formatted using the localized minus
sign, </span><em><span>not</span></em><span> the prefix and suffix from the pattern. This allows
patterns such as "0.###E0 m/s". To prefix positive exponents with a localized plus sign, specify
'+' between the exponent and the digits: "0.###E+0" will produce formats "1E+1", "1E+0", "1E-1",
etc. (In localized patterns, use the localized plus sign rather than '+'.) </span></li>
<li><span>The minimum number of integer digits is achieved by adjusting the exponent. Example:
0.00123 formatted with "00.###E0" yields "12.3E-4". This only happens if there is no maximum
number of integer digits. If there is a maximum, then the minimum number of integer digits is
fixed at one. </span></li>
<li><span>The maximum number of integer digits, if present, specifies the exponent grouping. The
most common use of this is to generate </span><em><span>engineering notation</span></em><span>,
in which the exponent is a multiple of three, e.g., "##0.###E0". The number 12345 is formatted
using "##0.####E0" as "12.345E3". </span></li>
<li><span>When using scientific notation, the formatter controls the digit counts using
significant digits logic. The maximum number of significant digits limits the total number of
integer and fraction digits that will be shown in the mantissa; it does not affect parsing. For
example, 12345 formatted with "##0.##E0" is "12.3E3". See the section on significant digits for
more details. </span></li>
<li><span>Exponential patterns may not contain grouping separators. </span></li>
</ul>
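<p>As a rough illustration of how the exponent is chosen, the following Python sketch covers three of the examples above. The function <code>format_scientific</code> and its parameters (including <code>exponent_multiple</code>, which stands in for the exponent grouping implied by a maximum integer digit count) are hypothetical. The sketch does not apply the significant-digit limits, does not use localized symbols, and does not handle the case where rounding carries the mantissa into an extra integer digit.</p>

```python
from decimal import Decimal, ROUND_HALF_EVEN


def format_scientific(value, min_int=1, max_frac=3, exponent_multiple=1):
    """Sketch of mantissa/exponent selection (hypothetical helper)."""
    number = Decimal(str(value))
    # adjusted() is the power of ten of the most significant digit.
    magnitude = number.adjusted()
    if exponent_multiple > 1:
        # Exponent grouping: force the exponent down to a multiple, e.g. of three.
        exponent = magnitude // exponent_multiple * exponent_multiple
    else:
        # No maximum: shift the exponent so that min_int integer digits appear.
        exponent = magnitude - (min_int - 1)
    mantissa = number.scaleb(-exponent).quantize(
        Decimal(1).scaleb(-max_frac), rounding=ROUND_HALF_EVEN
    )
    text = format(mantissa, "f")
    if "." in text:
        # Optional ('#') fraction digits: drop trailing zeros.
        text = text.rstrip("0").rstrip(".")
    return f"{text}E{exponent}"


print(format_scientific(1234))                                    # 1.234E3
print(format_scientific(0.00123, min_int=2))                      # 12.3E-4
print(format_scientific(12345, max_frac=4, exponent_multiple=3))  # 12.345E3
```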
<h4><a name="sigdig">Significant Digits</a></h4>
<p><span>There are two ways of controlling how many digits are shown: (a) significant digits
counts, or (b) integer and fraction digit counts. Integer and fraction digit counts are described
above. When a formatter is using significant digits counts, the number of integer and fraction
digits is not specified directly, and the formatter settings for these counts are ignored.
Instead, the formatter uses however many integer and fraction digits are required to display the
specified number of significant digits. Examples: </span></p>
<blockquote>
<table cellSpacing="3" cellPadding="0" border="0">
<tr bgColor="#ccccff">
<th align="left"><span>Pattern </span></th>
<th align="left"><span>Minimum significant digits </span></th>
<th align="left"><span>Maximum significant digits </span></th>
<th align="left"><span>Number </span></th>
<th align="left"><span>Output</span></th>
</tr>
<tr vAlign="top">
<td><code><span>@@@</span></code><span> </span></td>
<td><span>3 </span></td>
<td><span>3 </span></td>
<td><span>12345 </span></td>
<td><code><span>12300</span></code><span> </span></td>
</tr>
<tr vAlign="top" bgColor="#eeeeff">
<td><code><span>@@@</span></code><span> </span></td>
<td><span>3 </span></td>
<td><span>3 </span></td>
<td><span>0.12345 </span></td>
<td><code><span>0.123</span></code><span> </span></td>
</tr>
<tr vAlign="top">
<td><code><span>@@##</span></code><span> </span></td>
<td><span>2 </span></td>
<td><span>4 </span></td>
<td><span>3.14159 </span></td>
<td><code><span>3.142</span></code><span> </span></td>
</tr>
<tr vAlign="top" bgColor="#eeeeff">
<td><code><span>@@##</span></code><span> </span></td>
<td><span>2 </span></td>
<td><span>4 </span></td>
<td><span>1.23004 </span></td>
<td><code><span>1.23</span></code><span> </span></td>
</tr>
</table>
</blockquote>
<ul>
<li><span>In order to enable significant digits formatting, use a pattern containing the </span>
<code><span>'@'</span></code><span> pattern character. In order to disable significant digits
formatting, use a pattern that does not contain the </span><code><span>'@'</span></code><span>
pattern character.</span></li>
<li><span>Significant digit counts may be expressed using patterns that specify a minimum and
maximum number of significant digits. These are indicated by the </span><code><span>'@'</span></code><span>
and </span><code><span>'#'</span></code><span> characters. The minimum number of significant
digits is the number of </span><code><span>'@'</span></code><span> characters. The maximum
number of significant digits is the number of </span><code><span>'@'</span></code><span>
characters plus the number of </span><code><span>'#'</span></code><span> characters following on
the right. For example, the pattern </span><code><span>"@@@"</span></code><span> indicates
exactly 3 significant digits. The pattern </span><code><span>"@##"</span></code><span> indicates
from 1 to 3 significant digits. Trailing zero digits to the right of the decimal separator are
suppressed after the minimum number of significant digits have been shown. For example, the
pattern </span><code><span>"@##"</span></code><span> formats the number 0.1203 as </span><code>
<span>"0.12"</span></code><span>. </span></li>
<li><span>If a pattern uses significant digits, it may not contain a decimal separator, nor the
</span><code><span>'0'</span></code><span> pattern character. Patterns such as </span><code>
<span>"@00"</span></code><span> or </span><code><span>"@.###"</span></code><span> are
disallowed. </span></li>
<li><span>Any number of </span><code><span>'#'</span></code><span> characters may be </span>
<span>prepended</span><span> to the left of the leftmost </span><code><span>'@'</span></code><span>
character. These have no effect on the minimum and maximum significant digits counts, but may be
used to position grouping separators. For example, </span><code><span>"#,#@#"</span></code><span>
indicates a minimum of one significant digit, a maximum of two significant digits, and a
grouping size of three. </span></li>
<li><span>The number of significant digits has no effect on parsing. </span></li>
<li><span>Significant digits may be used together with exponential notation. Such patterns are
equivalent to a normal exponential pattern with a minimum and maximum integer digit count of
one, a minimum fraction digit count of </span><code><span>Minimum Significant Digits - 1</span></code><span>,
and a maximum fraction digit count of </span><code><span>Maximum Significant Digits - 1</span></code><span>.
For example, the pattern </span><code><span>"@@###E0"</span></code><span> is equivalent to
</span><code><span>"0.0###E0"</span></code><span>. </span></li>
</ul>
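<p>The minimum and maximum significant digit counts can be illustrated with the Python sketch below, which reproduces the rows of the table above. The name <code>format_significant</code> is hypothetical, and the sketch leaves out grouping, exponential notation and the case where rounding carries into a new leading digit.</p>

```python
from decimal import Decimal, ROUND_HALF_EVEN


def format_significant(value, min_sig, max_sig):
    """Format with min_sig..max_sig significant digits (hypothetical helper)."""
    number = Decimal(str(value))
    # Power of ten of the most significant digit.
    magnitude = number.adjusted()
    # Round so that at most max_sig significant digits remain.
    rounded = number.quantize(
        Decimal(1).scaleb(magnitude - max_sig + 1), rounding=ROUND_HALF_EVEN
    )
    int_part, _, frac_part = format(rounded, "f").partition(".")
    # Fraction digits that must be kept to reach min_sig significant digits.
    min_frac = max(0, min_sig - 1 - magnitude)
    # Trailing fractional zeros beyond the minimum are suppressed.
    frac_part = frac_part.rstrip("0").ljust(min_frac, "0")
    return int_part + ("." + frac_part if frac_part else "")


print(format_significant(12345, 3, 3))    # 12300
print(format_significant(3.14159, 2, 4))  # 3.142
print(format_significant(1.23004, 2, 4))  # 1.23
```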
<h4><span>Padding</span></h4>
<p><span>Patterns support padding the result to a specific width. In a pattern the pad escape
character, followed by a single pad character, causes padding to be parsed and formatted. The pad
escape character is '*'. For example, </span><code><span>"$*x#,##0.00"</span></code><span> formats
123 to </span><code><span>"$xx123.00"</span></code><span>, and 1234 to </span><code><span>
"$1,234.00"</span></code><span>. </span></p>
<ul>
<li><span>When padding is in effect, the width of the positive </span><span>subpattern</span><span>,
including prefix and suffix, determines the format width. For example, in the pattern </span>
<code><span>"* #0 o''clock"</span></code><span>, the format width is 10. </span></li>
<li><span>Some parameters which usually do not matter have meaning when padding is used, because
the pattern width is significant with padding. In the pattern "*</span><font face="Lucida Sans Unicode"><span> </span></font><span>##,##,#,##0.##",
the format width is 14. The initial characters "##,##," do not affect the grouping size or
maximum integer digits, but they do affect the format width. </span></li>
<li><span>Padding may be inserted at one of four locations: before the prefix, after the prefix,
before the suffix, or after the suffix. No padding can be specified in any other location. If
there is no prefix, before the prefix and after the prefix are equivalent, likewise for the
suffix. </span></li>
<li><span>When specified in a pattern, the code point immediately following the pad escape is
the pad character. This may be any character, including a special pattern character. That is,
the pad escape </span><em><span>escapes</span></em><span> the following character. If there is
no character after the pad escape, then the pattern is illegal. </span></li>
</ul>
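<p>A minimal Python sketch of the padding step, assuming the number has already been formatted and the format width has already been derived from the pattern. The function <code>apply_padding</code> and its <code>position</code> values are hypothetical names for the four locations listed above, and the width is counted in code points as a simplification.</p>

```python
def apply_padding(prefix, digits, suffix, width, pad_char, position="after_prefix"):
    """Insert pad characters at one of the four permitted locations.

    Hypothetical helper; 'digits' is the already formatted number.
    """
    body = prefix + digits + suffix
    # No padding is added when the result already fills the format width.
    padding = pad_char * max(0, width - len(body))
    if position == "before_prefix":
        return padding + body
    if position == "after_prefix":
        return prefix + padding + digits + suffix
    if position == "before_suffix":
        return prefix + digits + padding + suffix
    if position == "after_suffix":
        return body + padding
    raise ValueError("unknown pad position: " + position)


# The pattern "$*x#,##0.00" has a format width of 9 ("$#,##0.00").
width = len("$#,##0.00")
print(apply_padding("$", "123.00", "", width, "x"))    # $xx123.00
print(apply_padding("$", "1,234.00", "", width, "x"))  # $1,234.00
```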
<p><strong><span>Rounding</span></strong><span> </span></p>
<p><span>Patterns support rounding to a specific increment. For example, 1230 rounded to the
nearest 50 is 1250. Mathematically, rounding to specific increments is performed by dividing by
the increment, rounding to an integer, then multiplying by the increment. To take a more bizarre
example, 1.234 rounded to the nearest 0.65 is 1.3, as follows:</span></p>
<table border="1" cellpadding="0" cellspacing="0" style="border-collapse: collapse">
<tr>
<th>Original:</th>
<td>1.234</td>
</tr>
<tr>
<th>Divide by increment (0.65):</th>
<td>1.89846...</td>
</tr>
<tr>
<th>Round:</th>
<td>2</td>
</tr>
<tr>
<th>Multiply by increment (0.65):</th>
<td>1.3</td>
</tr>
</table>
<p><span>To specify a rounding increment in a pattern, include the increment in the pattern
itself. "#,#50" specifies a rounding increment of 50. "#,##0.05" specifies a rounding increment of
0.05. </span></p>
<ul>
<li><span>Rounding only affects the string produced by formatting. It does not affect parsing or
change any numerical values. </span></li>
<li><span>An implementation may allow the specification of a </span><em><span>rounding mode</span></em><span>
to determine how values are rounded. In the absence of such choices, the default is to round
"half-even", as described in IEEE arithmetic. That is, it rounds towards the "nearest neighbor"
unless both neighbors are equidistant, in which case, it rounds towards the even neighbor.
In other words, it behaves as round "half-up" if the digit to the left of the discarded fraction
is odd, and as round "half-down" if it is even. Note that this is the rounding mode that
minimizes cumulative error when applied repeatedly over a sequence of calculations.</span></li>
<li><span>Some locales use rounding in their currency formats to reflect the smallest currency
denomination. </span></li>
<li><span>In a pattern, digits '1' through '9' specify rounding, but otherwise behave
identically to digit '0'. </span></li>
</ul>
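<p>The divide, round, multiply procedure for rounding increments can be sketched in Python as follows. The function name <code>round_to_increment</code> is hypothetical, and half-even is used as the default rounding mode, as described above.</p>

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_to_increment(value, increment, rounding=ROUND_HALF_EVEN):
    """Round value to the nearest multiple of increment (hypothetical helper)."""
    number = Decimal(str(value))
    step = Decimal(str(increment))
    # Divide by the increment and round the quotient to an integer ...
    steps = (number / step).quantize(Decimal(1), rounding=rounding)
    # ... then multiply by the increment again.
    return steps * step


print(round_to_increment(1230, 50))     # 1250
print(round_to_increment(1.234, 0.65))  # 1.30
```

<p>The second result prints as 1.30 because <code>Decimal</code> keeps the scale of the increment; it is numerically equal to 1.3.</p>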
<dl>
<dt> </dt>
<dt><b>decimalFormats</b></dt>
<dd>The normal locale-specific way to write a base 10 number.</dd>
<dt><b><span>currencyFormats</span></b><span> </span></dt>
<dd><span>Use \u00A4 where the local currency symbol should be. Doubling the currency symbol
(\u00A4\u00A4) will output the international currency symbol (a 3-letter code). </span></dd>
<dt><b><span>percentFormats</span></b><span> </span></dt>
<dd><span>Pattern for use with percentage formatting. </span></dd>
<dt><b><span>scientificFormats</span></b><span> </span></dt>
<dd><span>Pattern for use with scientific (exponent) formatting. </span></dd>
</dl>
<h3><b><span>Quoting rules</span></b></h3>
<blockquote>
<p><span>Single quotes, (</span><b><span>'</span></b><span>), enclose bits of the pattern that
should be treated literally. Inside a quoted string, two single quotes ('') are replaced with a
single one ('). For example: </span><tt><u><span>'X '</span></u><span>#</span><u><span>' Q '</span></u></tt><span>
-> </span><b><span>X 1939 Q </span></b><span>(Literal strings </span><u><span>underlined</span></u><span>.)</span></p>
</blockquote>
<h3><a name="NumberElements"><span>Number Elements</span></a></h3>
<blockquote>
<p><span>Localized symbols used in number formatting and parsing.</span></p>
</blockquote>
<dl>
<dt><b><span>decimal</span></b><span> </span></dt>
<dd><span>- separates the integer and fractional part of the number. </span></dd>
<dt><b><span>group</span></b><span> </span></dt>
<dd><span>- groups (for example) units of thousands: 10<sup>6</sup> = 1,000,000. The grouping
separator is commonly used for thousands, but in some countries for ten-thousands. The interval
is a constant number of digits between the grouping characters, such as 100,000,000 or
1,0000,0000. If you supply a pattern with multiple grouping characters, the interval between the
last one and the end of the integer is the one that is used. So "#,##,###,####" == "######,####"
== "##,####,####". </span></dd>
<dt><b><span>list</span></b><span> </span></dt>
<dd><span>- separates lists of numbers </span></dd>
<dt><b><span>percentSign</span></b><span> </span></dt>
<dd><span>- symbol used to indicate a percentage (1/100th) amount. (If present, the value is
also multiplied by 100 before formatting. That way 1.23 → 123%) </span></dd>
<dt><b><span>nativeZeroDigit</span></b><span> </span></dt>
<dd><span>- Symbol used to indicate a digit in the pattern, or zero if that place would
otherwise be empty. For example, with the digit of '0', the pattern "000" would format "34" as
"034", but the pattern "0" would format "34" as just "34". As well, the digits 1-9 are expected
to follow the code point of this specified 0 value. </span></dd>
<dt><b><span>patternDigit</span></b><span> </span></dt>
<dd><span>- Symbol used to indicate any digit value, typically #. When that digit is zero, then
it is not shown. </span></dd>
<dt><b><span>minusSign</span></b><span> </span></dt>
<dd><span>- Symbol used to denote negative value. </span></dd>
<dt><b><span>plusSign</span></b><span> </span></dt>
<dd><span>- Symbol used to denote positive value. </span></dd>
<dt><b><span>exponential</span></b><span> </span></dt>
<dd><span>- Symbol separating the mantissa and exponent values. </span></dd>
<dt><b><span>perMille</span></b><span> </span></dt>
<dd><span>- symbol used to indicate a per-mille (1/1000th) amount. (If present, the value is
also multiplied by 1000 before formatting. That way 1.23 → 1230 [1/1000]) </span></dd>
<dt><b><span>infinity</span></b><span> </span></dt>
<dd><span>- The infinity sign. Corresponds to the IEEE infinity bit pattern. </span></dd>
<dt><b><span>nan - Not a number</span></b><span> </span></dt>
<dd><span>- The </span><span>NaN</span><span> sign. Corresponds to the IEEE </span><span>NaN</span><span>
bit pattern. </span></dd>
<dt><b><span>currencySeparator</span></b><span> </span></dt>
<dd><span>This is used as the decimal separator in currency formatting/parsing, instead of the
</span><span>DecimalSeparator</span><span> from the </span><a href="#NumberElements"><span>
NumberElements</span></a><span> list. This item is optional in the </span><span>CLDR</span><span>.
</span></dd>
<dt><b><span>currencyGroup</span></b><span> </span></dt>
<dd><span>This is used as the grouping separator in currency formatting/parsing, instead of the
</span><span>group</span><span> separator from the </span><a href="#NumberElements"><span>
NumberElements</span></a><span> list. This item is optional in the </span><span>CLDR</span><span>.
</span></dd>
</dl>
<h2><span>Appendix H: <a name="Choice_Patterns">Choice Patterns</a></span></h2>
<p><span>A choice pattern is a string that chooses among a number of strings, based on numeric
value. It has the following form:</span></p>
<p><span><</span><span>choice_pattern</span><span>> = <choice> ( '|' <choice> )*<br>
<choice> = <number><relation><string><br>
<number> = ('+' | '-')? (</span><font SIZE="3"><span>'∞' | [0-9]+ ('.' [0-9]+)?)<br>
<relation> = '<' | '</span></font><span style="color: blue">≤'</span></p>
<p><span>The interpretation of a choice pattern is that given a number N, the pattern is scanned
from right to left, for each choice evaluating <number> <relation> N. The first choice that
matches results in the corresponding string. If no match is found, then the first string is used.
For example:</span></p>
<table border="1" cellpadding="0" cellspacing="0">
<tr>
<td width="33%"><span>Pattern</span></td>
<td width="33%"><span>N</span></td>
<td width="34%"><span>Result</span></td>
</tr>
<tr>
<td width="33%" rowspan="4"><span style="color: blue">0≤Rf|1≤Ru|1<Re</span></td>
<td width="33%"><span>-</span><font SIZE="3"><span>∞, </span></font><span>-3, -1, -0.000001</span></td>
<td width="34%"><span>Rf (defaulted to first string)</span></td>
</tr>
<tr>
<td width="33%"><span>0, 0.01, 0.9999</span></td>
<td width="34%"><span>Rf</span></td>
</tr>
<tr>
<td width="33%"><span>1</span></td>
<td width="34%"><span>Ru</span></td>
</tr>
<tr>
<td width="33%"><span>1.00001, 5, 99, </span><font SIZE="3"><span>∞</span></font></td>
<td width="34%"><span>Re</span></td>
</tr>
</table>
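<p>The right-to-left evaluation can be illustrated with the following Python sketch. It is a simplified reading of the grammar above: the function names are hypothetical, and quoting with ' characters is not handled, so a choice string may not contain '|'.</p>

```python
import re

# <number><relation><string>, following the grammar above (quoting not handled).
CHOICE = re.compile(r"([+-]?(?:∞|[0-9]+(?:\.[0-9]+)?))([<≤])(.*)", re.S)


def parse_number(text):
    """Convert a <number> token to a float, mapping '∞' to infinity."""
    if text.endswith("∞"):
        return float("-inf") if text.startswith("-") else float("inf")
    return float(text)


def choose(pattern, n):
    """Return the string selected by a choice pattern for the number n."""
    choices = []
    for part in pattern.split("|"):
        match = CHOICE.fullmatch(part)
        if match is None:
            raise ValueError("malformed choice: " + part)
        number, relation, string = match.groups()
        choices.append((parse_number(number), relation, string))
    # Scan from right to left; the first choice where <number> <relation> N holds wins.
    for limit, relation, string in reversed(choices):
        if limit < n or (relation == "≤" and limit == n):
            return string
    # No match: default to the first string.
    return choices[0][2]


print(choose("0≤Rf|1≤Ru|1<Re", 1))    # Ru
print(choose("0≤Rf|1≤Ru|1<Re", -3))   # Rf
```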
<p><span>Quoting is done using ' characters, as in date or number formats.</span></p>
<h2><span>Appendix I: <a name="Inheritance_and_Validity">Inheritance and Validity</a></span></h2>
<p><span>The following describes in more detail how to determine the exact inheritance of
elements, and the validity of a given element in </span><span>LDML</span><span>.</span></p>
<h3><span>Definitions</span></h3>
<p><span>Attributes that serve to distinguish multiple elements at the same level are called <i>
distinguishing</i> attributes. These currently consist of </span><span class="attribute">key</span><span>,
</span><span class="attribute">registry</span><span>, </span><span class="attribute">alt</span><span>,
and </span><span class="attribute">type</span><span> (except for the </span>
<span class="attribute">type</span><span> attribute on the elements </span><span class="element">
default</span><span> and </span><span class="element">mapping</span><span>).</span></p>
<p><span><i>Blocking</i> elements are those whose subelements do not inherit from parent locales.
For example, a <collation> element is a blocking element: everything in a <collation> element is
treated as a single lump of data, as far as inheritance is concerned.</span></p>
<p><span>Certain elements are called <i>attribute-information</i> elements. They do not have
element content; their information is carried in their attribute values. This is unlike the other
elements, whose attributes are used to distinguish different types of data.</span></p>
<p><span>A list of blocking and <i>attribute-information</i> elements is found in
<a href="#valid_attribute_values">Appendix K: Valid Attribute Values</a>.</span></p>
<p><span>For any element in an XML file, </span><i><span>an element chain</span></i><span> is a
resolved </span><span>XPath</span><span> leading from the root to an element, with attributes on
each element in alphabetical order. So in, say, </span>
<a href="http://unicode.org/cldr/data/common/main/el.xml"><span>
http://unicode.org/cldr/data/common/main/el.xml</span></a><span> we may have:</span></p>
<pre><span><</span><span>ldml</span><span> version="1.1">
<identity>
<version number="1.1" />
<generation date="2004-06-04" />
<language type="el" />
</identity>
<</span><span>localeDisplayNames</span><span>>
<languages>
<language type="</span><span>ar</span><span>">Αραβικά</language>
...</span></pre>
<p><span>Which gives the following element chains (among others):</span></p>
<ul>
<li><span>//</span><span>ldml[@version="1.1"]/identity/version[@number="1.1"]</span></li>
<li><span>//</span><span>ldml[@version="1.1"]/localeDisplayNames</span><span>/languages/language[@type="</span><span>ar"]</span></li>
</ul>
<p><span>An element chain A is an </span><i><span>extension</span></i><span> of an element chain B
if B is equivalent to an initial portion of A. For example, #2 below is an extension of #1.
(Equivalent, depending on the tree, may not be "identical to". See below for an example.)</span></p>
<ol>
<li><span>//</span><span>ldml[@version="1.1"]/localeDisplayNames</span></li>
<li><span>//</span><span>ldml[@version="1.1"]/localeDisplayNames</span><span>/languages/language[@type="</span><span>ar"]</span></li>
</ol>
<p><span>An </span><span>LDML</span><span> file can be thought of as an ordered list of </span><i>
<span>element pairs</span></i><span>: <element chain, data>, where the element chains are all the
chains for the end-nodes. (This works because of restrictions on the structure of </span><span>
LDML</span><span>, including that it doesn't allow mixed content.) The ordering is the ordering
that the element chains are found in the file, and thus determined by the </span><span>DTD</span><span>.</span></p>
<p><span>For example, some of those pairs would be the following. Notice that the first has the
null string as element contents.</span></p>
<ul>
<li><span><b><</b>//</span><span>ldml[@version="1.1"]/identity/version[@number="1.1"]<b>, </b>""<b>></b></span></li>
<li><span><b><</b>//</span><span>ldml[@version="1.1"]/localeDisplayNames</span><span>/languages/language[@type="</span><span>ar"]<b>,
</b>"Αραβικά"<b>></b></span></li>
</ul>
<blockquote>
<p><span><b>Note: </b>There are two exceptions to this:</span></p>
<ol>
<li><span>Blocking nodes and their contents are treated as a single end node. </span></li>
<li><span>For attribute-information elements, in terms of computing inheritance, the element
pair consists of the element chain minus the attributes in the final element and the value is
the list of attributes for that final element. </span></li>
</ol>
<blockquote>
<p><span>Thus instead of the element pair being (a) below, it is (b):</span></p>
<ol type="a">
<li><span><b><</b>//ldml[@version="1.1"]/dates/calendars/calendar[@type='gregorian']/week/weekendStart[@day='sun'][@time='00:00']<b>,</b><br>
<b>""></b></span></li>
<li><span><b><</b>//ldml[@version="1.1"]/dates/calendars/calendar[@type='gregorian']/week/weekendStart<b>,</b><br>
[@day='sun'][@time='00:00']<b>></b></span></li>
</ol>
</blockquote>
</blockquote>
<p><span>Two </span><span>LDML</span><span> element chains are </span><i><span>equivalent</span></i><span>
when they would be identical if all attributes and their values were removed<font face="Times New Roman">
—</font>except for distinguishing attributes. Thus the following are equivalent:</span></p>
<ul>
<li><span><code>//</code></span><code><span>ldml[@version="1.1"]/localeDisplayNames/languages/language[@type="</span></code><span><code>ar"]</code></span></li>
<li><span><code>//</code></span><code><span>ldml[@version="1.1"]/localeDisplayNames/languages/language[@type="</span></code><span><code>ar"][@draft="<span class="changedspan">unconfirmed</span>"]</code></span></li>
</ul>
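<p>One way to test equivalence is to strip every non-distinguishing attribute from both chains and compare the results, as in the Python sketch below. The helper names are hypothetical, and the sketch simplifies in two ways: it does not implement the exception for the <code>type</code> attribute on <code>default</code> and <code>mapping</code> elements, and it assumes both chains already use the same attribute order and quoting style.</p>

```python
import re

# Distinguishing attributes as listed in the Definitions above.
DISTINGUISHING = {"key", "registry", "alt", "type"}

# Matches one attribute predicate such as [@type="ar"] or [@day='sun'].
ATTRIBUTE = re.compile(r"""\[@([A-Za-z_][\w.-]*)=("[^"]*"|'[^']*')\]""")


def normalize_chain(chain):
    """Remove all attributes except distinguishing ones (hypothetical helper)."""
    def keep(match):
        return match.group(0) if match.group(1) in DISTINGUISHING else ""
    return ATTRIBUTE.sub(keep, chain)


def equivalent(chain_a, chain_b):
    """True if the two element chains are equivalent under this simplification."""
    return normalize_chain(chain_a) == normalize_chain(chain_b)


a = '//ldml[@version="1.1"]/localeDisplayNames/languages/language[@type="ar"]'
b = '//ldml[@version="1.1"]/localeDisplayNames/languages/language[@type="ar"][@draft="unconfirmed"]'
print(equivalent(a, b))  # True
```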
<p><span>For any locale ID, a </span><i><span>locale chain</span></i><span> is an ordered list
starting with the root and leading down to the ID. For example:</span></p>
<blockquote>
<p><span><root, de, <span class="changedspan">de_DE, de_DE_xxx</span>></span></p>
</blockquote>
<h3><span>Resolved Data File</span></h3>
<p><span>To produce a fully resolved locale data file from CLDR for a locale ID L, you start with
<span>L, and successively add unique</span> items from the <span>parent</span> locales until you
get <span>up to root</span>. More formally, this can be expressed as the following procedure.</span></p>
<ol>
<li><span>Let Result be initially empty.</span></li>
<li><span>For each Li in the locale chain for L<span>, starting at L and going up to root:</span></span><ol>
<li><span><span>Let Temp be a copy of the pairs in the LDML file for Li</span></span></li>
<li><span>Replace each alias in Temp by the list of pairs it points to.</span></li>
<li><span>For each element pair P in <span>Temp</span>:</span><ol>
<li><span>If </span><span><span>P does not contain a blocking element, and </span>Result
<span>does not </span>have an element pair Q with an equivalent element chain, <span>add P
to Result</span>.</span></li>
</ol>
</li>
</ol>
</li>
</ol>
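<p>The core of this procedure can be sketched in Python as below. It is deliberately incomplete: alias replacement and the handling of blocking elements are omitted, the result is not reordered according to the DTD, and the data layout (a dictionary from locale ID to a list of element pairs) and the <code>key</code> parameter used to decide equivalence are assumptions made for the example.</p>

```python
def resolve(locale_chain, files, key=lambda chain: chain):
    """Merge element pairs from L up to root, keeping the most specific value.

    locale_chain: locale IDs ordered from root down to L, e.g. ["root", "de", "de_DE"].
    files: dict mapping a locale ID to a list of (element_chain, data) pairs.
    key: maps an element chain to a form in which equivalent chains compare equal
         (identity by default, which is an assumption of this sketch).
    """
    result = {}
    # Start at L and go up to root.
    for locale in reversed(locale_chain):
        for chain, data in files.get(locale, []):
            # Add the pair only if no equivalent chain is already in the result.
            result.setdefault(key(chain), (chain, data))
    return list(result.values())


files = {
    "root": [("//ldml/a", "root-a"), ("//ldml/b", "root-b")],
    "de": [("//ldml/a", "de-a")],
}
print(resolve(["root", "de"], files))
# [('//ldml/a', 'de-a'), ('//ldml/b', 'root-b')]
```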
<p><span>Note: when adding an element pair to a result, it has to go in the right order for it to
be valid according to the </span><span>DTD</span><span>.</span></p>
<h3><b><span>Valid Data</span></b></h3>
<p><span>The attribute </span><i><span>draft="<span class="changedspan">unconfirmed"
or draft="provisional</span>" </span></i><span> in </span><span>LDML</span><span>
means that the data has not been approved for release. However, some data that is not explicitly
marked as </span><i><span><span class="changedspan">unconfirmed or provisional</span> </span></i>
<span>
may be implicitly </span><i><span><span class="changedspan">unconfirmed or
provisional</span></span></i><span>, either because it inherits it from a parent, or from an
enclosing element. </span></p>
<p><b><span>Example 2. </span></b><span>Suppose that new locale data is added for </span><span>af</span><span>
(</span><span>Afrikaans</span><span>). To indicate that all of the data is </span>
<span class="changedspan"><i><span>unconfirmed</span></i></span><span>, the attribute
can be added to the top level.</span></p>
<p><code><span><ldml version="1.1" draft="<span class="changedspan">unconfirmed</span>"><br>
<identity><br>
<version number="1.1" /> <br>
<generation date="2004-06-04" /> <br>
<language type="af" /> <br>
</identity><br>
<characters><span class="changedspan">...</characters></span><br>
<localeDisplayNames><span class="changedspan">...</localeDisplayNames></span><br>
</ldml></span></code></p>
<p><span>Any data can be added to that file, and the status will all be draft<span class="changedspan">="unconfirmed"</span>. Once an item is
vetted -- </span><i><span>whether it is inherited or explicitly in the file</span></i><span> --
then its status can be changed to <span class="changedspan"><i>approved</i></span>. This can be done by leaving draft="<span class="changedspan">unconfirmed</span>" on
the enclosing element and marking the child with draft="<span class="changedspan">approved</span>", such as:</span></p>
<p><code><span><ldml version="1.1" draft="<span class="changedspan">unconfirmed</span>"><br>
<identity><br>
<version number="1.1" /> <br>
<generation date="2004-06-04" /> <br>
<language type="af" /> <br>
</identity><br>
<characters draft="</span></code><span><span class="changedspan"><code>approved</code></span></span><code><span>"><span class="changedspan">...</characters></span><br>
<localeDisplayNames><span class="changedspan">...</localeDisplayNames></span><br>
<dates/><br>
<numbers/><br>
<collations/><br>
</ldml></span></code></p>
<p><span>However, normally the draft status should be canonicalized, which
means it is pushed down to leaf nodes: see </span><i>Appendix L: <a href="#Canonical_Form">Canonical Form</a></i>.</p>
<blockquote>
<p><b><span>Note: </span></b><span>A missing draft attribute is </span><i><span>not</span></i><span>
the same as either a true or false value. A missing attribute means instead: </span><i><span>
inherit</span></i><span> the draft status from enclosing elements and parent locales.</span></p>
</blockquote>
<p><span>The attribute </span><i><span>validSubLocales</span></i><span> allows sublocales in a
given tree to be treated as though a file for them were present when there isn't one. It can be
applied to any element. It only has an effect for locales that inherit from the current file where
a file is missing, and the elements wouldn't otherwise be draft.</span></p>
<p><b><span>Example 1. </span></b><span>Suppose that in a particular </span><span>LDML</span><span>
tree, there are no region locales for German, e.g. there is a </span><span>de.xml</span><span>
file, but no files for de_AT.xml</span><span>, de_CH.xml</span><span>, or de_DE.xml</span><span>.
Then no elements are valid for any of those region locales. If we want to mark one of those files
as having valid elements, then we introduce an empty file, such as the following.</span></p>
<p><code><span><ldml version="1.1"><br>
<identity><br>
<version number="1.1" /> <br>
<generation date="2004-06-04" /> <br>
<language type="de" /> <br>
<territory type="AT" /> <br>
</identity><br>
</ldml></span></code></p>
<p><span>With the </span><i><span>validSubLocales</span></i><span> attribute, instead of adding
the empty files for de_AT.xml</span><span>, de_CH.xml</span><span>, and de_DE.xml</span><span>, in
the de file we can add to the parent locale a list of the child locales that should behave as if
files were present.</span></p>
<p><code><span><ldml version="1.1" validSubLocales="de_AT de_CH de_DE"><br>
<identity><br>
<version number="1.1" /> <br>
<generation date="2004-06-04" /> <br>
<language type="de" /> <br>
</identity><br>
...<br>
</ldml></span></code></p>
<p><span>More formally, here is how to determine whether data for an element chain E is implicitly
or explicitly draft, given a locale L. Items 1, 2, and 4 are simply formalizations of what is
in </span><span>LDML</span><span> already. Item 3 adds the new element.</span></p>
<h4><span>Checking for Draft Status:</span></h4>
<ol>
<li><b><span>Parent Locale Inheritance</span></b><ol>
<li><span>Walk through the locale chain until you find a locale ID L' with a data file D. (L'
may equal L). </span></li>
<li><span>Produce the fully resolved data file D' for D.</span></li>
<li><span>In D', find the first element pair whose element chain E' is either equivalent to or
an extension of E.</span></li>
<li><span>If there is no such E', return </span><i><span>true</span></i></li>
<li><span>If E' is not equivalent to E, truncate E' to the length of E.</span></li>
</ol>
</li>
<li><b><span>Enclosing Element Inheritance</span></b><ol>
<li><span>Walk through the elements in E', from back to front.</span><ol>
<li><span>If you ever encounter draft=</span><i><span>x</span></i><span>, return </span><i>
<span>x</span></i></li>
</ol>
</li>
<li><span>If L' = L, return </span><i><span>false</span></i></li>
</ol>
</li>
<li><span><b>Missing File Inheritance</b></span><ol>
<li><span>Otherwise, walk again through the elements in E', from back to front.</span><ol>
<li><span>If you encounter a validSubLocales attribute:</span><ol>
<li><span>If L is in the attribute value, return <i>false</i></span></li>
<li><span>Otherwise return <i>true</i></span></li>
</ol>
</li>
</ol>
</li>
</ol>
</li>
<li><b><span>Otherwise</span></b><ol>
<li><span>Return </span><i><span>true</span></i></li>
</ol>
</li>
</ol>
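<p>Items 2 to 4 of this check can be sketched in Python as follows, assuming item 1 has already been carried out. The function name and the data layout are hypothetical: E' is passed as a list of attribute dictionaries, one per element from the root element to the last element, or as <code>None</code> when no E' was found; <code>file_locale</code> stands for L'.</p>

```python
def draft_status(elements, locale, file_locale):
    """Return the draft status for locale L, given E' found in the file for L'.

    elements: list of attribute dicts for E', root element first, or None.
    locale: the requested locale ID L.
    file_locale: the locale ID L' whose data file was actually found.
    """
    # 1.4: no matching element chain at all.
    if elements is None:
        return True
    # 2. Enclosing element inheritance: the innermost draft attribute wins.
    for attributes in reversed(elements):
        if "draft" in attributes:
            return attributes["draft"]
    if file_locale == locale:
        return False
    # 3. Missing file inheritance: look for the innermost validSubLocales.
    for attributes in reversed(elements):
        if "validSubLocales" in attributes:
            return locale not in attributes["validSubLocales"].split()
    # 4. Otherwise.
    return True


chain = [{"version": "1.1", "validSubLocales": "de_AT de_CH de_DE"}, {}]
print(draft_status(chain, "de_AT", "de"))  # False
print(draft_status(chain, "de_LU", "de"))  # True
```

<p>As in the procedure above, the return value is the value of the draft attribute when one is found, and a boolean otherwise.</p>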
<p><span>The </span><span>validSubLocales</span><span> in the most specific (farthest from the root
file) locale file "wins" through the full resolution step (data from more specific files replacing
data from less specific ones).</span></p>
<h3><span>Keyword and Default Resolution</span></h3>
<p><span>When accessing data based on keywords, the following process is used. Consider the
following example:<br>
<br>
The locale 'de' has collation types A, B, C, and no <default> element<br>
The locale 'de_CH' has <default type='B'><br>
<br>
Here are the searches for various combinations.</span></p>
<table border="1" cellpadding="0" cellspacing="0">
<tr>
<td rowspan="4"><span>1.</span></td>
<td><span>de_CH</span></td>
<td><span>not found</span></td>
</tr>
<tr>
<td><span>de</span></td>
<td><span>not found</span></td>
</tr>
<tr>
<td><span>root</span></td>
<td><span>not found: so get the default type in de_CH</span></td>
</tr>
<tr>
<td><span>de@collation=B</span></td>
<td><i><span>found</span></i></td>
</tr>
<tr>
<td rowspan="4"><span>2.</span></td>
<td><span>de</span></td>
<td><span>not found</span></td>
</tr>
<tr>
<td><span>root</span></td>
<td><span>not found: so get the default type in de, which itself falls back to root</span></td>
</tr>
<tr>
<td><span>de@collation=standard</span></td>
<td><span>not found</span></td>
</tr>
<tr>
<td><span>root@collation=standard</span></td>
<td><i><span>found</span></i></td>
</tr>
<tr>
<td><span>3.</span></td>
<td><span>de@collation=A</span></td>
<td><i><span>found</span></i></td>
</tr>
<tr>
<td rowspan="2"><span>4.</span></td>
<td><span>de@collation=standard</span></td>
<td><span>not found</span></td>
</tr>
<tr>
<td><span>root@collation=standard</span></td>
<td><i><span>found</span></i></td>
</tr>
</table>
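<p>The searches in the table can be sketched in code. This is only an illustration under stated assumptions: the <code>PARENT</code>, <code>DEFAULT</code> and <code>TYPES</code> tables and the function names are hypothetical, root is assumed to carry a default type of 'standard' together with a 'standard' collation, and a requested type that is found nowhere is assumed to fall back to the default type.</p>

```python
from typing import Dict, Iterator, Optional, Set, Tuple

# Hypothetical locale tree and collation data mirroring the example above.
PARENT: Dict[str, Optional[str]] = {"de_CH": "de", "de": "root", "root": None}
DEFAULT: Dict[str, Optional[str]] = {"de_CH": "B", "de": None, "root": "standard"}
TYPES: Dict[str, Set[str]] = {
    "de_CH": set(),
    "de": {"A", "B", "C"},
    "root": {"standard"},
}


def chain(locale: str) -> Iterator[str]:
    """Yield the locale itself, then each parent, ending with root."""
    current: Optional[str] = locale
    while current is not None:
        yield current
        current = PARENT[current]


def find_type(locale: str, wanted: str) -> Optional[str]:
    """Return the first locale in the chain that defines the wanted type."""
    for loc in chain(locale):
        if wanted in TYPES[loc]:
            return loc
    return None


def resolve_collation(locale: str, requested: Optional[str] = None) -> Tuple[str, str]:
    """Return (locale, type) naming where the collation data is found."""
    if requested is not None:
        found = find_type(locale, requested)
        if found is not None:
            return found, requested
    # No keyword was given, or the requested type exists nowhere in the
    # chain: take the first default type seen while walking toward root.
    default = next(DEFAULT[loc] for loc in chain(locale) if DEFAULT[loc] is not None)
    found = find_type(locale, default)
    if found is None:
        raise LookupError(f"default type {default!r} is not defined for {locale}")
    return found, default
```

<p>With this data, <code>resolve_collation("de_CH")</code> ends at type B in de, and <code>resolve_collation("de")</code> ends at the standard type in root, matching cases 1 and 2 of the table.</p>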
<p><span><br>
<b>Note: </b>It is an invariant that the default in root for a given element must
always be a value that exists in root. So you can't have the following in root:</span></p>
<p><span><someElements><br>
<default type='a'/><br>
<someElement type='b'>...</someElement><br>
<someElement type='c'>...</someElement><br>
<b> <!-- no 'a' --></b><br>
</someElements></span></p>
<p><span>It is not required, but it is strongly encouraged, that the default type in root be
'standard'.</span></p>
<p><span>For identifiers, such as language codes, script codes, region codes, variant codes,
types, keywords, currency symbols or currency display names, the default value is the identifier
itself whenever no value is found in root. Thus if there is no display name for the region
code 'QA' in root, then the display name is simply 'QA'. </span></p>
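<p>A minimal sketch of this identifier fallback, assuming the display names have already been fully resolved down to root into a plain mapping (the function and parameter names are hypothetical):</p>

```python
from typing import Mapping


def display_name(code: str, resolved_names: Mapping[str, str]) -> str:
    """Return the display name for code, or the code itself if none exists."""
    return resolved_names.get(code, code)
```

<p>For a mapping without an entry for 'QA', <code>display_name("QA", names)</code> simply returns 'QA'.</p>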
<h2><span>Appendix J: <a name="Time_Zone_Fallback">Time Zone <span class="changedspan">Display
Names</span></a></span></h2>
<p><span class="changedspan">There are three types of formats for zone identifiers: GMT, generic
(wall time), and standard/daylight. Standard and daylight are equivalent to a particular offset
from GMT, and can be represented by a GMT offset as a fallback. In general, this is not true for
the generic format, which is used for picking
timezones or for conveying a timezone for specifying a recurring time (such as a meeting in a
calendar). For either purpose, a GMT offset would lose information.</span></p>
<p><span class="changedspan">When a timezone is to be displayed, the following process is used. It
uses explicit display names where they are available, and otherwise uses a fallback to GMT for
non-wall time (standard and daylight). For generic, it falls back to the exemplar city if
available, otherwise the country if possible, and otherwise the last field of the zone ID. Only the generic time (or its fallback) should be used in menus, in order to avoid possible collisions in the display names of standard and daylight time.</span></p>
<p><span class="changedspan">Each step is followed until a "return" is reached. Some of the
examples are drawn from real data, while for illustration the region format is "Tampo de {0}". The
fallback format is "{0} ({1})", which is what is in root. </span></p>
<ol>
<li><span class="changedspan">Canonicalize the </span><i><span class="changedspan">TZ</span></i><span class="changedspan">
ID according to the <timezoneData> table in supplemental data. Use that canonical TZID in each
of the following steps.</span><ul>
<li><span class="changedspan">America/Atka → America/Adak</span></li>
<li><span class="changedspan">Australia/ACT → Australia/Sydney</span></li>
</ul>
</li>
<li><span class="changedspan">For RFC 822 format ("Z") return the results according to the RFC.</span><ul>
<li><span class="changedspan">America/Los_Angeles → "-0800"</span></li>
</ul>
</li>
<li><span class="changedspan">If there is an explicit translation for the TZID according to type
(generic, standard, or daylight) in the resolved locale, return it.</span><ul>
<li><span class="changedspan">America/Los_Angeles → "Heure du Pacifique (ÉUA)" // generic</span></li>
<li><span class="changedspan">America/Los_Angeles → 太平洋標準時 // standard</span></li>
<li><span class="changedspan">America/Los_Angeles → Yhdysvaltain Tyynenmeren kesäaika //
daylight</span></li>
</ul>
<p><span class="changedspan"><b>Note: </b>This translation need not be literal at all: it should
be whatever is most recognizable to people using the target language.</span></li>
<li><span class="changedspan">For non-wall-time (i.e., GMT, daylight, or standard) or where there
is no country for the TZID (e.g., Etc/GMT+3), use the localized GMT format.</span><ul>
<li><span class="changedspan">America/Los_Angeles → "GMT-08:00" // standard time</span></li>
<li><span class="changedspan">America/Los_Angeles → "HMG-07:00" // daylight time</span></li>
<li><span class="changedspan">Etc/GMT+3 → "GMT-03:00" // note that </span><i>
<span class="changedspan">TZ</span></i><span class="changedspan"> tzids have inverse polarity!</span></li>
</ul>
</li>
<li><span class="changedspan">Thus the remaining steps are only applicable to the generic
format. In these steps, use as the country name the explicitly localized country name if
available, otherwise the raw country code. If the localized exemplar city is not available, use
as the exemplar city the last field of the raw TZID, stripping off the prefix and turning _ into
space.</span><ul>
<li><span class="changedspan">CU → "CU" // no localized country name for Cuba</span></li>
<li><span class="changedspan">America/Los_Angeles → "Los Angeles" // no localized exemplar
city</span></li>
</ul>
</li>
<li><span class="changedspan">From <timezoneData> get the country code for the zone, and
determine whether there is only one timezone in the country. If there is only one timezone or
the zone id is in the singleCountries list, format the country name with the region format, and
return it.</span><ul>
<li><span class="changedspan">Africa/Monrovia → LR → "Tampo de Liberja"</span></li>
<li><span class="changedspan">America/Havana → CU → "Tampo de CU" // if CU is not localized</span></li>
</ul>
<p><span class="changedspan"><b>Note: </b>If a language does require grammatical changes when
composing strings, then it should either use a neutral format such as what is in root, or put
all exceptional cases in explicitly translated strings.</span></li>
<li><span class="changedspan">Get the exemplar city and country name, and format them with the
fallback format (as parameters 0 and 1, respectively).</span><ul>
<li><span class="changedspan">America/Buenos_Aires → "Буэнос-Айрес (Аргентина)"</span></li>
<li><span class="changedspan">America/Buenos_Aires → "Буэнос-Айрес (AR)" // if Argentina isn't
translated</span></li>
<li><span class="changedspan">America/Buenos_Aires → "Buenos Aires (Аргентина)" // if Buenos
Aires isn't</span></li>
<li><span class="changedspan">America/Buenos_Aires → "Buenos Aires (AR)" // if both aren't</span></li>
</ul>
<p><span class="changedspan"><b>Note: </b>As with the region format, exceptional cases need to
be explicitly translated.</span></li>
</ol>
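<p>The last three steps (the generic-format fallbacks) can be sketched as follows. This is an illustration only: <code>ZONE_COUNTRY</code> and <code>SINGLE_COUNTRIES</code> are hypothetical stand-ins for the supplemental data, the function names are invented, and the earlier steps (canonicalization, explicit translations, GMT formats) are assumed to have been applied already.</p>

```python
from collections import Counter
from typing import Dict, Mapping, Set

REGION_FORMAT = "Tampo de {0}"  # illustrative region format used in the text
FALLBACK_FORMAT = "{0} ({1})"   # fallback format, as found in root

# Hypothetical excerpt standing in for the zone-to-country data.
ZONE_COUNTRY: Dict[str, str] = {
    "Africa/Monrovia": "LR",
    "America/Havana": "CU",
    "America/Buenos_Aires": "AR",
    "America/Cordoba": "AR",
}
SINGLE_COUNTRIES: Set[str] = set()  # zone IDs treated as single-zone countries


def exemplar_city(tzid: str, cities: Mapping[str, str]) -> str:
    """Localized exemplar city, else the last TZID field with _ as space."""
    return cities.get(tzid) or tzid.rsplit("/", 1)[-1].replace("_", " ")


def generic_fallback(tzid: str, countries: Mapping[str, str],
                     cities: Mapping[str, str]) -> str:
    """Generic (wall time) name when no explicit translation exists."""
    code = ZONE_COUNTRY[tzid]
    # Localized country name if available, otherwise the raw country code.
    country = countries.get(code, code)
    zones_in_country = Counter(ZONE_COUNTRY.values())[code]
    if zones_in_country == 1 or tzid in SINGLE_COUNTRIES:
        return REGION_FORMAT.format(country)
    # Exemplar city is parameter 0 and the country name is parameter 1.
    return FALLBACK_FORMAT.format(exemplar_city(tzid, cities), country)
```

<p>Passing empty mappings for <code>countries</code> and <code>cities</code> exercises the raw-code and raw-TZID fallbacks shown in the examples above.</p>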
<p><span class="changedspan">In parsing, an implementation will be able to determine either the
zone id or a simple offset from GMT for anything formatted according to the above process. The
following process should be used, stopping at the first step that matches.</span></p>
<ol>
<li><span class="changedspan">Check for explicitly localized strings.</span><ul>
<li><span class="changedspan">"Tampo de Pacifica" → America/Los_Angeles</span></li>
</ul>
</li>
<li><span class="changedspan">Check for RFC 822 and localized GMT formats.</span><ul>
<li><span class="changedspan">"-0800" → Etc/GMT+8</span></li>
<li><span class="changedspan">"GMT-03:00" → Etc/GMT+3</span></li>
</ul>
</li>
<li><span class="changedspan">Check for <city, country> using the fallback format. Remember to
check for fallback localizations (last field of zone id and the raw country code).</span><ul>
<li><span class="changedspan">“Sydney (Australia)” → Australia/Sydney</span></li>
</ul>
</li>
<li><span class="changedspan">Check for localized <country> using the region format. Remember to
check for fallback localizations (raw country code).</span><ul>
<li><span class="changedspan">"Tampo de CU" → America/Havana</span></li>
</ul>
</li>
</ol>
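<p>The second parsing step, including the inverse polarity of the Etc/GMT zone IDs, might look like the sketch below. It assumes the unlocalized "GMT" prefix and whole-hour offsets; the regular expressions and the function name are hypothetical.</p>

```python
import re
from typing import Optional

RFC822 = re.compile(r"([+-])(\d{2})(\d{2})")
# Assumption: the GMT format is the unlocalized "GMT+hh:mm" pattern.
LOCALIZED_GMT = re.compile(r"GMT([+-])(\d{2}):(\d{2})")


def parse_offset(text: str) -> Optional[str]:
    """Map an RFC 822 or GMT-format string to an Etc/GMT zone ID."""
    match = RFC822.fullmatch(text) or LOCALIZED_GMT.fullmatch(text)
    if match is None:
        return None
    sign = match.group(1)
    hours, minutes = int(match.group(2)), int(match.group(3))
    if minutes != 0:
        return None  # this sketch handles only whole-hour offsets
    if hours == 0:
        return "Etc/GMT"
    # Etc/GMT IDs use the opposite sign from the displayed offset.
    flipped = "+" if sign == "-" else "-"
    return f"Etc/GMT{flipped}{hours}"
```

<p>Strings that match neither pattern return <code>None</code>, so the caller can continue with the remaining parsing steps.</p>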
<p><span class="changedspan">Using this process, a correct parse will roundtrip the generic format
(v and vvvv) back to the canonical zoneid.</span></p>
<ul>
<li><span class="changedspan">Australia/ACT → Australia/Sydney → “Sydney (Australia)” →
Australia/Sydney</span></li>
</ul>
<p><span class="changedspan">The GMT formats (Z and ZZZZ) will return an offset, and thus
lose the original canonical zone id.</span></p>
<ul>
<li><span class="changedspan">Australia/ACT → Australia/Sydney → "GMT+11:00" → GMT+11</span></li>
</ul>
<p><span class="changedspan">The daylight and standard time formats (z and zzzz) may either
roundtrip back to the original canonical zone id, or to just an offset, depending on the available
translation data. Thus:</span></p>
<ul>
<li><span class="changedspan">Australia/ACT → Australia/Sydney → "GMT+11:00" → GMT+11</span></li>
<li><span class="changedspan">PST8PDT → America/Los_Angeles → “PST” → America/Los_Angeles</span></li>
</ul>
<p><span class="changedspan">Parsing can be more lenient than the above, allowing for different
spacing, punctuation, or other variation.</span></p>
<p>Many time zone IDs only represent differences that are important historically, but do not make
any difference in modern times. The preferenceOrdering element can be used to select the preferred
modern IDs when desired<span class="changedspan">, either in presenting a list of localized
timezone names in a user interface, or in formatting</span>. (The choice of the period to use
<span class="changedspan">as "modern"</span> when determining when two time zone IDs are
equivalent is left to the implementation.)</p>
<p>Whenever two timezone IDs are equivalent in effect and are in the same country, the preference
ordering list is examined according to the following process. <span class="changedspan">When used
in formatting, this process is used to add additional canonicalization in Step 1 above.</span></p>
<ol>
<li>If both x and y are in the list, then the earlier one in the list is preferred. </li>
<li>Else if x is in the list and y isn't, then x is preferred. </li>
<li>Else if the current locale is not root, repeat #1 and #2 using the parent locale's list. </li>
<li>If all else fails, use a case-insensitive comparison of the timezone IDs. </li>
</ol>
<p>For example, the following table lists the modern equivalents for Mexico on separate rows. If
the preference ordering has one element: "America/Mexico_City", then the bolded items would be
chosen as the preferred timezone IDs.</p>
<table cellSpacing="0" cellPadding="3" border="1">
<tr>
<td>America/Merida, <b>America/Mexico_City, </b>America/Monterrey, America/Cancun</td>
</tr>
<tr>
<td><b>America/Chihuahua, </b>America/Mazatlan</td>
</tr>
<tr>
<td><b>America/Hermosillo</b></td>
</tr>
<tr>
<td><b>America/Tijuana</b></td>
</tr>
</table>
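<p>The Mexico example can be reproduced with a short sketch of the preference process. The function names are hypothetical, the preference lists are passed in order from the current locale up to root, and the final tie-break assumes that the ID sorting first in a case-insensitive comparison is the preferred one, which the text does not state explicitly.</p>

```python
from functools import reduce
from typing import Sequence


def preferred(x: str, y: str, lists: Sequence[Sequence[str]]) -> str:
    """Pick between two equivalent zone IDs in the same country."""
    for ordering in lists:  # current locale first, root last
        x_in, y_in = x in ordering, y in ordering
        if x_in and y_in:
            return x if ordering.index(x) < ordering.index(y) else y
        if x_in:
            return x
        if y_in:
            return y
    # Assumption: the ID that sorts first, ignoring case, wins.
    return min(x, y, key=str.lower)


def pick(group: Sequence[str], lists: Sequence[Sequence[str]]) -> str:
    """Reduce a group of equivalent zone IDs to the preferred one."""
    return reduce(lambda a, b: preferred(a, b, lists), group)
```

<p>With a single list containing only "America/Mexico_City", <code>pick</code> selects the bolded ID in each row of the table.</p>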
<p><span class="changedspan"><b>Note: </b>The hoursFormat and abbreviationFallback used in earlier
versions of this appendix are deprecated.</span></p>
<h2><span><a name="valid_attribute_values"></a>Appendix K: Valid Attribute Values</span></h2>
<p><span class="changedspan">The valid attribute values, as well as other validity information, are
contained in the metadata.xml file. (Some, but not all, of this information could have been
represented in XML Schema or a DTD.)</span></p>
<p><span class="changedspan"><metadata></span></p>
<p><span class="changedspan"><br>
<i>The following specify the ordering of elements / attributes in the file</i><br>
<elementOrder>ldml identity alias localeDisplayNames layout ...</elementOrder><br>
<attributeOrder>type key registry alt source path day date...</attributeOrder></span></p>
<p><span class="changedspan"><i>The suppress elements are those that are
suppressed in canonicalization.</i></span></p>
<p><span class="changedspan"><i>The serialElements are those that do not inherit,
and may have ordering</i><br>
<span style="background-color: #FFFF00"><serialElements>variable comment
tRule reset p pc s sc t tc q qc i ic x extend first_variable last_variable
first_tertiary_ignorable last_tertiary_ignorable first_secondary_ignorable
last_secondary_ignorable first_primary_ignorable last_primary_ignorable
first_non_ignorable last_non_ignorable first_trailing last_trailing<br>
</serialElements></span></span></p>
<p><span class="changedspan"><i>The validity elements give the possible attribute values.
They are in the format of a series of variables, followed by attributeValues.
</i></span></p>
<p><i><span class="changedspan"><span style="background-color: #FFFF00">
<variable id="$calendar" type="choice"><br>
buddhist coptic ethiopic chinese gregorian hebrew islamic islamic-civil
japanese arabic civil-arabic thai-buddhist persian<br>
</variable></span></span></i></p>
<p><span class="changedspan">The types indicate the style of match:</span></p>
<ul>
<li><span class="changed">choice: for a list of possible values</span></li>
<li><span class="changed">regex: for a regular expression match</span></li>
<li><span class="changed">notDoneYet: for items without matching
criteria</span></li>
<li><span class="changed">locale: for locale IDs</span></li>
<li><span class="changed">list: for a space-delimited list of values</span></li>
<li><span class="changed">path: for a valid XPath</span></li>
</ul>
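<p>As a rough sketch of how two of these match styles might be checked, consider the following. The function name is hypothetical, only <i>choice</i> and <i>regex</i> are implemented, and treating <i>notDoneYet</i> as always valid is an assumption rather than something the metadata file prescribes.</p>

```python
import re

# Values copied from the $calendar variable shown above.
CALENDAR = ("buddhist coptic ethiopic chinese gregorian hebrew islamic "
            "islamic-civil japanese arabic civil-arabic thai-buddhist persian")


def is_valid(value: str, kind: str, spec: str) -> bool:
    """Check an attribute value against a validity variable."""
    if kind == "choice":
        return value in spec.split()
    if kind == "regex":
        return re.fullmatch(spec, value) is not None
    if kind == "notDoneYet":
        return True  # assumption: no criteria exist yet, so accept anything
    raise NotImplementedError(f"match style {kind!r} is not covered by this sketch")
```

<p>For example, <code>is_valid("gregorian", "choice", CALENDAR)</code> succeeds, while a value outside the list does not.</p>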
<p><span class="changedspan">If the attribute order="given" is supplied, it
indicates the order of elements when canonicalizing (see below).</span></p>
<p><span class="changedspan">The <deprecated> element lists elements,
attributes, and attribute values that are deprecated. If any deprecatedItems
element contains more than one attribute, then only the listed combinations
are deprecated. Thus the following means not that the draft attribute is
deprecated, but that the true and false values for that attribute are:</span></p>
<pre><span class="changedspan"><deprecatedItems attributes="draft" values="true false"/> </span></pre>
<p><span class="changedspan"> Similarly, the following means that the
<i>type</i> attribute is deprecated, but only for the listed elements:</span></p>
<pre><span class="changedspan"><deprecatedItems elements="abbreviationFallback default ... preferenceOrdering" attributes="type"/> </span></pre>
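<p>One way to read the combination rule is sketched below: an item is treated as deprecated only when it matches every attribute present on some deprecatedItems entry. The data structure and function name are hypothetical, and the element list contains only the three names spelled out above (the elided ones are omitted).</p>

```python
from typing import Dict, List, Optional, Set

# Hypothetical in-memory form of the two deprecatedItems examples above.
DEPRECATED_ITEMS: List[Dict[str, Set[str]]] = [
    {"attributes": {"draft"}, "values": {"true", "false"}},
    {"elements": {"abbreviationFallback", "default", "preferenceOrdering"},
     "attributes": {"type"}},
]


def is_deprecated(element: str, attribute: Optional[str] = None,
                  value: Optional[str] = None) -> bool:
    """True if the item matches every constraint of some entry."""
    candidate = {"elements": element, "attributes": attribute, "values": value}
    for entry in DEPRECATED_ITEMS:
        # Only the listed combination is deprecated, so all fields must match.
        if all(candidate[field] in allowed for field, allowed in entry.items()):
            return True
    return False
```

<p>Under this reading, draft="true" is deprecated on any element, while the <i>type</i> attribute is deprecated only on the listed elements.</p>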
<p><span class="removedspan"><span>The following list provides a list of the currently valid
attributes and values, for all elements except </span>collation<span> <</span>rules<span>> and its
subelements. For currency codes, this list only includes the ISO 4217 currency codes; however,
there are other codes in common use that may occur in LDML data. A definitive list will be
provided in the future. In addition, data marked </span>draft<span>="</span>true<span>" may use
attribute values that are not in the following list, to allow for proposed additions, such as new
variant codes.</span></span></p>
<p><span class="removedspan"><span>In the following list, the <i>attribute-information</i>
elements (</span>alias<span>, </span>default<span>, </span>firstDay<span>, </span>mapping<span>,
</span>measurementSystem<span>, </span>minDays<span>, </span>orientation<span>, </span>settings<span>,
</span>weekendStart<span>, </span>weekendEnd<span>) are marked with a *. These elements carry all
of their information in attributes, and are to have empty element contents.</span></span></p>
<p><span class="removedspan"><span>There are currently only two <i>blocking</i> elements: </span>
collation<span> and </span>identity<span>. </span></span></p>
<p><span class="removedspan"><span>The <i>distinguishing</i> attributes are not marked in the
table, but currently consist of </span>key<span>, </span>registry<span>, </span>alt<span>, and
</span>type<span> (except for the </span>type<span> attribute on the elements </span>default<span>
and </span>mapping<span>).</span></span></p>
<table>
<tr>
<th><span class="removedspan"><span>Element</span></span></th>
<th><span class="removedspan"><span>Attribute</span></span></th>
<th><span class="removedspan"><span>Allowed Values</span></span></th>
</tr>
<tr>
<td rowSpan="5"><span class="removedspan"><span>[any]</span></span></td>
<!-- 2, 5-->
<td><span class="removedspan"><span>alt</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span>proposed, variant, list</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>draft</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span>true, false*</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>references</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><list of references></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>standard</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><list of standards> <i>(deprecated)</i></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>validSubLocales</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><list of sub-locales></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>abbreviationFallback</span></span></td>
<!-- 2, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span>standard, GMT</span></span></td>
</tr>
<tr>
<td rowSpan="2"><span class="removedspan"><span>alias*</span></span></td>
<!-- 1, 2-->
<td><span class="removedspan"><span>path</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><valid XPath within locale tree></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>source</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><valid locale ID></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>calendar</span></span></td>
<!-- 10, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 10-->
<td><span class="removedspan"><span>buddhist, chinese, gregorian, hebrew, islamic, islamic-civil,
japanese, arabic[alias], civil-arabic[alias], thai-buddhist[alias], persian</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>collation</span></span></td>
<!-- 8, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 8-->
<td><span class="removedspan">big5han, digits-after, direct, gb2312han, phonebook, pinyin, <b>
standard</b>, stroke, traditional</span></td>
</tr>
<tr>
<td><span class="removedspan"><span>currency</span></span></td>
<!-- 202, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 202-->
<td><span class="removedspan">ADP, AED, AFA, AFN, ALL, AMD, ANG, AOA, AOK, AON, AOR, ARA, ARP,
ARS, ATS, AUD, AWG, AZM, BAD, BAM, BBD, BDT, BEC, BEF, BEL, BGL, BGN, BHD, BIF, BMD, BND, BOB,
BOP, BOV, BRB, BRC, BRE, BRL, BRN, BRR, BSD, BTN, BUK, BWP, BYB, BYR, BZD, CAD, CDF, CHE, CHF,
CHW, CLF, CLP, CNY, COP, COU, CRC, CSD, CSK, CUP, CVE, CYP, CZK, DDM, DEM, DJF, DKK, DOP, DZD,
ECS, ECV, EEK, EGP, EQE, ERN, ESA, ESB, ESP, ETB, EUR, FIM, FJD, FKP, FRF, GBP, GEK, GEL, GHC,
GIP, GMD, GNF, GNS, GQE, GRD, GTQ, GWE, GWP, GYD, HKD, HNL, HRD, HRK, HTG, HUF, IDR, IEP, ILP,
ILS, INR, IQD, IRR, ISK, ITL, JMD, JOD, JPY, KES, KGS, KHR, KMF, KPW, KRW, KWD, KYD, KZT, LAK,
LBP, LKR, LRD, LSL, LSM, LTL, LTT, LUC, LUF, LUL, LVL, LVR, LYD, MAD, MAF, MDL, MGA, MGF, MKD,
MLF, MMK, MNT, MOP, MRO, MTL, MTP, MUR, MVR, MWK, MXN, MXP, MXV, MYR, MZE, MZM, NAD, NGN, NIC,
NIO, NLG, NOK, NPR, NZD, OMR, PAB, PEI, PEN, PES, PGK, PHP, PKR, PLN, PLZ, PTE, PYG, QAR, RHD,
ROL, RUB, RUR, RWF, SAR, SBD, SCR, SDD, SDP, SEK, SGD, SHP, SIT, SKK, SLL, SOS, SRD, SRG, STD,
SUR, SVC, SYP, SZL, THB, TJR, TJS, TMM, TND, TOP, TPE, TRL, TRY, TTD, TWD, TZS, UAH, UAK, UGS,
UGX, USD, USN, USS, UYP, UYU, UZS, VEB, VND, VUV, WST, XAF, XAG, XAU, XBA, XBB, XBC, XBD, XCD,
XDR, XEU, XFO, XFU, XOF, XPD, XPF, XPT, XRE, XTS, XXX, YDD, YER, YUD, YUM, YUN, ZAL, ZAR, ZMK,
ZRN, ZRZ, ZWD, <private use></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>currencyFormat</span></span></td>
<!-- 2, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span><b>standard</b>, <special-key></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>currencyFormatLength</span></span></td>
<!-- 4, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 4-->
<td><span class="removedspan"><span>full, long, medium, short</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>dateFormat</span></span></td>
<!-- 2, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span><b>standard</b>, <special-key></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>dateFormatLength</span></span></td>
<!-- 4, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 4-->
<td><span class="removedspan"><span>full, long, medium, short</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>dateTimeFormat</span></span></td>
<!-- 2, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span><b>standard</b>, <special-key></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>dateTimeFormatLength</span></span></td>
<!-- 1, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span>full</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>day</span></span></td>
<!-- 7, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 7-->
<td><span class="removedspan"><span>sun, mon, tue, wed, thu, fri, sat</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>dayContext</span></span></td>
<!-- 2, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span>format, stand-alone</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>dayWidth</span></span></td>
<!-- 3, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 3-->
<td><span class="removedspan"><span>abbreviated, narrow, wide</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>decimalFormat</span></span></td>
<!-- 2, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span><b>standard</b>, <special-key></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>decimalFormatLength</span></span></td>
<!-- 4, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 4-->
<td><span class="removedspan"><span>full, long, medium, short</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>default*</span></span></td>
<!-- 1, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><any type value legal for one of the peer elements></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>era</span></span></td>
<!-- 1, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><non-negative number></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>exemplarCharacters</span></span></td>
<!-- 2, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span><b>standard</b>, auxiliary</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>field</span></span></td>
<!-- 11, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 11-->
<td><span class="removedspan"><span>era, year, month, week, day, weekday, dayperiod, hour,
minute, second, zone</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>firstDay*</span></span></td>
<!-- 7, 1-->
<td><span class="removedspan"><span>day</span></span></td>
<!-- 7-->
<td><span class="removedspan"><span>sun, mon, tue, wed, thu, fri, sat</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>generation</span></span></td>
<!-- 1, 1-->
<td><span class="removedspan"><span>date</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><date></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>inList</span></span></td>
<td><span class="removedspan"><span>casing</span></span></td>
<td><span class="removedspan"><span>titlecase-words, titlecase-firstword</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>key</span></span></td>
<!-- 1, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><any element name having 'type' attribute></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>language</span></span></td>
<!-- 478, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 478-->
<td><span class="removedspan">aa, ab, ace, ach, ada, ady, ae, af, afa, afh, ak, akk, ale, alg,
alt, am, an, ang, apa, ar, arc, arn, arp, art, arw, as, ast, ath, aus, av, awa, ay, az, ba,
bad, bai, bal, ban, bas, bat, be, bej, bem, ber, bg, bh, bho, bi, bik, bin, bla, bm, bn, bnt,
bo, br, bra, bs, btk, bua, bug, byn, ca, cad, cai, car, cau, ce, ceb, cel, ch, chb, chg, chk,
chm, chn, cho, chp, chr, chy, cmc, co, cop, cpe, cpf, cpp, cr, crh, crp, cs, csb, cu, cus, cv,
cy, da, dak, dar, day, de, del, den, dgr, din, doi, dra, dsb, dua, dum, dv, dyu, dz, ee, efi,
egy, eka, el, elx, en, enm, eo, es, et, eu, ewo, fa, fan, fat, ff, fi, fil, fiu, fj, fo, fon,
fr, frm, fro, fur, fy, ga, gaa, gay, gba, gd, gem, gez, gil, gl, gmh, gn, goh, gon, gor, got,
grb, grc, gu, gv, gwi, ha, hai, haw, he, hi, hil, him, hit, hmn, ho, hr, hsb, ht, hu, hup, hy,
hz, ia, iba, id, ie, ig, ii, ijo, ik, ilo, inc, ine, inh, io, ira, iro, is, it, iu, ja, jbo,
jpr, jrb, jv, ka, kaa, kab, kac, kam, kar, kaw, kbd, kg, kha, khi, kho, ki, kj, kk, kl, km,
kmb, kn, ko, kok, kos, kpe, kr, krc, kro, kru, ks, ku, kum, kut, kv, kw, ky, la, lad, lah,
lam, lb, lez, lg, li, ln, lo, lol, loz, lt, lu, lua, lui, lun, luo, lus, lv, mad, mag, mai,
mak, man, map, mas, mdf, mdr, men, mg, mga, mh, mi, mic, min, mis, mk, mkh, ml, mn, mnc, mni,
mno, mo, moh, mos, mr, ms, mt, mul, mun, mus, mwl, mwr, my, myn, myv, na, nah, nai, nap, nb,
nd, nds, ne, new, ng, nia, nic, niu, nl, nn, no, nog, non, nr, nso, nub, nv, nwc, ny, nym, nyn,
nyo, nzi, oc, oj, om, or, os, osa, ota, oto, pa, paa, pag, pal, pam, pap, pau, peo, phi, phn,
pi, pl, pon, pra, pro, ps, pt, qu, raj, rap, rar, rm, rn, ro, roa, rom, root, ru, rw, sa, sad,
sah, sai, sal, sam, sas, sat, sc, scn, sco, sd, se, sel, sem, sg, sga, sgn, sh, shn, si, sid,
sio, sit, sk, sl, sla, sm, sma, smi, smj, smn, sms, sn, snk, so, sog, son, sq, sr, srn, srr,
ss, ssa, st, su, suk, sus, sux, sv, sw, syr, ta, tai, te, tem, ter, tet, tg, th, ti, tig, tiv,
tk, tkl, tl, tlh, tli, tmh, tn, to, tog, tpi, tr, ts, tsi, tt, tum, tup, tut, tvl, tw, ty, tyv,
udm, ug, uga, uk, umb, und, ur, uz, vai, ve, vi, vo, vot, wa, wak, wal, war, was, wen, wo, xal,
xh, yao, yap, yi, yo, ypk, za, zap, zen, zh, znd, zu, zun, <private use></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>ldml</span></span></td>
<!-- 1, 1-->
<td><span class="removedspan"><span>version</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span>1.0, 1.1, 1.2, 1.3</span></span></td>
</tr>
<tr>
<td rowSpan="2"><span class="removedspan"><span>mapping*</span></span></td>
<!-- 1, 2-->
<td><span class="removedspan"><span>registry</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><any charset registry, iana preferred></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>type</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><any valid charset from the given registry></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>measurementSystem*</span></span></td>
<!-- 3, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 3-->
<td><span class="removedspan"><span>metric, US, UK</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>minDays*</span></span></td>
<!-- 7, 1-->
<td><span class="removedspan"><span>count</span></span></td>
<!-- 7-->
<td><span class="removedspan"><span>1, 2, 3, 4, 5, 6, 7</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>month</span></span></td>
<!-- 13, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 13-->
<td><span class="removedspan"><span>1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>monthContext</span></span></td>
<!-- 2, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span>format, stand-alone</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>monthWidth</span></span></td>
<!-- 3, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 3-->
<td><span class="removedspan"><span>abbreviated, narrow, wide</span></span></td>
</tr>
<tr>
<td rowSpan="2"><span class="removedspan"><span>orientation*</span></span></td>
<!-- 4, 2-->
<td><span class="removedspan"><span>characters</span></span></td>
<!-- 4-->
<td><span class="removedspan"><span><b>left-to-right</b>, right-to-left, top-to-bottom,
bottom-to-top</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>lines</span></span></td>
<!-- 4-->
<td><span class="removedspan"><span>left-to-right, right-to-left, <b>top-to-bottom</b>,
bottom-to-top</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>pattern</span></span></td>
<!-- 1, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><b>standard</b>, <valid pattern for format></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>percentFormat</span></span></td>
<!-- 2, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span><b>standard</b>, <special-key></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>percentFormatLength</span></span></td>
<!-- 4, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 4-->
<td><span class="removedspan"><span>full, long, medium, short</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>preferenceOrdering</span></span></td>
<!-- 1, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><space-delimited list of timezone IDs></span></span></td>
</tr>
<tr>
<td rowspan="2"><span class="removedspan">reference</span></td>
<td><span class="removedspan">type</span></td>
<td><span class="removedspan"><token></span></td>
</tr>
<tr>
<td><span class="removedspan">uri</span></td>
<td><span class="removedspan"><reference URI></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>relative</span></span></td>
<!-- 1, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><positive or negative integer></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>reset</span></span></td>
<!-- 3, 1-->
<td><span class="removedspan"><span>before</span></span></td>
<!-- 3-->
<td><span class="removedspan"><span>primary, secondary, tertiary</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>scientificFormat</span></span></td>
<!-- 2, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span><b>standard</b>, <special-key></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>scientificFormatLength</span></span></td>
<!-- 4, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 4-->
<td><span class="removedspan"><span>full, long, medium, short</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>script</span></span></td>
<!-- 103, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 103-->
<td><span class="removedspan">Arab, Armn, Bali, Batk, Beng, Blis, Bopo, Brah, Brai, Bugi, Buhd,
Cans, Cham, Cher, Cirt, Copt, Cprt, Cyrl, Cyrs, Deva, Dsrt, Egyd, Egyh, Egyp, Ethi, Geok, Geor,
Glag, Goth, Grek, Gujr, Guru, Hang, Hani, Hano, Hans, Hant, Hebr, Hira, Hmng, Hrkt, Hung, Inds,
Ital, Java, Kali, Kana, Khar, Khmr, Knda, Laoo, Latf, Latg, Latn, Lepc, Limb, Lina, Linb, Mand,
Maya, Mero, Mlym, Mong, Mymr, Nkoo, Ogam, Orkh, Orya, Osma, Perm, Phag, Phnx, Plrd, Qaai, Roro,
Runr, Sara, Shaw, Sinh, Sylo, Syrc, Syre, Syrj, Syrn, Tagb, Tale, Talu, Taml, Telu, Teng, Tfng,
Tglg, Thaa, Thai, Tibt, Ugar, Vaii, Visp, Xpeo, Xsux, Yiii, Zxxx, Zyyy, Zzzz, <private use></span></td>
</tr>
<tr>
<td rowSpan="8"><span class="removedspan"><span>settings*</span></span></td>
<!-- 2, 8-->
<td><span class="removedspan"><span>alternate</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span>non-ignorable, shifted</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>backwards</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span>on, off</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>caseFirst</span></span></td>
<!-- 3-->
<td><span class="removedspan"><span>upper, lower, off</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>caseLevel</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span>on, off</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>hiraganaQuarternary</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span>on, off</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>normalization</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span>on, off</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>numeric</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span>on, off</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>strength</span></span></td>
<!-- 5-->
<td><span class="removedspan"><span>primary, secondary, tertiary, quaternary, identical</span></span></td>
</tr>
<tr>
<td><span class="removedspan">singleCountries</span></td>
<td><span class="removedspan">list</span></td>
<td><span class="removedspan"><tzid: see zone:type></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>territory</span></span></td>
<!-- 268, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 268-->
<td><span class="removedspan">001, 002, 003, 005, 009, 011, 013, 014, 015, 017, 018, 019, 021,
029, 030, 035, 039, 053, 054, 057, 061, 062, 142, 145, 150, 151, 154, 155, 172, 200, 419, 830,
833, AD, AE, AF, AG, AI, AL, AM, AN, AO, AQ, AR, AS, AT, AU, AW, AX, AZ, BA, BB, BD, BE, BF,
BG, BH, BI, BJ, BM, BN, BO, BQ, BR, BS, BT, BV, BW, BY, BZ, CA, CC, CD, CF, CG, CH, CI, CK,
CL, CM, CN, CO, CR, CS, CT, CU, CV, CX, CY, CZ, DD, DE, DJ, DK, DM, DO, DZ, EC, EE, EG, EH,
ER, ES, ET, FI, FJ, FK, FM, FO, FQ, FR, FX, GA, GB, GD, GE, GF, GH, GI, GL, GM, GN, GP, GQ, GR,
GS, GT, GU, GW, GY, HK, HM, HN, HR, HT, HU, ID, IE, IL, IN, IO, IQ, IR, IS, IT, JM, JO, JP,
JT, KE, KG, KH, KI, KM, KN, KP, KR, KW, KY, KZ, LA, LB, LC, LI, LK, LR, LS, LT, LU, LV, LY,
MA, MC, MD, MG, MH, MI, MK, ML, MM, MN, MO, MP, MQ, MR, MS, MT, MU, MV, MW, MX, MY, MZ, NA,
NC, NE, NF, NG, NI, NL, NO, NP, NQ, NR, NT, NU, NZ, OM, PA, PC, PE, PF, PG, PH, PK, PL, PM, PN,
PR, PS, PT, PU, PW, PY, PZ, QA, QO, RE, RO, RU, RW, SA, SB, SC, SD, SE, SG, SH, SI, SJ, SK,
SL, SM, SN, SO, SR, ST, SU, SV, SY, SZ, TC, TD, TF, TG, TH, TJ, TK, TL, TM, TN, TO, TR, TT,
TV, TW, TZ, UA, UG, UM, US, UY, UZ, VA, VC, VD, VE, VG, VI, VN, VU, WF, WK, WS, YD, YE, YT,
YU, ZA, ZM, ZW, <private use></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>timeFormat</span></span></td>
<!-- 2, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 2-->
<td><span class="removedspan"><span><b>standard</b>, <special-key></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>timeFormatLength</span></span></td>
<!-- 4, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 4-->
<td><span class="removedspan"><span>full, long, medium, short</span></span></td>
</tr>
<tr>
<td rowSpan="2"><span class="removedspan"><span>type</span></span></td>
<!-- 1, 2-->
<td><span class="removedspan"><span>key</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><any element name having 'type' attribute></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>type</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><any type value--with appropriate key></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>variant</span></span></td>
<!-- 13, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 13-->
<td><span class="removedspan">1901, 1996, POLYTONI, POSIX, REVISED, SAAHO, boont, gaulish,
guoyu, hakka, lojban, nedis, rozaj, scouse, xiang</span></td>
</tr>
<tr>
<td><span class="removedspan"><span>version</span></span></td>
<!-- 1, 1-->
<td><span class="removedspan"><span>number</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><revision></span></span></td>
</tr>
<tr>
<td rowSpan="2"><span class="removedspan"><span>weekendEnd*</span></span></td>
<!-- 7, 2-->
<td><span class="removedspan"><span>day</span></span></td>
<!-- 7-->
<td><span class="removedspan"><span>sun, mon, tue, wed, thu, fri, sat</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>time</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><HH:mm (00:00..<b>24:00</b>)></span></span></td>
</tr>
<tr>
<td rowSpan="2"><span class="removedspan"><span>weekendStart*</span></span></td>
<!-- 7, 2-->
<td><span class="removedspan"><span>day</span></span></td>
<!-- 7-->
<td><span class="removedspan"><span>sun, mon, tue, wed, thu, fri, sat</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>time</span></span></td>
<!-- 1-->
<td><span class="removedspan"><span><HH:mm (<b>00:00</b>..24:00)></span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>zone</span></span></td>
<!-- 401, 1-->
<td><span class="removedspan"><span>type</span></span></td>
<!-- 401-->
<td><span class="removedspan"><span>Africa/Abidjan, Africa/Accra, Africa/Addis_Ababa,
Africa/Algiers, Africa/Asmera, Africa/Bamako, Africa/Bangui, Africa/Banjul, Africa/Bissau,
Africa/Blantyre, Africa/Brazzaville, Africa/Bujumbura, Africa/Cairo, Africa/Casablanca,
Africa/Ceuta, Africa/Conakry, Africa/Dakar, Africa/Dar_es_Salaam, Africa/Djibouti, Africa/Douala,
Africa/El_Aaiun, Africa/Freetown, Africa/Gaborone, Africa/Harare, Africa/Johannesburg,
Africa/Kampala, Africa/Khartoum, Africa/Kigali, Africa/Kinshasa, Africa/Lagos,
Africa/Libreville, Africa/Lome, Africa/Luanda, Africa/Lubumbashi, Africa/Lusaka,
Africa/Malabo, Africa/Maputo, Africa/Maseru, Africa/Mbabane, Africa/Mogadishu,
Africa/Monrovia, Africa/Nairobi, Africa/Ndjamena, Africa/Niamey, Africa/Nouakchott,
Africa/Ouagadougou, Africa/Porto-Novo, Africa/Sao_Tome, Africa/Timbuktu, Africa/Tripoli,
Africa/Tunis, Africa/Windhoek, America/Adak, America/Anchorage, America/Anguilla,
America/Antigua, America/Araguaina, America/Aruba, America/Asuncion, America/Barbados,
America/Belem, America/Belize, America/Boa_Vista, America/Bogota, America/Boise, America/Buenos_Aires,
America/Cambridge_Bay, America/Cancun, America/Caracas, America/Catamarca, America/Cayenne,
America/Cayman, America/Chicago, America/Chihuahua, America/Cordoba, America/Costa_Rica,
America/Cuiaba, America/Curacao, America/Danmarkshavn, America/Dawson, America/Dawson_Creek,
America/Denver, America/Detroit, America/Dominica, America/Edmonton, America/Eirunepe,
America/El_Salvador, America/Fortaleza, America/Glace_Bay, America/Godthab, America/Goose_Bay,
America/Grand_Turk, America/Grenada, America/Guadeloupe, America/Guatemala, America/Guayaquil,
America/Guyana, America/Halifax, America/Havana, America/Hermosillo, America/Indiana/Knox,
America/Indiana/Marengo, America/Indiana/Vevay, America/Indianapolis, America/Inuvik, America/Iqaluit,
America/Jamaica, America/Jujuy, America/Juneau, America/Kentucky/Monticello, America/La_Paz,
America/Lima, America/Los_Angeles, America/Louisville, America/Maceio, America/Managua,
America/Manaus, America/Martinique, America/Mazatlan, America/Mendoza, America/Menominee,
America/Merida, America/Mexico_City, America/Miquelon, America/Monterrey, America/Montevideo,
America/Montreal, America/Montserrat, America/Nassau, America/New_York, America/Nipigon,
America/Nome, America/Noronha, America/North_Dakota/Center, America/Panama, America/Pangnirtung,
America/Paramaribo, America/Phoenix, America/Port-au-Prince, America/Port_of_Spain, America/Porto_Velho,
America/Puerto_Rico, America/Rainy_River, America/Rankin_Inlet, America/Recife,
America/Regina, America/Rio_Branco, America/Santiago, America/Santo_Domingo, America/Sao_Paulo,
America/Scoresbysund, America/St_Johns, America/St_Kitts, America/St_Lucia, America/St_Thomas,
America/St_Vincent, America/Swift_Current, America/Tegucigalpa, America/Thule, America/Thunder_Bay,
America/Tijuana, America/Tortola, America/Vancouver, America/Whitehorse, America/Winnipeg,
America/Yakutat, America/Yellowknife, Antarctica/Casey, Antarctica/Davis, Antarctica/DumontDUrville,
Antarctica/Mawson, Antarctica/McMurdo, Antarctica/Palmer, Antarctica/Rothera,
Antarctica/Syowa, Antarctica/Vostok, Asia/Aden, Asia/Almaty, Asia/Amman, Asia/Anadyr, Asia/Aqtau,
Asia/Aqtobe, Asia/Ashgabat, Asia/Baghdad, Asia/Bahrain, Asia/Baku, Asia/Bangkok, Asia/Beirut,
Asia/Bishkek, Asia/Brunei, Asia/Calcutta, Asia/Choibalsan, Asia/Chongqing, Asia/Colombo,
Asia/Damascus, Asia/Dhaka, Asia/Dili, Asia/Dubai, Asia/Dushanbe, Asia/Gaza, Asia/Harbin, Asia/Hong_Kong,
Asia/Hovd, Asia/Irkutsk, Asia/Jakarta, Asia/Jayapura, Asia/Jerusalem, Asia/Kabul, Asia/Kamchatka,
Asia/Karachi, Asia/Kashgar, Asia/Katmandu, Asia/Krasnoyarsk, Asia/Kuala_Lumpur, Asia/Kuching,
Asia/Kuwait, Asia/Macau, Asia/Magadan, Asia/Makassar, Asia/Manila, Asia/Muscat, Asia/Nicosia,
Asia/Novosibirsk, Asia/Omsk, Asia/Oral, Asia/Phnom_Penh, Asia/Pontianak, Asia/Pyongyang,
Asia/Qatar, Asia/Qyzylorda, Asia/Rangoon, Asia/Riyadh, Asia/Saigon, Asia/Sakhalin, Asia/Samarkand,
Asia/Seoul, Asia/Shanghai, Asia/Singapore, Asia/Taipei, Asia/Tashkent, Asia/Tbilisi,
Asia/Tehran, Asia/Thimphu, Asia/Tokyo, Asia/Ulaanbaatar, Asia/Urumqi, Asia/Vientiane,
Asia/Vladivostok, Asia/Yakutsk, Asia/Yekaterinburg, Asia/Yerevan, Atlantic/Azores,
Atlantic/Bermuda, Atlantic/Canary, Atlantic/Cape_Verde, Atlantic/Faeroe, Atlantic/Jan_Mayen,
Atlantic/Madeira, Atlantic/Reykjavik, Atlantic/South_Georgia, Atlantic/St_Helena,
Atlantic/Stanley, Australia/Adelaide, Australia/Brisbane, Australia/Broken_Hill,
Australia/Darwin, Australia/Hobart, Australia/Lindeman, Australia/Lord_Howe,
Australia/Melbourne, Australia/Perth, Australia/Sydney, Etc/GMT, Etc/GMT+1, Etc/GMT+10,
Etc/GMT+11, Etc/GMT+12, Etc/GMT+2, Etc/GMT+3, Etc/GMT+4, Etc/GMT+5, Etc/GMT+6, Etc/GMT+7,
Etc/GMT+8, Etc/GMT+9, Etc/GMT-1, Etc/GMT-10, Etc/GMT-11, Etc/GMT-12, Etc/GMT-13, Etc/GMT-14,
Etc/GMT-2, Etc/GMT-3, Etc/GMT-4, Etc/GMT-5, Etc/GMT-6, Etc/GMT-7, Etc/GMT-8, Etc/GMT-9, Etc/UCT,
Etc/UTC, Europe/Amsterdam, Europe/Andorra, Europe/Athens, Europe/Belfast, Europe/Belgrade,
Europe/Berlin, Europe/Bratislava, Europe/Brussels, Europe/Bucharest, Europe/Budapest,
Europe/Chisinau, Europe/Copenhagen, Europe/Dublin, Europe/Gibraltar, Europe/Helsinki,
Europe/Istanbul, Europe/Kaliningrad, Europe/Kiev, Europe/Lisbon, Europe/Ljubljana,
Europe/London, Europe/Luxembourg, Europe/Madrid, Europe/Malta, Europe/Minsk, Europe/Monaco,
Europe/Moscow, Europe/Oslo, Europe/Paris, Europe/Prague, Europe/Riga, Europe/Rome,
Europe/Samara, Europe/San_Marino, Europe/Sarajevo, Europe/Simferopol, Europe/Skopje,
Europe/Sofia, Europe/Stockholm, Europe/Tallinn, Europe/Tirane, Europe/Uzhgorod, Europe/Vaduz,
Europe/Vatican, Europe/Vienna, Europe/Vilnius, Europe/Warsaw, Europe/Zagreb, Europe/Zaporozhye,
Europe/Zurich, Indian/Antananarivo, Indian/Chagos, Indian/Christmas, Indian/Cocos,
Indian/Comoro, Indian/Kerguelen, Indian/Mahe, Indian/Maldives, Indian/Mauritius, Indian/Mayotte,
Indian/Reunion, Pacific/Apia, Pacific/Auckland, Pacific/Chatham, Pacific/Easter, Pacific/Efate,
Pacific/Enderbury, Pacific/Fakaofo, Pacific/Fiji, Pacific/Funafuti, Pacific/Galapagos,
Pacific/Gambier, Pacific/Guadalcanal, Pacific/Guam, Pacific/Honolulu, Pacific/Johnston,
Pacific/Kiritimati, Pacific/Kosrae, Pacific/Kwajalein, Pacific/Majuro, Pacific/Marquesas,
Pacific/Midway, Pacific/Nauru, Pacific/Niue, Pacific/Norfolk, Pacific/Noumea, Pacific/Pago_Pago,
Pacific/Palau, Pacific/Pitcairn, Pacific/Ponape, Pacific/Port_Moresby, Pacific/Rarotonga,
Pacific/Saipan, Pacific/Tahiti, Pacific/Tarawa, Pacific/Tongatapu, Pacific/Truk, Pacific/Wake,
Pacific/Wallis, Pacific/Yap</span></span></td>
</tr>
</table>
<h2><span><span>Appendix L: <a name="Canonical_Form">Canonical Form</a></span></span></h2>
<p><span>The following restrictions on the format of LDML files allow for easier parsing
and comparison of files.</span></p>
<p><span>Peer elements have a consistent order. That is, if the DTD or this specification requires the
following order in an element foo:</span></p>
<pre><span><foo>
<pattern>
<somethingElse>
</foo></span></pre>
<p><span>It can never require the reverse order in a different element bar:</span></p>
<pre><span><bar>
<somethingElse>
<pattern>
</bar></span></pre>
<p><span>Note that there was one case that had to be corrected in order to make this true. For
that reason, pattern occurs twice under currency:</span></p>
<pre><span class="dtd"><!ELEMENT currency (alias | (pattern*, displayName?, symbol?, pattern*,
decimal?, group?, special*)) ></span></pre>
<p><span><a href="http://www.w3.org/TR/REC-xml/">XML</a> files can vary widely in
textual form while representing precisely the same data. Putting the LDML files in the
repository into a canonical form allows the simple, widely used diff tools (including those in
CVS) to detect differences when vetting changes, without being confused by purely textual
variation. This is not a requirement on other uses of LDML; it is simply a way to manage
repository data more easily.</span></p>
<h2><span>Content</span></h2>
<ol>
<li><span>All start elements are on their own line, indented by <i>depth</i> tabs.</span></li>
<li><span>All end elements (except for leaf nodes) are on their own line, indented by <i>depth</i>
tabs. </span></li>
<li><span>Any leaf node with empty content is in the form <foo/>.</span></li>
<li><span>There are no blank lines except within comments or content.</span></li>
<li><span>Spaces are used within a start element. There are no extra spaces within elements.</span><ul>
<li><span><code><version number="1.2"/></code>, not <code><version number = "1.2" /></code></span></li>
<li><span><code></identity></code>, not <code></identity ></code></span></li>
</ul>
</li>
<li><span>All attribute values use double quote ("), not single (').</span></li>
<li><span>There are no CDATA sections, and no escapes except those absolutely required.</span><ul>
<li><span>no &apos; since it is not necessary</span></li>
<li><span>no '&#x61;', it would be just 'a'</span></li>
</ul>
</li>
<li><span>All attributes with defaulted values are suppressed. See the
<a href="http://www.unicode.org/cldr/data/docs/design/ldml_canonical_form.html#Defaulted_Values_Table">
Defaulted Attributes Table</a></span></li>
<li><span>The draft and alt="proposed.*" attributes are only on leaf elements.</span></li>
<li><span>The tzid are canonicalized in the following way:</span><ol>
<li type="a"><span>All tzids as of CLDR 1.1 (2004.06.08) in zone.tab are canonical.</span></li>
<li><span>After that point, the first time a tzid is introduced, that is the canonical form.</span></li>
</ol>
<p><span>That is, new IDs are added, but existing ones keep the original form. The </span><i>
<span class="changedspan">TZ</span></i><span> timezone database keeps a set of equivalences in
the "backward" file. These are used to map other tzids to the canonical form. For example, when
<code>America/Argentina/Catamarca</code> was introduced as the new name for the previous
<code>America/Catamarca</code>, a link was added in the backward file.</span></p>
<p><span><code>Link America/Argentina/Catamarca America/Catamarca</code></span></p></li>
</ol>
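<p><span>The tzid rule in the last item above can be sketched as follows. This is only an illustrative sketch: the function names, the sample "backward" lines, and the set of canonical IDs are assumptions made for the example, not part of CLDR or the TZ database tooling.</span></p>

```python
# Hypothetical excerpt of the TZ database "backward" file.
BACKWARD_LINES = [
    "# comment lines and blank lines are skipped",
    "Link America/Argentina/Catamarca America/Catamarca",
]

# Assumption: the tzids treated as canonical (the first-introduced forms).
CLDR_CANONICAL = {"America/Catamarca", "America/Los_Angeles"}


def parse_links(lines):
    """Return the two tzids named on each 'Link <tzid> <tzid>' line."""
    pairs = []
    for line in lines:
        fields = line.split("#", 1)[0].split()
        if len(fields) == 3 and fields[0] == "Link":
            pairs.append((fields[1], fields[2]))
    return pairs


def build_canonical_map(lines, canonical):
    """Map each non-canonical tzid to the canonical tzid it is linked with."""
    mapping = {}
    for first, second in parse_links(lines):
        if second in canonical and first not in canonical:
            mapping[first] = second
        elif first in canonical and second not in canonical:
            mapping[second] = first
    return mapping


def canonicalize(tzid, mapping):
    """Return the canonical form of a tzid, or the tzid itself if unmapped."""
    return mapping.get(tzid, tzid)


mapping = build_canonical_map(BACKWARD_LINES, CLDR_CANONICAL)
```

<p><span>With the sample data, <code>canonicalize("America/Argentina/Catamarca", mapping)</code> yields <code>America/Catamarca</code>, because that ID was the first one introduced.</span></p>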
<p><span><i>Example of a file in canonical form:</i></span></p>
<pre><span><ldml draft="<span class="changedspan">unconfirmed</span>">
	<identity>
		<version number="1.2"/>
		<generation date="2004-06-04"/>
		<language type="en"/>
		<territory type="AS"/>
	</identity>
	<numbers>
		<currencyFormats>
			<currencyFormatLength>
				<currencyFormat>
					<pattern>¤#,##0.00;(¤#,##0.00)</pattern>
				</currencyFormat>
			</currencyFormatLength>
		</currencyFormats>
	</numbers>
</ldml></span></pre>
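<p><span>A minimal sketch of a writer that applies several of the content rules (tab indentation by depth, the empty-leaf form, double-quoted attributes, and only the required escapes) might look like the following. The function names are hypothetical. The sketch assumes no mixed content, trims surrounding whitespace from leaf text, and ignores comments, element and attribute ordering, and suppression of defaulted values.</span></p>

```python
import xml.etree.ElementTree as ET


def esc(text, in_attribute=False):
    """Escape only what is required: & and <, plus " inside attribute values."""
    text = text.replace("&", "&amp;").replace("<", "&lt;")
    if in_attribute:
        text = text.replace('"', "&quot;")
    return text


def write(elem, depth=0, out=None):
    """Append the canonical-form lines for elem to out and return the list."""
    out = [] if out is None else out
    indent = "\t" * depth
    attrs = "".join(f' {k}="{esc(v, True)}"' for k, v in elem.attrib.items())
    children = list(elem)
    text = (elem.text or "").strip()
    if children:
        # Non-leaf: start and end tags each on their own line.
        out.append(f"{indent}<{elem.tag}{attrs}>")
        for child in children:
            write(child, depth + 1, out)
        out.append(f"{indent}</{elem.tag}>")
    elif text:
        # Leaf with content: start tag, content, and end tag on one line.
        out.append(f"{indent}<{elem.tag}{attrs}>{esc(text)}</{elem.tag}>")
    else:
        # Leaf with empty content.
        out.append(f"{indent}<{elem.tag}{attrs}/>")
    return out


source = "<ldml><identity><version number = '1.2' /></identity></ldml>"
canonical_text = "\n".join(write(ET.fromstring(source)))
```

<p><span>Here the non-canonical input (spaces around the equals sign, single quotes, a space before the closing slash) is rewritten with one element per line and <code><version number="1.2"/></code> indented by two tabs.</span></p>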
<h2><span>Ordering</span></h2>
<ol>
<li><span>Element names are ordered by the
<a href="http://www.unicode.org/cldr/data/docs/design/ldml_canonical_form.html#Element_Order_Table">
Element Order Table</a></span></li>
<li><span>Attribute names are ordered by the
<a href="http://www.unicode.org/cldr/data/docs/design/ldml_canonical_form.html#Attribute_Order_Table">
Attribute Order Table</a></span></li>
<li><span>Attribute value comparison is a bit more complicated, and may depend on the attribute
and type. Compare two values by using the following steps:</span><ol>
<li><span>If two values are in the
<a href="http://www.unicode.org/cldr/data/docs/design/ldml_canonical_form.html#Value_Order_Table">
Value Order Table</a>, compare according to the order in the table. Otherwise if just one is,
it goes first.</span></li>
<li><span>If two values are numeric [0-9], compare numerically (2 < 12). Otherwise if just one
is numeric, it goes first.</span></li>
<li><span>Otherwise, values are ordered alphabetically.</span></li>
</ol>
</li>
<li><span>An attribute-value pair is ordered first by attribute name, and then if the attribute
names are identical, by the value.</span></li>
<li><span>An element is ordered first by the element name, and then if the element names are
identical, by the sorted set of attribute-value pairs (sorted by #4). For the latter, compare
the first pair in each (in sorted order by attribute pair). If not identical, go to the second
pair, etc.</span></li>
<li><span>Any future additions to the DTD must be structured so as to allow compatibility with
this ordering.</span></li>
<li><span>See also Appendix K:
<a href="http://www.unicode.org/reports/tr35/#valid_attribute_values">Valid Attribute Values</a></span></li>
</ol>
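<p><span>The comparison steps above can be sketched as a set of comparator functions. This is an illustration only: the three tables are small hypothetical excerpts, names missing from a table are assumed to sort after listed ones, "alphabetically" is assumed to mean code point order, and an element whose attribute list is a prefix of another's is assumed to sort first. None of these assumptions is stated by the specification.</span></p>

```python
import re
from functools import cmp_to_key

# Hypothetical excerpts of the order tables.
ELEMENT_ORDER = ["dateFormatLength", "timeFormatLength", "zone"]
ATTRIBUTE_ORDER = ["type", "key", "alt", "draft"]
VALUE_ORDER = ["full", "long", "medium", "short"]


def cmp(a, b):
    return (a > b) - (a < b)


def rank(table, item):
    """Position in the table; unlisted items sort after listed ones (assumption)."""
    return table.index(item) if item in table else len(table)


def is_numeric(s):
    return re.fullmatch(r"[0-9]+", s) is not None


def compare_values(a, b):
    # Step 1: table order; a value in the table precedes one that is not.
    a_in, b_in = a in VALUE_ORDER, b in VALUE_ORDER
    if a_in and b_in:
        return cmp(VALUE_ORDER.index(a), VALUE_ORDER.index(b))
    if a_in != b_in:
        return -1 if a_in else 1
    # Step 2: numeric order; a numeric value precedes a non-numeric one.
    a_num, b_num = is_numeric(a), is_numeric(b)
    if a_num and b_num:
        return cmp(int(a), int(b))
    if a_num != b_num:
        return -1 if a_num else 1
    # Step 3: alphabetical order (code point order assumed here).
    return cmp(a, b)


def compare_pairs(p, q):
    """Order attribute-value pairs by attribute name, then by value."""
    by_name = cmp(rank(ATTRIBUTE_ORDER, p[0]), rank(ATTRIBUTE_ORDER, q[0])) or cmp(p[0], q[0])
    return by_name or compare_values(p[1], q[1])


def compare_elements(e, f):
    """Order (name, attributes) elements by name, then by sorted attribute pairs."""
    by_name = cmp(rank(ELEMENT_ORDER, e[0]), rank(ELEMENT_ORDER, f[0])) or cmp(e[0], f[0])
    if by_name:
        return by_name
    pairs_e = sorted(e[1].items(), key=cmp_to_key(compare_pairs))
    pairs_f = sorted(f[1].items(), key=cmp_to_key(compare_pairs))
    for p, q in zip(pairs_e, pairs_f):
        c = compare_pairs(p, q)
        if c:
            return c
    return cmp(len(pairs_e), len(pairs_f))  # assumption: shorter list first


sorted_values = sorted(["short", "12", "2", "full", "abc"], key=cmp_to_key(compare_values))
```

<p><span>With these tables, <code>sorted_values</code> places the table values first (full, short), then the numeric values in numeric order (2, 12), then the remaining value (abc).</span></p>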
<h2><span>Comments</span></h2>
<ol>
<li><span>Comments are of the form <!-- <i>stuff</i> -->.</span></li>
<li><span>They are logically attached to a node. There are 4 kinds:</span><ol>
<li><span>Inline comments always appear after a leaf node, at the end of the same line. Each
is a single line.</span></li>
<li><span>Preblock comments always precede the attachment node, and are indented on the same
level.</span></li>
<li><span>Postblock comments always follow the attachment node, and are indented on the same
level.</span></li>
<li><span>The final comment comes after </ldml>.</span></li>
</ol>
</li>
<li><span>Multiline comments (except the final comment) have each line after the first indented
to one deeper level.</span></li>
</ol>
<p><span><b>Examples:</b></span></p>
<pre><span><eraAbbr>
<era type="0">BC</era> <!-- might add alternate BCE in the future -->
...
<timeZoneNames>
<!-- Note: zones that don't use daylight time need further work -->
<zone type="America/Los_Angeles">
...
<!-- Note: the following is known to be sparse,
and needs to be improved in the future -->
<zone type="Asia/Jerusalem"></span></pre>
<h2><span><b>Canonicalization</b></span></h2>
<p><span>The process of canonicalization is fairly straightforward, except for comments. Inline
comments will have any linebreaks replaced by a space. There may be cases where the attachment
node is not permitted, such as the following.</span></p>
<pre><span> </dayWidth>
<!-- some comment -->
</dayContext>
</days></span></pre>
<p><span>In those cases, the comment will be made into a block comment on the nearest preceding leaf
node, if that node is at that level or deeper. (If there is one already, the comment will be
appended to it, with a line break between.) If there is no place to attach the comment (for
example, as a result of processing that removes the attachment node), the comment and its node's
xpath will be appended to the final comment in the document.</span></p>
<p><span>Multiline comments will have leading tabs stripped, so any indentation should be done
with spaces.</span></p>
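<p><span>The two text-level comment rules (line breaks in inline comments become spaces, and leading tabs are stripped from multiline comments) can be sketched as below. The function names are hypothetical, and the sketch operates on the comment text only, not on its attachment to a node.</span></p>

```python
import re


def normalize_inline_comment(text):
    """Replace each line break in an inline comment with a single space."""
    return re.sub(r"\r\n|\r|\n", " ", text)


def strip_leading_tabs(text):
    """Remove leading tabs (but not spaces) from every line of a comment."""
    return "\n".join(line.lstrip("\t") for line in text.split("\n"))


inline = normalize_inline_comment("might add\nalternate")
block = strip_leading_tabs("Note: sparse,\n\t\t  and needs work")
```

<p><span>In the second call the two leading tabs are removed while the two spaces that follow them survive, which is why indentation inside multiline comments should be done with spaces.</span></p>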
<hr>
<h3><span><a name="Element_Order_Table">Element Order Table</a></span></h3>
<p><span class="changed">The order of elements is given by the elementOrder
table in the supplemental metadata.</span></p>
<p><span class="removedspan"><span>The organization into bullets is purely for clarity; the ordering is established by which
comes first in the overall list. Note that most combinations of pairs of items will never be peer
elements, and thus never be compared.</span></span></p>
<ul>
<li><span class="removedspan"><span>ldml, identity, alias, localeDisplayNames, layout, characters, delimiters,
measurement, dates, numbers, collations, posix,</span></span></li>
<li><span class="removedspan"><span>version, generation, language, script, territory, variant,</span></span></li>
<li><span class="removedspan"><span>languages, scripts, territories, variants, keys, types,</span></span></li>
<li><span class="removedspan"><span>key, type,</span></span></li>
<li><span class="removedspan"><span>orientation, exemplarCharacters, mapping, cp,</span></span></li>
<li><span class="removedspan"><span>quotationStart, quotationEnd, alternateQuotationStart, alternateQuotationEnd,</span></span></li>
<li><span class="removedspan"><span>measurementSystem, paperSize, height, width,</span></span></li>
<li><span class="removedspan"><span>localizedPatternChars, calendars, timeZoneNames,</span></span></li>
<li><span><span class="removedspan">months, monthNames, monthAbbr, days, dayNames, dayAbbr, week, am, pm, eras,
dateFormats, timeFormats, dateTimeFormats, fields, month, day, minDays, firstDay, weekendStart,
weekendEnd, eraNames, eraAbbr, eraNarrow, era, pattern, displayName, hourFormat, hoursFormat, gmtFormat,
regionFormat, fallbackFormat, abbreviationFallback, preferenceOrdering, default, calendar,
monthContext, monthWidth, dayContext, dayWidth, dateFormatLength, dateFormat, timeFormatLength,
timeFormat, dateTimeFormatLength, dateTimeFormat, zone, long, short, exemplarCity, generic,
standard, daylight, field, relative,</span></span></li>
<li><span class="removedspan"><span>symbols, decimalFormats, scientificFormats, percentFormats, currencyFormats,
currencies,</span></span></li>
<li><span class="removedspan"><span>decimalFormatLength, decimalFormat, scientificFormatLength, scientificFormat,
percentFormatLength, percentFormat, currencyFormatLength, currencyFormat, currency, symbol,
decimal, group, list, percentSign, nativeZeroDigit, patternDigit, plusSign, minusSign,
exponential, perMille, infinity, nan,</span></span></li>
<li><span class="removedspan"><span>collation,</span></span></li>
<li><span class="removedspan"><span>messages, yesstr, nostr, yesexpr, noexpr,</span></span></li>
<li><span class="removedspan"><span>special<i> (always last)</i></span></span></li>
</ul>
<h3><span><a name="Attribute_Order_Table">Attribute Order Table</a></span></h3>
<p><span class="changed">The order of attributes is given by the
attributeOrder table in the supplemental metadata.</span></p>
<p><span class="removedspan"><span>The organization into bullets is purely for clarity; the ordering is established by which
comes first in the overall list. Note that most combinations of pairs of items will never be peer
elements, and thus never be compared.</span></span></p>
<ul>
<li><span class="removedspan"><span>type, key, registry, alt <i>(distinguishing types)</i></span></span></li>
<li><span class="removedspan"><span>source, path,</span></span></li>
<li><span class="removedspan"><span>day, date,</span></span></li>
<li><span class="removedspan"><span>version, count,</span></span></li>
<li><span class="removedspan"><span>lines, characters,</span></span></li>
<li><span class="removedspan"><span>before,</span></span></li>
<li><span class="removedspan"><span>number, time,</span></span></li>
<li><span class="removedspan"><span>validSubLocales, standard, references,</span></span></li>
<li><span class="removedspan"><span>draft</span></span></li>
</ul>
<h3><span><a name="Value_Order_Table">Value Order Table</a></span></h3>
<p><span class="changed">The order of attribute values is given by the order
of the values in the attributeValues elements that have the attribute
order="given". Numeric values are sorted in numeric order, while </span>
<span class="changed">tzids are ordered by country, then </span><span class="changedspan">
longitude, then latitude.</span></p>
<blockquote>
<table border="1" cellspacing="0">
<tr>
<td><span class="removedspan"><span>weekendStart</span></span></td>
<td rowspan="2"><span class="removedspan"><span>day</span></span></td>
<td rowspan="3"><span class="removedspan"><span>sun, mon, tue, wed, thu, fri, sat</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>weekendEnd</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>day</span></span></td>
<td rowspan="12"><span class="removedspan"><span>type</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>dateFormatLength </span></span></td>
<td rowspan="7"><span class="removedspan"><span>full, long, medium, short</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>timeFormatLength </span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>dateTimeFormatLength </span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>decimalFormatLength </span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>scientificFormatLength </span>
</span></td>
</tr>
<tr>
<td><span class="removedspan"><span>percentFormatLength </span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>currencyFormatLength </span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>monthWidth </span></span></td>
<td rowspan="2"><span class="removedspan"><span>wide, abbreviated, narrow</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>dayWidth </span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>field</span></span></td>
<td><span class="removedspan"><span>era, year, month, week, day, weekday, dayperiod, hour, minute, second, zone</span></span></td>
</tr>
<tr>
<td><span class="removedspan"><span>zone</span></span></td>
<td><span class="removedspan"><span><i>The order of prefixes is: </i>America, Atlantic, Europe, Africa, Asia,
Indian, Australia, Pacific, Arctic, Antarctica, Etc. <i>Within the same prefix, sort first
by longitude, then latitude (both given by the zone.tab file in the </i></span>
</span><i>
<span class="removedspan">TZ</span></i><span class="removedspan"><span><i> database), then by full tzid.</i></span></span></td>
</tr>
<tr>
<td colspan="3"><span class="removedspan"><span><i>numeric order</i></span></span></td>
</tr>
<tr>
<td colspan="3"><span class="removedspan"><span><i>alphabetic order</i></span></span></td>
</tr>
</table>
</blockquote>
<h3><span><a name="Defaulted_Values_Table">Defaulted Values Table</a></span></h3>
<p><span class="changed">The defaulted attributes are given by the <i>
suppress</i> table in the supplemental metadata. There is one special value,
_q, that is used internally on serial elements to preserve ordering.</span></p>
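<p><span>Suppressing defaulted attributes can be sketched as a lookup keyed by element and attribute name. The function name and table layout are hypothetical, and the three entries are only a sample of defaults; the authoritative list is the suppress table in the supplemental metadata.</span></p>

```python
# Sample (element, attribute) -> default value entries; illustrative only.
SUPPRESS = {
    ("weekendStart", "time"): "00:00",
    ("weekendEnd", "time"): "24:00",
    ("dateFormat", "type"): "standard",
}


def suppress_defaults(element, attributes):
    """Return a copy of attributes without those that carry their default value."""
    return {
        name: value
        for name, value in attributes.items()
        if SUPPRESS.get((element, name)) != value
    }


kept = suppress_defaults("weekendStart", {"day": "sat", "time": "00:00"})
```

<p><span>Here <code>kept</code> contains only the <code>day</code> attribute, since <code>time="00:00"</code> is the default for <code>weekendStart</code>; a non-default time would be retained.</span></p>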
<blockquote>
<table border="1" cellpadding="2" cellspacing="0">
<tr>
<td width="33%"><span class="removedspan"><span>ldml </span></span></td>
<td width="33%"><span class="removedspan"><span>version </span></span></td>
<td width="33%"><span class="removedspan"><span>"1.2"</span></span></td>
</tr>
<tr>
<td rowspan="2" width="33%"><span class="removedspan"><span><i>orientation </i></span>
</span></td>
<td width="33%"><span class="removedspan"><span><i>characters </i></span>
</span></td>
<td width="33%"><span class="removedspan"><span><i>"left-to-right"</i></span></span></td>
</tr>
<tr>
<td width="33%"><span class="removedspan"><span><i>lines </i></span>
</span></td>
<td width="33%"><span class="removedspan"><span><i>"top-to-bottom"</i></span></span></td>
</tr>
<tr>
<td width="33%"><span class="removedspan"><span>weekendStart </span>
</span></td>
<td rowspan="2" width="33%"><span class="removedspan"><span>time </span>
</span></td>
<td width="33%"><span class="removedspan"><span>"00:00"</span></span></td>
</tr>
<tr>
<td width="33%"><span class="removedspan"><span>weekendEnd </span>
</span></td>
<td width="33%"><span class="removedspan"><span>"24:00"</span></span></td>
</tr>
<tr>
<td width="33%"><span class="removedspan"><span>dateFormat </span>
</span></td>
<td rowspan="10" width="33%"><span class="removedspan"><span>type</span></span></td>
<td rowspan="10" width="33%"><span class="removedspan"><span>"standard"</span></span></td>
</tr>
<tr>
<td width="33%"><span class="removedspan"><span>timeFormat </span>
</span></td>
</tr>
<tr>
<td width="33%"><span class="removedspan"><span>dateTimeFormat </span>
</span></td>
</tr>
<tr>
<td width="33%"><span class="removedspan"><span>decimalFormat </span>
</span></td>
</tr>
<tr>
<td width="33%"><span class="removedspan"><span>scientificFormat </span>
</span></td>
</tr>
<tr>
<td width="33%"><span class="removedspan"><span>percentFormat </span>
</span></td>
</tr>
<tr>
<td width="33%"><span class="removedspan"><span>currencyFormat </span>
</span></td>
</tr>
<tr>
<td width="33%"><span class="removedspan"><span>pattern </span></span></td>
</tr>
<tr>
<td width="33%"><span class="removedspan"><span>currency </span></span></td>
</tr>
<tr>
<td width="33%"><span class="removedspan"><span>collation</span></span></td>
</tr>
</table>
</blockquote>
<h2><a name="Coverage_Levels">Appendix M: Coverage Levels</a></h2>
<p><span class="changedspan">The following defines the coverage levels:</span></p>
<table border="1" cellpadding="0" cellspacing="1" style="border-collapse: collapse; margin-top: 0.5li; margin-bottom: 0.5li" bordercolor="#111111" id="AutoNumber6">
<tr>
<th nowrap><span class="changedspan">Level</span></th>
<th colspan="2"><span class="changedspan">Description</span></th>
</tr>
<tr>
<td nowrap><span class="changedspan">level 100</span></td>
<td><span class="changedspan">comprehensive</span></td>
<td><span class="changedspan">Has complete localizations (or valid inheritance) for every
possible field</span></td>
</tr>
<tr>
<td nowrap><span class="changedspan">level 80</span></td>
<td><span class="changedspan">modern</span></td>
<td rowspan="3"><span class="changedspan">Localizations (or valid inheritance) as given below</span></td>
</tr>
<tr>
<td nowrap><span class="changedspan">level 60</span></td>
<td><span class="changedspan">moderate</span></td>
</tr>
<tr>
<td nowrap><span class="changedspan">level 40</span></td>
<td><span class="changedspan">basic</span></td>
</tr>
<tr>
<td nowrap><span class="changedspan">level 20</span></td>
<td><span class="changedspan">posix</span></td>
<td><span class="changedspan">Only what is required for POSIX generation; for example, only one
country name, only one currency symbol, etc.</span></td>
</tr>
<tr>
<td nowrap><span class="changedspan">level 0</span></td>
<td><span class="changedspan">rudimentary</span></td>
<td><span class="changedspan">Doesn't meet any of the above levels. (default, if nothing
specified) </span></td>
</tr>
</table>
<p><span class="changedspan">Levels 40 and 60 are based on the following definitions and
specifications.</span></p>
<h3><span class="changedspan">Definitions</span></h3>
<ul>
<li><span class="changedspan"><i>Target-Language</i> is the language under consideration. </span>
</li>
<li><span class="changedspan"><i>Target-Territories</i> is the list of territories found by
looking up <i>Target-Language</i> in the <languageData> elements in
<a href="http://unicode.org/cldr/data/common/main/supplementalData.xml">supplementalData.xml</a>
</span></li>
<li><span class="changedspan"><i>Language-List</i> is <i>Target-Language</i>, plus </span>
<ul>
<li><span class="changedspan"><b>basic: </b>Chinese, English, French, German, Italian,
Japanese, Portuguese, Russian, Spanish (de, en, es, fr, it, ja, pt, ru, zh)</span></li>
<li><span class="changedspan"><b>moderate: </b>basic + Arabic, Hindi, Korean, Indonesian,
Dutch, Bengali, Turkish, Thai, Polish (ar, hi, ko, in, nl, bn, tr, th, pl). If an EU language,
add the remaining official EU languages, currently: Danish, Greek, Finnish, Swedish, Czech,
Estonian, Latvian, Lithuanian, Hungarian, Maltese, Slovak, Slovene (da, el, fi, sv, cs, et, lv,
lt, hu, mt, sk, sl)</span></li>
<li><span class="changedspan"><b>modern:</b> all languages that are official or major
commercial languages of modern territories</span></li>
</ul>
</li>
<li><span class="changedspan"><i>Target-Scripts </i>is the list of scripts in which <i>
Target-Language</i> can be customarily written (found by looking up <i>Target-Language</i> in
the <languageData> elements in
<a href="http://unicode.org/cldr/data/common/main/supplementalData.xml">supplementalData.xml</a>)<i>.</i></span></li>
<li><span class="changedspan"><i>Script-List</i> is the <i>Target-Scripts</i> plus the major
scripts used for multiple languages</span><ul>
<li><span class="changedspan">Latin, Simplified Chinese, Traditional Chinese, Cyrillic, Arabic
(Latn, Hans, Hant, Cyrl, Arab)</span></li>
</ul>
</li>
<li><span class="changedspan"><i>Territory-List</i> is the list of territories formed by taking
the <i>Target-Territories</i> and adding: </span>
<ul>
<li><span class="changedspan"><b>basic: </b>Brazil, China, France, Germany, India, Italy,
Japan, Russia, United Kingdom, United States (BR, CN, DE, GB, FR, IN, IT, JP, RU, US)</span></li>
<li><span class="changedspan"><b>moderate: </b>basic + Spain, Canada, Korea, Mexico,
Australia, Netherlands, Switzerland, Belgium, Sweden, Turkey, Austria, Indonesia, Saudi
Arabia, Norway, Denmark, Poland, South Africa, Greece, Finland, Ireland, Portugal, Thailand,
Hong Kong SAR China, Taiwan (ES, BE, SE, TR, AT, ID, S