CLDR Ticket #3976 (closed defect: fixed)

Opened 7 years ago

Last modified 6 years ago

Add data/spec for transform specification in language tags

Reported by: mark | Owned by: mark
Component: main | Data Locale:
Phase: | Review: yoshito
Weeks: 0.4 | Data Xpath:




Description (last modified by mark)

   The subtags in the 't' extension are of the following form:

     | Label  | ABNF                    | Comment                    |
     | t_ext= | "t-"                    | Extension                  |
     |        | [lang]                  | Source                     |
     |        | *("-" field)            | Optional information       |
     | lang=  | language                | [BCP47], with restrictions |
     |        | ["-" script]            |                            |
     |        | ["-" region]            |                            |
     |        | *("-" variant)          |                            |
     | field= | sep 1*("-" 3*8alphanum) | With restrictions          |
     | sep=   | 1ALPHA 1DIGIT           | Subtag separators          |
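
As an illustration of the grammar in the table above, here is a minimal Python sketch that splits a tag's 't' extension into its optional source language subtags and its fields. The names `split_t_extension` and `parse_tag` are hypothetical helpers, not part of CLDR or any library. The sketch assumes the tag is otherwise well formed, compares subtags in lowercase, and does not validate the source language against [BCP47].

```python
import re

# sep = 1ALPHA 1DIGIT (for example "m0"); subtags are compared in lowercase.
SEP = re.compile(r"[a-z][0-9]")
# Field subtags are 3*8alphanum.
FIELD_SUBTAG = re.compile(r"[a-z0-9]{3,8}")


def split_t_extension(subtags):
    """Split the subtags that follow the 't' singleton into (source, fields).

    `fields` is a list of (separator, [subtags]) pairs in tag order.
    Source language subtags are collected but not validated.
    """
    if not subtags:
        raise ValueError("the 't' extension needs at least one subtag")
    source, fields = [], []
    for subtag in subtags:
        if SEP.fullmatch(subtag):
            fields.append((subtag, []))  # a separator opens a new field
        elif fields:
            if not FIELD_SUBTAG.fullmatch(subtag):
                raise ValueError(f"bad field subtag: {subtag!r}")
            fields[-1][1].append(subtag)
        else:
            source.append(subtag)  # still inside the optional source tag
    for sep, values in fields:
        if not values:
            raise ValueError(f"field {sep!r} has no subtags")
    return source, fields


def parse_tag(tag):
    """Return (target_subtags, source_subtags, fields) for a tag with a 't' extension."""
    parts = tag.lower().split("-")
    if "t" not in parts[1:]:
        raise ValueError("no 't' extension found")
    start = parts.index("t", 1)
    rest = parts[start + 1:]
    # Assumption: any later one-character subtag starts another extension.
    for i, subtag in enumerate(rest):
        if len(subtag) == 1:
            rest = rest[:i]
            break
    source, fields = split_t_extension(rest)
    return parts[:start], source, fields
```

Under these assumptions, `parse_tag("ja-t-it-m0-xxx-v21a")` returns `(['ja'], ['it'], [('m0', ['xxx', 'v21a'])])`. Keeping the fields as an ordered list of pairs, rather than a dictionary, leaves repeated separators visible so that a later check can report them.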

   Description and restrictions:

   a.  The 't' extension MUST have at least one subtag.

   b.  The 't' extension normally starts with a source language tag,
       which MUST be a regular, canonical language tag as specified by
       [BCP47].  Tags described by the 'irregular' production in BCP 47
       MUST NOT be used to form the language tag.  The source language

Davis, et al.           Expires December 18, 2011               [Page 6]
Internet-Draft         BCP 47 Transform Extension              June 2011

       tag MAY be omitted: some field values do not require it.

   c.  There is optionally a sequence of fields, where each field is a
       separator followed by a sequence of subtags.  Two identical
       separators MUST NOT be present.

   d.  One field is initially specified in [UTS35]: the transform
       mechanism.

       A.  The transform mechanism consists of a sequence of subtags
           starting with the 'm0' separator followed by one or more
           mechanism subtags.  Each mechanism subtag has a length of 3
           to 8 alphanumeric characters.  The sequence as a whole
           provides an identification of the specification for the
           transform, such as the mechanism subtag 'UNGEGN' in
           "und-Cyrl-t-und-latn-m0-ungegn".  In many cases, only one
           mechanism subtag is necessary, but multiple subtags MAY be
           defined in [UTS35] where necessary.

       B.  Any purely numeric subtag is a representation of a date in
           the Gregorian calendar.  It MAY occur in any mechanism field.
           If it does occur:

           +  it MUST occur as the final subtag in the field,

           +  it MUST NOT be the only subtag in the field, and

           +  it MUST consist of a sequence of digits of the form YYYY,
              YYYYMM, or YYYYMMDD.

           For example, 20110623 represents June 23rd, 2011.  A date
           subtag SHOULD only be used where necessary, and then SHOULD
           be as short as possible.  For example, suppose that the BGN
           transliteration specification for Cyrillic to Latin had three
           versions, dated June 11th, 1999; Dec 30th, 1999; and May 1st,
           2011.  In that case, the corresponding first two DATE subtags
           would require months to be distinctive (199906 and 199912),
           but the last subtag would only require the year (2011).

       C.  Some mechanisms may use a versioning system that is not
           distinguished by date, or not by date alone.  In the latter
           case, the version will be of a form specified by [UTS35] for
           that mechanism.  For example, if the mechanism XXX uses
           versions of the form v21a, then a tag could look like
           "ja-t-it-m0-xxx-v21a".  If there are multiple subversions
           distinguished by date, then a tag could look like "ja-t-it-

[updated according to first draft]
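
The restrictions in the draft text above (no repeated separators, and the rules for purely numeric date subtags) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and fields are assumed to be already split into (separator, [subtags]) pairs of lowercase strings. The last helper reproduces the BGN example by picking, for each version date, the shortest of YYYY, YYYYMM, or YYYYMMDD that no other version shares.

```python
from datetime import date


def check_unique_separators(fields):
    """Restriction c: two identical separators must not be present."""
    seps = [sep for sep, _ in fields]
    dupes = sorted({s for s in seps if seps.count(s) > 1})
    if dupes:
        raise ValueError(f"repeated separators: {dupes}")


def check_field_dates(values):
    """Check the date-subtag rules for the subtags of one mechanism field."""
    for i, subtag in enumerate(values):
        if not (subtag.isascii() and subtag.isdigit()):
            continue  # only purely numeric subtags are dates
        if i != len(values) - 1:
            raise ValueError("a date subtag must be the final subtag")
        if len(values) == 1:
            raise ValueError("a date subtag must not be the only subtag")
        if len(subtag) not in (4, 6, 8):
            raise ValueError("a date must be YYYY, YYYYMM, or YYYYMMDD")
        year = int(subtag[:4])
        month = int(subtag[4:6]) if len(subtag) >= 6 else 1
        day = int(subtag[6:8]) if len(subtag) == 8 else 1
        date(year, month, day)  # raises ValueError for an impossible date


def shortest_date_subtags(dates):
    """For each date, the shortest YYYY / YYYYMM / YYYYMMDD form unique in `dates`."""
    full = [f"{d.year:04d}{d.month:02d}{d.day:02d}" for d in dates]
    result = []
    for f in full:
        for n in (4, 6, 8):
            prefix = f[:n]
            if sum(1 for g in full if g[:n] == prefix) == 1:
                break  # no other version shares this prefix
        result.append(prefix)
    return result


# The three hypothetical BGN version dates from the example in the text.
versions = [date(1999, 6, 11), date(1999, 12, 30), date(2011, 5, 1)]
print(shortest_date_subtags(versions))  # ['199906', '199912', '2011']
```

The date check treats a missing month or day as 1 purely so that the standard `datetime.date` constructor can reject impossible values such as month 13. That is a convenience of this sketch, not something the draft specifies.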

This will need to be added to the spec, and the DTD for bcp47 extended to distinguish 'u' data from 't' data.


Change History

comment:1 Changed 7 years ago by mark

  • Description modified

comment:2 Changed 7 years ago by mark

  • Owner changed from somebody to mark
  • Priority changed from assess to major
  • Status changed from new to assigned
  • Component changed from unknown to data
  • Milestone changed from UNSCH to 2.0.1

Approved by committee.

comment:3 Changed 7 years ago by srl

The 'ts' seems to correspond to what is defined in the DTD as 'variant': <transform variant="BGN" ... >

comment:4 Changed 7 years ago by mark

  • Description modified
  • Summary changed from Add key-type for transliteration specification to Add data/spec for transform specification in language tags

comment:5 Changed 7 years ago by mark

  • Milestone changed from 2.0.1 to soon

This is undergoing review in the IETF, so it is premature to include it in 2.0.1.

comment:6 Changed 7 years ago by mark

  • Xref set to 3398, 3012, 3976

comment:7 Changed 6 years ago by mark

  • Priority changed from major to critical
  • Status changed from assigned to accepted
  • Milestone changed from soon to 21

The -t- extension has been approved, and we need to finalize the initial data for the release. It should include everything that we need for backwards compatibility for the transforms.

comment:8 Changed 6 years ago by mark

  • Keywords google added

comment:10 Changed 6 years ago by mark

  • Weeks set to 0.4

comment:11 Changed 6 years ago by mark

  • Review set to yoshito

comment:12 Changed 6 years ago by yoshito

  • Review yoshito deleted

comment:13 Changed 6 years ago by mark

  • Review set to yoshito

comment:14 Changed 6 years ago by yoshito

  • Status changed from accepted to closed
  • Resolution set to fixed
