[Unicode] Common Locale Data Repository: Bug Tracking

CLDR Ticket #11339 (new)

Opened 6 months ago

Last modified 3 months ago

Punctuation needs categorization like letters (in Core data / Alphabetic information)

Reported by: Marcel Schneider <charupdate@…>
Owned by: tbishop
Component: surveytool-UI
Phase: dsub
Data Locale:
Review:
Weeks:
Data Xpath:


Currently there is only one item for punctuation, covering both publishing style (typographic punctuation) and draft style (ASCII fallbacks).

Furthermore, there is no dedicated place to account for compatibility punctuation such as U+2010 and U+2015. To date these are mainly used when attempting to be Unicode-conformant with respect to specifications that were designed largely in disconnect from practice and only later given a meaning (wrongly for U+2010, accurately for U+2015).

We should also have ornamental punctuation. Part of it is found in the General Punctuation block (such as the reversed quotation marks, sometimes used oversized in DTP and web design to highlight a quotation), but most of it is in special ornamental punctuation ranges (U+2753.., U+2761.., U+1F676..).

CLDR should provide all of that as per-locale, per-category subsets. Instead, in fr-FR we still have a seemingly random mix of various signs that does not correspond to any of the above use cases.


Given that punctuation is part of writing systems rather than of alphabets, CLDR should redesign the section, replacing "Alphabetic Information" with "Writing System" (since there are as many sublocales as writing systems used in a locale). This term would encompass the more pragmatic label "Characters In Use", currently the only subsection header ("in", as a stop word, is usually not title-cased). The first subsection would then be called "Alphabetic Information", with the lines "Main Letters", "Auxiliary" (reworded), and "Index" (reworded).

The current "Others: numbers" line should probably be moved to the next section, "Numbering systems", which currently has only one subsection ("Numbering system", which should be plural) with two lines.

The second subsection in "Writing System" would then be "Punctuation", with five lines: "Main punctuation" (publishing style), "Auxiliary", "Draft style", "Compatibility", and "Ornamental".
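
To make the proposed layout easier to review, it can be written down as a nested structure. The Python sketch below is illustrative only: `PROPOSED_LAYOUT` and `outline` are hypothetical names, not part of CLDR or the Survey Tool, and only the labels named in this ticket are included (the "Numbering systems" lines are not named here, so they are left out).

```python
# Hypothetical outline of the layout proposed in this ticket.
# Not an actual CLDR or Survey Tool data structure.
PROPOSED_LAYOUT = {
    "Writing System": {
        "Alphabetic Information": ["Main Letters", "Auxiliary", "Index"],
        "Punctuation": [
            "Main punctuation",
            "Auxiliary",
            "Draft style",
            "Compatibility",
            "Ornamental",
        ],
    },
}


def outline(layout):
    """Return the layout as a list of indented lines (section, subsection, line)."""
    lines = []
    for section, subsections in layout.items():
        lines.append(section)
        for subsection, rows in subsections.items():
            lines.append("  " + subsection)
            lines.extend("    " + row for row in rows)
    return lines


if __name__ == "__main__":
    print("\n".join(outline(PROPOSED_LAYOUT)))
```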

Example of punctuation

For example, in fr-FR, we can have the following punctuation marks:

  • Main punctuation: ! \& ( ) * , \- . / \: ; ? \[ \] § « » U+2011 U+2013 U+2014 ’ “ ” † ‡ … ‹ ›
  • Auxiliary: # @ _ ‘
  • Draft style: " '
  • Compatibility: U+2010 U+2012 U+2015
  • Ornamental: U+201B, U+201F, U+2753..U+2757, U+2762, U+2763, U+1F676..U+1F67C
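
The per-locale, per-category idea can be sketched as a small lookup table. The Python below is a minimal illustration under stated assumptions, not CLDR data or a CLDR API: the names `expand`, `FR_PUNCTUATION`, and `categories_of` are hypothetical, the category contents are transcribed from the fr-FR example above, and the backslash escapes in the list (such as `\-` and `\[`) are assumed to denote the literal characters.

```python
# Hypothetical per-category punctuation sets for fr-FR, transcribed from
# the example in this ticket. This is not actual CLDR data.


def expand(spec):
    """Expand tokens like '!', 'U+2010' or 'U+2753..U+2757' into a set of characters."""
    chars = set()
    for token in spec.split():
        if token.startswith("U+"):
            start, _, end = token.partition("..")
            first = int(start[2:], 16)
            last = int(end[2:], 16) if end else first
            chars.update(chr(cp) for cp in range(first, last + 1))
        else:
            chars.update(token)  # literal character(s)
    return chars


FR_PUNCTUATION = {
    "main": expand(
        "! & ( ) * , - . / : ; ? [ ] § « » U+2011 U+2013 U+2014 ’ “ ” † ‡ … ‹ ›"
    ),
    "auxiliary": expand("# @ _ ‘"),
    "draft": expand("\" '"),
    "compatibility": expand("U+2010 U+2012 U+2015"),
    "ornamental": expand(
        "U+201B U+201F U+2753..U+2757 U+2762 U+2763 U+1F676..U+1F67C"
    ),
}


def categories_of(char, table=FR_PUNCTUATION):
    """Return the sorted names of all categories that contain the character."""
    return sorted(name for name, chars in table.items() if char in chars)
```

With this table, `categories_of("\u2015")` returns `["compatibility"]`, and a character outside all five sets returns an empty list. Keeping one set per category lets a consumer pick exactly the subset it needs, for instance only the draft-style fallbacks.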


Change History

comment:1 Changed 4 months ago by mark

  • Component changed from main to other

comment:2 Changed 3 months ago by mark

  • Milestone changed from UNSCH to to-assess

comment:3 Changed 3 months ago by mark

  • Owner changed from anybody to tbishop
  • Component changed from other to survey-frontend
