Unicode Common Locale Data Repository: Bug Tracking

CLDR Ticket #7395 (accepted data)

Opened 3 years ago

Last modified 2 years ago

collation rules: improve formatting

Reported by: markus
Owned by: markus
Component: collation
Phase: rc
Weeks: 0.2

Description

I would like to change the formatting of collation tailoring rules to make them more readable. In particular, I would like to add LRMs (U+200E LEFT-TO-RIGHT MARK) to lines with literal RTL characters, and to reduce or remove indentation so that rules and comments fit better in editors.

For example, in root.xml type=search, change from

					# root search rules for Arabic, Hebrew
					&ا	# 0627 ARABIC LETTER ALEF
							<<<ﺎ<<<ﺍ	# FE8E, FE8D: FINAL FORM, ISOLATED FORM
						<<آ		# 0622 ARABIC LETTER ALEF WITH MADDA ABOVE
					&[last primary ignorable]<<׳	# 05F3 HEBREW PUNCTUATION GERESH

to

# root search rules for Arabic, Hebrew
‎&ا	# 0627 ARABIC LETTER ALEF
‎<<<ﺎ<<<‎ﺍ	# FE8E, FE8D: FINAL FORM, ISOLATED FORM
‎<<آ‎		# 0622 ARABIC LETTER ALEF WITH MADDA ABOVE
‎&[last primary ignorable]<<׳‎	# 05F3 HEBREW PUNCTUATION GERESH

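The change above could be applied mechanically. The following is only a rough sketch, not the tool actually used for CLDR: the helper names `has_rtl` and `reformat_rule_line` are hypothetical, and it only strips leading indentation and prefixes an LRM to lines containing a character of bidi class R or AL. It does not reproduce the additional LRMs placed after some RTL characters in the example.

```python
import unicodedata

LRM = "\u200E"  # LEFT-TO-RIGHT MARK


def has_rtl(text):
    """Return True if text contains a strongly right-to-left character (bidi class R or AL)."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


def reformat_rule_line(line):
    """Hypothetical helper: drop leading indentation; prefix an LRM when the line has RTL characters."""
    stripped = line.lstrip(" \t")
    if has_rtl(stripped) and not stripped.startswith(LRM):
        return LRM + stripped
    return stripped
```

For instance, `reformat_rule_line("\t\t\t&\u0627\t# 0627 ARABIC LETTER ALEF")` drops the three leading tabs and puts an LRM in front of the `&`, while a pure-ASCII comment line only loses its indentation.
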
Change History

comment:1 Changed 3 years ago by emmons

  • Owner changed from anybody to markus
  • Priority changed from assess to medium
  • Status changed from new to assigned
  • Milestone changed from UNSCH to 26rc

comment:2 Changed 3 years ago by emmons

Need to document (in tr35) that RLM and LRM are to be ignored in the rules.

comment:3 (in reply to comment:2) Changed 3 years ago by markus

Replying to emmons:

Need to document (in tr35) that RLM and LRM are to be ignored in the rules.

Rules and patterns in ICU and CLDR have long treated RLM and LRM as "rule white space", and we proposed a Unicode property for that, which was accepted as Pattern_White_Space. See http://www.unicode.org/reports/tr31/#Pattern_Syntax

The LDML Collation spec says "Unicode Pattern_White_Space characters between tokens are ignored."
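
As an illustration of why the added LRMs do not change the rules, here is a minimal sketch that treats the Pattern_White_Space code points listed in UAX #31 as ignorable. The function name `strip_pattern_white_space` is hypothetical, and the sketch is a simplification: it removes these characters everywhere in the string, whereas the spec only ignores them between tokens (so quoted or escaped white space would need separate handling in a real parser).

```python
# Pattern_White_Space code points per UAX #31:
# U+0009..U+000D, U+0020, U+0085, U+200E (LRM), U+200F (RLM), U+2028, U+2029
PATTERN_WHITE_SPACE = frozenset(
    [chr(cp) for cp in range(0x0009, 0x000E)]
    + ["\u0020", "\u0085", "\u200E", "\u200F", "\u2028", "\u2029"]
)


def strip_pattern_white_space(text):
    """Hypothetical helper: remove every Pattern_White_Space character from text."""
    return "".join(ch for ch in text if ch not in PATTERN_WHITE_SPACE)


# The old (indented) and new (LRM-prefixed) forms of one rule line, without comments.
old_form = "\t\t\t\t\t&\u0627\t"
new_form = "\u200E&\u0627\t"
same_rule = strip_pattern_white_space(old_form) == strip_pattern_white_space(new_form)
```

Under this simplification both forms reduce to the same token sequence, so `same_rule` is `True`.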

comment:4 Changed 3 years ago by markus

  • Milestone changed from 26rc to 27rc

comment:5 Changed 3 years ago by markus

  • Phase set to rc
  • Milestone changed from 27rc to 27

comment:6 Changed 3 years ago by markus

  • Milestone changed from 27 to 28

comment:7 Changed 3 years ago by markus

  • Type changed from enhancement to data

comment:8 Changed 2 years ago by srl

  • Status changed from assigned to accepted

comment:9 Changed 2 years ago by markus

  • Milestone changed from 28 to UNSCH