Unicode Common Locale Data Repository: Bug Tracking

CLDR Ticket #4637 (closed defect: fixed)

Opened 6 years ago

Last modified 5 years ago

Coverage structural improvements

Reported by: mark
Owned by: emmons
Component: main
Data Locale:
Phase:
Review: pedberg
Weeks:
Data Xpath:
Xref:

Description

Every separate match against a regex takes some time, so splitting one rule across several patterns adds cost. Example:

		<coverageLevel value="100" match="localeDisplayNames/scripts/script[@type='(Armi|Avst|Bali|Bamu|Batk|Blis|Brah|Buhd|Cakm|Cans|Cari|Cham|Cher|Cirt|Copt|Cprt|Cyrs|Dsrt|Egy[dhp]|Geok|Glag|Goth|Gran)']"/>
		<coverageLevel value="100" match="localeDisplayNames/scripts/script[@type='(Hano|Hmng|Hrkt|Hung|Inds|Ital|Java|Kali|Khar|Khoj|Kthi|Lana|Lat[fg]|Lepc|Limb|Lin[ab]|Lisu|Ly[cd]i|Man[di]|Maya|Mer[co])']"/>
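To illustrate the point, here is a minimal Python sketch (not the CLDR tooling itself) showing that the two alternation lists above can be folded into one compiled pattern, so each script code needs a single match instead of up to two. The helper names `matches_separately` and `matches_combined` are hypothetical, and the sketch matches bare script codes rather than full paths.

```python
import re

# Alternation lists copied from the two coverageLevel rules above.
FIRST = ("Armi|Avst|Bali|Bamu|Batk|Blis|Brah|Buhd|Cakm|Cans|Cari|Cham|Cher|"
         "Cirt|Copt|Cprt|Cyrs|Dsrt|Egy[dhp]|Geok|Glag|Goth|Gran")
SECOND = ("Hano|Hmng|Hrkt|Hung|Inds|Ital|Java|Kali|Khar|Khoj|Kthi|Lana|"
          "Lat[fg]|Lepc|Limb|Lin[ab]|Lisu|Ly[cd]i|Man[di]|Maya|Mer[co]")

# Two patterns: a lookup may have to try both before it finds a hit.
separate = [re.compile(FIRST), re.compile(SECOND)]

# One pattern: the same set of codes, tested with a single match call.
combined = re.compile(f"{FIRST}|{SECOND}")


def matches_separately(code: str) -> bool:
    """Return True if any of the separate patterns matches the whole code."""
    return any(p.fullmatch(code) for p in separate)


def matches_combined(code: str) -> bool:
    """Return True if the single combined pattern matches the whole code."""
    return combined.fullmatch(code) is not None
```

Both functions accept the same set of codes; the combined form simply does it with one regex evaluation per input.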

Complex expressions can be hard to make out, and harder to maintain. Example:

		<coverageLevel inTerritory="(AE|AF|BH|DJ|DZ|EG|EH|ER|IL|IQ|IR|JO|KM|KW|LB|LY|MA|MR|OM|PS|QA|SA|SD|SY|TN|YE)" value="40" match="dates/calendars/calendar[@type='islamic']/months/monthContext[@type='(format|stand-alone)']/monthWidth[@type='(wide|abbreviated|narrow)']/month[@type='[^']++']"/>
		<coverageLevel inTerritory="(AE|AF|BH|DJ|DZ|EG|EH|ER|IL|IQ|IR|JO|KM|KW|LB|LY|MA|MR|OM|PS|QA|SA|SD|SY|TN|YE)" value="40" match="dates/calendars/calendar[@type='islamic']/eras/eraAbbr/era[@type='0']"/>

We can improve both of these by using variables, which are supported by RegexLookup.

We could structure this as:

<variable key="%script100" value ="Armi|Avst|Bali|Bamu|Batk|Blis|Brah|Buhd|Cakm|Cans|Cari|Cham|Cher|Cirt|Copt|Cprt|...
<variable key="%terrIslamic" value="AE|AF|BH|DJ|DZ|EG|EH|ER|IL|IQ|IR|JO|KM|KW|LB|LY|MA|MR|OM|PS|QA|SA|SD|SY|TN|YE"/>
...

<coverageLevel value="100" match="localeDisplayNames/scripts/script[@type='(%script100)']"/>

<coverageLevel inTerritory="(%terrIslamic)" value="40" match="dates/calendars/calendar[@type='islamic']/months/monthContext[@type='(format|stand-alone)']/monthWidth[@type='(wide|abbreviated|narrow)']/month[@type='[^']++']"/>

We can also define %allwidths = wide|abbreviated|narrow, and so on, for readability.
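The substitution idea can be sketched in Python as follows. This is an illustration only, not how RegexLookup actually implements variables: the `VARIABLES` table and the `expand` helper are hypothetical, and the sketch assumes plain textual replacement of each `%name` before the pattern is compiled.

```python
import re

# Hypothetical variable table mirroring the proposed <variable> elements.
VARIABLES = {
    "%terrIslamic": ("AE|AF|BH|DJ|DZ|EG|EH|ER|IL|IQ|IR|JO|KM|KW|LB|LY|MA|"
                     "MR|OM|PS|QA|SA|SD|SY|TN|YE"),
    "%allwidths": "wide|abbreviated|narrow",
}


def expand(pattern: str, variables: dict[str, str]) -> str:
    """Replace each %variable in the pattern with its value.

    Longer keys are substituted first so that a key which is a prefix of
    another key cannot clobber it.
    """
    for key in sorted(variables, key=len, reverse=True):
        pattern = pattern.replace(key, variables[key])
    return pattern


# The short, readable forms expand to the full alternations before compiling.
territory = re.compile(expand("(%terrIslamic)", VARIABLES))
width = re.compile(expand("(%allwidths)", VARIABLES))
```

With this in place, `territory.fullmatch("EG")` succeeds while `territory.fullmatch("US")` returns `None`, and the territory list is maintained in exactly one place.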

Change History

comment:1 Changed 6 years ago by mark

  • Owner changed from anybody to emmons
  • Priority changed from assess to medium
  • Status changed from new to assigned
  • Component changed from unknown to data
  • Milestone changed from UNSCH to 22

comment:2 Changed 6 years ago by emmons

  • Status changed from assigned to accepted
  • Review set to mark

comment:3 Changed 5 years ago by mark

  • Review changed from mark to fredrik

comment:4 Changed 5 years ago by pedberg

  • Status changed from accepted to closed
  • Resolution set to fixed
  • Review changed from fredrik to pedberg