CLDR Ticket #9738 (accepted tools)

Opened 21 months ago

Last modified 3 months ago

Speed up RegexLookup

Reported by: mark
Owned by: emmons
Component: perf
Data Locale:
Phase: rc
Review:
Weeks:
Data Xpath:


Looking at the RegexTree code, I think it could be much faster.

Currently, it does a regex match on a bunch of prefixes as it descends.

Instead, process each of the items in the input list for the lookup.

For each, get the constant prefix: the longest initial string that doesn't contain any (non-literal) regex syntax or %. For example:


has the constant prefix


You can then build a data structure that uses these prefixes to pre-filter the lookup, avoiding a bunch of regex matches.
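The idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual RegexLookup code: the class `PrefixFilteredLookup`, its methods, and the `regexAttempts` counter are hypothetical names. It assumes patterns must match the whole input from its start, that the first matching pattern in insertion order wins, and that no pattern uses top-level alternation (`a|b`). The prefix scan is deliberately conservative: it stops at any regex metacharacter or `%`, including a backslash escape, even where a literal could safely be kept.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch of prefix pre-filtering; not the actual RegexLookup API. */
final class PrefixFilteredLookup<T> {
    // Characters that end the constant prefix: regex metacharacters plus '%'.
    private static final String SPECIAL = "\\[](){}.*+?^$|%";

    private static final class Entry<V> {
        final String prefix;
        final Pattern pattern;
        final V value;

        Entry(String prefix, Pattern pattern, V value) {
            this.prefix = prefix;
            this.pattern = pattern;
            this.value = value;
        }
    }

    private final List<Entry<T>> entries = new ArrayList<>();

    /** Number of regex matches actually attempted; kept only for illustration. */
    int regexAttempts = 0;

    /** Longest leading run of characters with no regex syntax and no '%'. */
    static String constantPrefix(String regex) {
        int end = 0;
        while (end < regex.length() && SPECIAL.indexOf(regex.charAt(end)) < 0) {
            end++;
        }
        // '?', '*' and '{' quantify the preceding character, so that
        // character is not guaranteed to appear in a matching input.
        if (end > 0 && end < regex.length() && "?*{".indexOf(regex.charAt(end)) >= 0) {
            end--;
        }
        return regex.substring(0, end);
    }

    void add(String regex, T value) {
        entries.add(new Entry<>(constantPrefix(regex), Pattern.compile(regex), value));
    }

    /** Returns the value of the first pattern matching the whole input, or null. */
    T get(String input) {
        for (Entry<T> e : entries) {
            if (!input.startsWith(e.prefix)) {
                continue; // cheap string check failed, so the regex is skipped
            }
            regexAttempts++;
            if (e.pattern.matcher(input).matches()) {
                return e.value;
            }
        }
        return null;
    }
}
```

A linear scan with `startsWith` is the simplest possible pre-filter; a trie or sorted map keyed on the prefixes would avoid even the string comparisons, at the cost of extra care to preserve match order.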


Change History

comment:1 Changed 17 months ago by pedberg

  • Status changed from new to accepted
  • Component changed from unknown to other
  • Priority changed from assess to medium
  • Milestone changed from UNSCH to 31
  • Owner changed from anybody to emmons
  • Type changed from unknown to tools

Discuss design with TC before committing

comment:2 Changed 17 months ago by pedberg

  • Cc mark added

comment:3 Changed 16 months ago by emmons

  • Phase changed from dsub to rc

comment:4 Changed 16 months ago by emmons

  • Owner emmons deleted
  • Status changed from accepted to new

Not going to get to this anytime soon.

comment:5 Changed 16 months ago by emmons

  • Owner set to emmons
  • Status changed from new to accepted
  • Milestone changed from 31 to 32

comment:6 Changed 8 months ago by emmons

  • Milestone changed from 32 to UNSCH

comment:7 Changed 3 months ago by mark

  • Component changed from other to perf
