CLDR Ticket #8858 (accepted data)
|Reported by:||emmons||Owned by:||emmons|
I did some preliminary analysis and found that the CheckWidths test takes the most time of any of CLDR's data tests, mostly because it does a lot of inefficient regex lookups. I recommend the following:
1). Use the STAR_PATTERN_LOOKUP algorithm for lookups in this test, similar to what we do for coverage.
2). Use just a single Limit for each regex instead of an array. There's no place where we're currently using an array with a size > 1.
3). Change the "aliased and comprehensive" check to just a check for narrow units, since the current check requires a coverage lookup, which is also slow. The test comment says:
// This was put in specifically to deal with the fact that we added a bunch of new units in CLDR 26
// and didn't put the narrow forms of them into modern coverage. If/when the narrow forms of all units
// are modern coverage, then we can safely remove the aliasedAndComprehensive check. Right now if an
// item is aliased and coverage is comprehensive, then it can't generate anything worse than a warning.
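To illustrate recommendations 1) and 2), here is a rough sketch of the general idea behind a star-pattern lookup: replace every attribute value in the path with "*", use the result as a hash key, and run only the few regexes stored under that key instead of scanning every regex. This is not CLDR's actual code. The class `StarPatternSketch`, the `Rule` type, the method names, and the example path and width are all hypothetical, and the single `maxWidth` field stands in for "one Limit per regex".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class StarPatternSketch {
    // Matches one quoted attribute value, e.g. "duration-hour".
    private static final Pattern ATTRIBUTE_VALUE = Pattern.compile("\"[^\"]*\"");

    /** Hypothetical rule: one regex with a single limit (no array of limits). */
    static final class Rule {
        final Pattern regex;
        final int maxWidth;

        Rule(String regex, int maxWidth) {
            this.regex = Pattern.compile(regex);
            this.maxWidth = maxWidth;
        }
    }

    // Rules grouped by the starred form of the paths they can match.
    private final Map<String, List<Rule>> rulesByStarPattern = new HashMap<>();

    /** Replaces every quoted attribute value in the path with "*". */
    static String toStarPattern(String path) {
        return ATTRIBUTE_VALUE.matcher(path).replaceAll("\"*\"");
    }

    /** Registers a rule under the star pattern shared by all paths it can match. */
    void add(String starPattern, String regex, int maxWidth) {
        rulesByStarPattern
            .computeIfAbsent(starPattern, k -> new ArrayList<>())
            .add(new Rule(regex, maxWidth));
    }

    /** One hash lookup, then regex matching only against the candidates in that bucket. */
    Rule find(String path) {
        List<Rule> candidates = rulesByStarPattern.get(toStarPattern(path));
        if (candidates == null) {
            return null;
        }
        for (Rule rule : candidates) {
            if (rule.regex.matcher(path).matches()) {
                return rule;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        StarPatternSketch lookup = new StarPatternSketch();
        // Hypothetical rule for narrow unit patterns; the width value is made up.
        lookup.add(
            "//ldml/units/unitLength[@type=\"*\"]/unit[@type=\"*\"]/unitPattern[@count=\"*\"]",
            "//ldml/units/unitLength\\[@type=\"narrow\"\\]/unit\\[@type=\"[^\"]*\"\\]"
                + "/unitPattern\\[@count=\"[^\"]*\"\\]",
            3);
        Rule rule = lookup.find(
            "//ldml/units/unitLength[@type=\"narrow\"]/unit[@type=\"duration-hour\"]"
                + "/unitPattern[@count=\"one\"]");
        System.out.println(rule == null ? "no rule" : "maxWidth=" + rule.maxWidth);
    }
}
```

The cost per path becomes one string rewrite and one map lookup plus a handful of regex matches, rather than one regex match per registered rule. Whether that matches the real bottleneck in CheckWidths would need to be confirmed by profiling.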
- Status changed from new to accepted
- Component changed from unknown to perf
- Priority changed from assess to major
- Phase changed from dsub to rc
- Milestone changed from UNSCH to 28
- Owner changed from anybody to emmons
- Type changed from unknown to data