[Unicode] Common Locale Data Repository : Bug Tracking

CLDR Ticket #11072 (closed: fixed)

Opened 12 months ago

Last modified 7 months ago

supplementalData.xml has a line containing 7000+ characters, and this breaks the parser.

Reported by: adam.farley@…
Owned by: emmons
Component: other-supplemental
Data Locale: https://www.unicode.org/repos/cldr/trunk/common/supplemental/supplementalData.xml
Phase: rc
Review: srl
Weeks:
Data Xpath:


Line 5204 in the file linked above is so long that vi on z/OS cannot handle it, nor can the Java XML parser parse that file on z/OS.


Would it be possible to reduce the length of the line to 2000 characters or less, somehow?


Change History

comment:1 Changed 12 months ago by srl

I measure the line as 9,505 bytes

<territoryCodes type="ZZ" numeric="999" alpha3="ZZZ" internet="AAA AARP ABARTH ABB ABBOTT…
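For anyone wanting to reproduce a measurement like this, here is a minimal Python sketch that reports every line over a byte limit. The function name `long_lines`, the 2000-byte default (taken from the request above), and the UTF-8 encoding are assumptions for illustration, not part of any CLDR tool.

```python
def long_lines(text, limit=2000, encoding="utf-8"):
    """Return (line_number, byte_length) for each line longer than `limit` bytes.

    Hypothetical helper: the name and defaults are illustrative only.
    Lines are split on "\n" so that line numbers match what an editor shows.
    """
    results = []
    for number, line in enumerate(text.split("\n"), start=1):
        size = len(line.encode(encoding))  # measure bytes, not characters
        if size > limit:
            results.append((number, size))
    return results


# Example use against a local copy of the file (path is an assumption):
# with open("supplementalData.xml", encoding="utf-8") as handle:
#     for number, size in long_lines(handle.read()):
#         print(number, size)
```

Measuring in bytes rather than characters matters here because the limits in question (editor buffers, parser input) are usually byte limits.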

comment:2 Changed 9 months ago by srl

Can we just get rid of the internet= attribute on this line? What benefit does it give to CLDR?

The list of TLD data is readily available at https://www.icann.org/resources/pages/tlds-2012-02-25-en - I don't think this belongs in CLDR.

The CLDR copy isn't even up to date. I'm OK with keeping tlds-alpha-by-domain.txt in the CLDR tools' data if it gets used for something (exemplar checks or whatever).

Last edited 9 months ago by srl

comment:3 Changed 8 months ago by adam.farley@…

Turns out this line is only a problem on z/OS if you open the file in vi (the line gets clipped and broken up by inserted line breaks), and it gets worse if you then save the file.

Edit line 1 and line 5204 changes. Madness.

If we can delete this attribute the problem goes away anyway. Echoing srl: is it needed for anything?

comment:4 Changed 8 months ago by mark

Deprecate and remove internet="..."
Remove from tool.
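As a rough illustration of what dropping the attribute from the data could look like, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The function name `strip_internet` and the sample input are hypothetical; this is not the change that was actually committed. Note that ElementTree does not preserve the DOCTYPE or comments when reserializing, so a real edit to supplementalData.xml would need a different approach to keep the file otherwise unchanged.

```python
import xml.etree.ElementTree as ET


def strip_internet(xml_text):
    """Return the XML with the internet attribute removed from territoryCodes.

    Hypothetical helper for illustration. Every other attribute is kept.
    """
    root = ET.fromstring(xml_text)
    for element in root.iter("territoryCodes"):
        # pop() with a default is a no-op when the attribute is absent
        element.attrib.pop("internet", None)
    return ET.tostring(root, encoding="unicode")


# Illustrative input only; the real line carries a far longer TLD list.
sample = (
    '<codeMappings>'
    '<territoryCodes type="ZZ" numeric="999" alpha3="ZZZ" internet="AAA AARP"/>'
    '</codeMappings>'
)
cleaned = strip_internet(sample)
```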

comment:5 Changed 8 months ago by emmons

  • Status changed from new to accepted
  • Priority changed from assess to medium
  • Phase changed from dsub to final
  • Milestone changed from UNSCH to 34
  • Owner changed from anybody to emmons
  • type changed from data to tools

comment:6 Changed 8 months ago by adam.farley@…

Thanks folks. :)

comment:7 Changed 8 months ago by emmons

  • Phase changed from final to rc
  • Status changed from accepted to reviewing
  • Review set to srl

comment:8 Changed 7 months ago by srl

  • Status changed from reviewing to closed
  • Resolution set to fixed

I guess I just meant to remove the ZZ internet line… but this works too, and the same argument applies: ccTLD data isn't germane to CLDR any more than gTLD data is.

