CLDR Ticket #8257 (accepted charts)
Improved "findability" of CLDR
Reported by: mark | Owned by: mark
From a thread started by Richard:
Last time I did this, Mark said it was useful. So I'm adding insult to injury by taking some extra time out of my weekend to document how I managed to waste a fair bit of time trying to find information in the CLDR maze.
I'd like to see a list of case conversion tailorings, just to check whether there are one or two I don't yet know about.
The Unicode Standard tells me to look in CLDR for case conversion tailoring information, so I head to cldr.unicode.org.
Sure enough, under "Language & Script Information" I see 'capitalization', and so I click on the "Language & Script Information" link.
I'm taken to http://cldr.unicode.org/cldr-features#TOC-Language-and-script-information. This seems to be the same information, just in list form - ah, but there are links to subitems...
Oh, no link for capitalization :( Stumped.
I notice a link to CLDR Charts in the nav column and decide to try to wrestle with the charts. I click on the link.
I arrive at http://cldr.unicode.org/index/charts. The 'By-type' heading looks promising, but there are no links there :(
I click on the link to http://www.unicode.org/cldr/charts/26/
Here again is By-Type. I click on it, and am taken to http://www.unicode.org/cldr/charts/26/by_type/index.html
There's a long list of links. I look through it. Right at the bottom I see Transforms. I haven't seen anything else that looks like it would lead to case information, so I click on it.
There's no 'On this page' set of links, so I start scrolling... I get to the end, but no joy.
I try the link at the top to "Linguistic Elements". I scroll through that page. No joy.
One last try? I click on Alphabetic Information. This is a long page, but after scrolling to the bottom, while looking out for headings, I still draw a blank.
I'll go back to the home page and try all over again. Hmm, I get sent back to the Charts page from the link at the top of the page, and it takes some abortive clicks to realise that I have to click the top right text on the page again to get to the home page. ... cldr.unicode.org *is* the home page and the best place to start from, right?...
Now I'm really stumped. (I actually tried following the path again, but got no better results.) I gave up. (Muttering and growling to myself about what a waste of time it is trying to find things in CLDR, and suspecting that I've been led on a wild goose chase from the start by the standard, and that there really isn't anything about case conversion in CLDR after all. At least, if there is, I don't really have time to look for it any more, especially if there's a risk of hitting more blank walls.)
I think you might be right. Whenever I'm stumped on where to look in CLDR, I go to TR35 and search for what the markup is. That can also be an adventure... and I didn't find case folding there this morning. Ditto my next strategy, which is to search likely keywords in the charts. Googling the topic turned up nothing useful. I suspect what you want is still SpecialCasing.txt?? For sure if it is in CLDR, I can't find it.
Richard, the data is in transforms, as you surmised. But we don't show all the transforms in the charts. The actual data shows them:
Part of the problem is that we don't have a set of charts for all of the data in CLDR. But even if we did, there is the discovery issue.
We have bugs filed for creating more charts, but I'm wondering in the meantime if we can make some incremental improvements. What I was thinking about is:
- fleshing out http://www.unicode.org/cldr/charts/latest/ to provide more information on where to find what.
- for each topic area (casing, etc.), provide not only a pointer to charts for it, but also (especially if there is no chart) to the XML data.
- add a section on the cldr.org page to highlight that as a place to go to for finding what's in CLDR.
What do you think?
- Status changed from new to assigned
- Component changed from unknown to charts
- Priority changed from assess to medium
- Phase changed from dsub to final
- Milestone changed from UNSCH to 28
- Owner changed from anybody to mark