Unicode Common Locale Data Repository: Bug Tracking

CLDR Ticket #7995 (accepted charts)

Opened 3 years ago

Last modified 20 months ago

Navigation problems

Reported by: mark
Owned by: mark
Component: charts
Data Locale:
Phase: final
Review:
Weeks:
Data Xpath:
Xref:

Description

Forwarded problems.

I spent so long trying to find and understand the information today that I don't have much time to make this email less blunt and direct, so please don't be offended. I figured I had better report the issues rather than do nothing.

Absolutely.

There is much we can do to improve the navigation, so having particular walkthroughs like this is especially helpful in focusing our efforts.

I wanted to find a list of exemplar characters for Tibetan.

--- why doesn't the Navigation section on the left of http://cldr.unicode.org/ point me directly to the latest version of the data? When should I use "CLDR Releases/Downloads" to find the data versus the fourth item in the list "CLDR Charts"? Both seem to show versions.

It sounds like you are interested in the charts, and not the XML.

There are a variety of ways to get to the latest data (v26) charts:
  • at the top of the page cldr.org
  • via "CLDR Releases/Downloads"
  • through the Charts page
  • etc.

Are you saying that we should have fewer ways, or that our jargon makes it hard to see what to click on?

BTW, I tried a Google site search to get there:
https://www.google.ch/webhp?q=site:http:%2F%2Fwww.unicode.org%2Fcldr+tibetan+characters

That worked better than going to http://www.unicode.org/search/ because the generated search on that page had a bunch of cruft.

https://www.google.com/search?q=cldr+tibetan+characters&domains=unicode.org&sitesearch=unicode.org

We might want to have a button on http://www.unicode.org/search/ for doing a search of CLDR alone.

--- at http://www.unicode.org/cldr/charts/26/summary/bo.html I see "Characters in Use". It took a while for me to check and be confident that this is the same as exemplarCharacters (at least I think it is).

Yes, it is.

That is the "user friendly" name that we show people in the survey tool. Sounds like you just found it confusing.

If I try to check by following the "Other charts and help" link, I see a link at the bottom saying "Charts" (why not "Help"?), and the paragraph that precedes it says that this is where I'll find "information about the format and meaning of the charts".

Nope. Nothing about 'characters in use' or 'exemplarCharacters' on that page.

Yes, that is pretty bare-bones.

--- the format of the data for Tibetan for the main letters is
[\u0F7E ཿ ཀ {ཀ\u0FB5} \u0F90 {\u0F90\u0FB5} ཁ \u0F91 ག {ག\u0FB7} \u0F92 {\u0F92\u0FB7} ང \u0F94 ཅ \u0F95 ཆ \u0F96 ཇ \u0F97 ཉ \u0F99 ཊ \u0F9A ཋ \u0F9B...

What do the {..} parts mean?

Those are used to indicate that a sequence should be treated as a single string. That is, [abc{de}] is the set containing the strings:

"a", "b", "c", "de"


Let's check for some documentation. Er, no, tried that already :(

Those are all documented in LDML. If you go there and search for exemplar, you get to a link near the end that takes you to:

http://www.unicode.org/reports/tr35/tr35-general.html#ExemplarSyntax

NOTE: I'm answering your questions here, but it doesn't mean that there isn't a problem with the navigation!

Why is the whole list surrounded by [...] ?

part of the syntax
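
To make the set syntax discussed here concrete ([...] around the whole set, {...} around a multi-character string), the following is a minimal sketch of a splitter for a simplified subset of that syntax. The function name parse_exemplar_set is hypothetical, and the sketch deliberately ignores ranges, escapes, and the other features described in the LDML spec, so it should not be read as how CLDR or ICU actually parse these sets.

```python
def parse_exemplar_set(text):
    """Split a simplified exemplar set such as "[abc{de}]" into member strings.

    Assumption: only single characters, whitespace separators, and {...}
    sequences are handled. Ranges (a-z) and escapes are NOT supported.
    """
    text = text.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected the set to be wrapped in [...]")
    body = text[1:-1]
    items = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch.isspace():
            # Whitespace only separates items; it is not a member.
            i += 1
        elif ch == "{":
            # Everything up to the matching "}" is one multi-character string.
            end = body.find("}", i + 1)
            if end == -1:
                raise ValueError("unclosed '{'")
            items.append(body[i + 1:end])
            i = end + 1
        else:
            items.append(ch)
            i += 1
    return items


print(parse_exemplar_set("[abc{de}]"))  # ['a', 'b', 'c', 'de']
```

Stripping the outer brackets this way is also roughly what a reader has to do by hand when copying the set out of a chart.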

Why are some characters listed using escapes and others not (and it's not just the combining characters)?

Formally, of course, those are equivalent. We use hex for combining characters, invisible characters, and certain others. I don't recall right off hand. This should be documented, though.
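
As an illustration of the "formally equivalent" point, here is a small sketch that expands four-digit \uXXXX escapes into the characters they stand for, so that an escaped and an unescaped listing compare equal. The helper name unescape is hypothetical, and only the four-hex-digit form that appears in the chart above is handled; any other escape forms are left untouched.

```python
import re

# Matches a backslash, "u", and exactly four hex digits, e.g. \u0F7E.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def unescape(text):
    """Replace each \\uXXXX escape in text with the code point it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# The escaped form and the literal character are the same string once expanded.
print(unescape(r"\u0F7E") == "\u0F7E")  # True
```

Whether a given character is shown escaped or literally in the charts is a display choice and does not change the contents of the set.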

Why doesn't each row link to an explanation of what that row contains, and what the format is?

Simply a matter of programmer time. We actually wanted to get rid of those charts, but enough people said they found them helpful as is that we kept them.


http://www.unicode.org/cldr/charts/26/by_type/index.html

--- Where can I find the XML? That's what I originally wanted anyway, and I still haven't found it.

One simple step we could take is to add a link to the XML right on each page. Access to the XML files is described on http://cldr.unicode.org/index/downloads#latest_draft_version

So, for example, http://www.unicode.org/repos/cldr/tags/release-26/common/main/bo.xml
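
For readers who want the exemplar sets straight from a locale file like the one above, a rough sketch using Python's standard xml.etree.ElementTree follows. The SAMPLE string is a hand-written stand-in, not the real contents of bo.xml; the function name exemplar_sets and the "main" key used for the element without a type attribute are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a locale file; NOT the actual data in bo.xml.
SAMPLE = """<ldml>
  <characters>
    <exemplarCharacters>[a b c {de}]</exemplarCharacters>
    <exemplarCharacters type="auxiliary">[f g]</exemplarCharacters>
  </characters>
</ldml>"""


def exemplar_sets(xml_text):
    """Return a dict mapping exemplar type to its raw set string."""
    root = ET.fromstring(xml_text)
    result = {}
    for el in root.findall("./characters/exemplarCharacters"):
        # The element without a type attribute is keyed as "main" here
        # (a naming choice for this sketch).
        result[el.get("type", "main")] = (el.text or "").strip()
    return result


for kind, value in exemplar_sets(SAMPLE).items():
    print(kind, value)
```

The values returned are the raw set strings, still wrapped in [...] and still containing any escapes, exactly as they appear in the XML.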

Eventually, after a good deal of clicking around, I think I found the answer to one of my questions at http://www.unicode.org/reports/tr35/tr35-general.html#ExemplarSyntax, but it was a painful experience, and a huge waste of time to get there.

Can't we improve this?

Definitely. I'll file this as a ticket (http://unicode.org/cldr/trac/newticket) to capture these problems you had.

Change History

comment:1 Changed 3 years ago by mark

It sounds like you are interested in the charts, and not the XML.

Actually, I wanted to see both.

There are a variety of ways to get to the latest data (v26) charts:
  • at the top of the page cldr.org
  • via "CLDR Releases/Downloads"
  • through the Charts page
  • etc.

Are you saying that we should have fewer ways, or that our jargon makes
it hard to see what to click on?

The latter. When trying to decide what to click on, it was hard to know
where I'd end up exactly, and what the difference was going to be.

--- at http://www.unicode.org/cldr/charts/26/summary/bo.html I see
"Characters in Use" - it took a while for me to check and be
confident that this is the same as exemplarCharacters (at least I
think it is)

Yes, it is.

That is the "user friendly" name that we show people in the survey tool.
Sounds like you just found it confusing.

The lack of correspondence is what confused me. Just putting
exemplarCharacters in parens would have saved me a lot of time checking.

By the way, there's an issue mentioned in another email I just sent about
the use of the word 'character'. Some of these items are graphemes, not
characters. The distinction is important.

Those are all documented in LDML. If you go there and search for
exemplar, you get to a link near the end that takes you to:

http://www.unicode.org/reports/tr35/tr35-general.html#ExemplarSyntax

NOTE: I'm answering your questions here, but it doesn't mean that there
isn't a problem with the navigation!

Sure. In fact, that's what I eventually did, but it took a while for me to
find the LDML spec, and then search it and check that I was looking at the
right place, and had all the information I needed in that place.

It would be much better for the i in the circle to link to exactly the right
place, or have multiple i's in circles if you need to look in more than one
place.

Why is the whole list surrounded by [...] ?

part of the syntax

But syntax for what? Why is it in the chart? What benefit does it provide
and when? For me, especially with the RTL data, it was a pain, because I
had to keep stripping them out. I wouldn't mind so much if I understood the
value of having them there in the charts. Note that they are not included
in
http://www.unicode.org/cldr/charts/26/by_type/core_data.alphabetic_information.main.html

Why are some characters listed using escapes and others not (and
it's not just the combining characters)?

Formally, of course, those are equivalent. We use hex for combining
characters, invisible characters, and certain others. I don't recall
right off hand. This should be documented, though.

But actually some non-combining characters were listed that way too.

Why doesn't each row link to an explanation of what that row
contains, and what the format is?

Simply a matter of programmer time. We actually wanted to get rid of
those charts, but enough people said they found them helpful as is that
we kept them.

I think that the charts are very useful for people who may want to
contribute to the CLDR effort, not to mention those who want to understand
the CLDR work better so that they can spread the word that people should be
working on it.

I think that communicating the data effectively is as important as
collecting and warehousing it, and should be an equal part of the work,
especially now we are over the initial bash of collecting.

One simple step we could take is to add a link to the XML right on each
page. Access to the XML files is described on
http://cldr.unicode.org/index/downloads#latest_draft_version

So, for example,
http://www.unicode.org/repos/cldr/tags/release-26/common/main/bo.xml

comment:2 Changed 3 years ago by emmons

  • Owner changed from anybody to mark
  • Priority changed from assess to medium
  • Status changed from new to assigned
  • Component changed from unknown to design
  • Milestone changed from UNSCH to 27

comment:3 Changed 3 years ago by emmons

  • Status changed from assigned to design

comment:4 Changed 2 years ago by mark

  • Milestone changed from 27 to 28

comment:5 Changed 2 years ago by mark

  • Component changed from design to docs

comment:6 Changed 2 years ago by mark

  • Phase changed from dsub to rc

comment:7 Changed 2 years ago by markus

  • Type set to docs
  • Component changed from docs to unknown

comment:8 Changed 22 months ago by emmons

  • Component changed from unknown to infrastructure

comment:9 Changed 22 months ago by mark

  • Phase changed from rc to final
  • Status changed from design to accepted
  • Component changed from infrastructure to charts

comment:10 Changed 22 months ago by mark

  • Keywords working added

comment:11 Changed 21 months ago by mark

  • Milestone changed from 28 to 29

comment:12 Changed 21 months ago by mark

  • Type changed from docs to charts
  • Milestone changed from 29 to 28

comment:13 Changed 21 months ago by mark

Some of this has been improved with other tickets.

comment:14 Changed 21 months ago by mark

Also, add to top of each chart (where possible) a pointer to the latest chart.

comment:15 Changed 21 months ago by mark

  • Milestone changed from 28 to 29

comment:16 Changed 20 months ago by emmons

  • Milestone changed from 29 to upcoming