
CLDR Ticket #11381 (closed: wontfix)

Opened 3 months ago

Last modified 5 days ago

Some and/or list issues

Reported by: kent.karlsson14@…
Owned by: anybody
Component: other
Phase: dsub

Description

See https://unicode.org/repos/cldr-aux/charts/34/by_type/miscellaneous.displaying_lists.html (as of 22 August 2018)


{0} a(c) {1} ·cy·
{0} a(n) {1} ·lb·

{0} a(n) {1} ·lb·

Welsh uses "a" before a consonant and "ac" before a vowel. Perhaps "ac" is a suitable default. That seems better than using parentheses.
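To illustrate why a single static pattern cannot express the Welsh rule described above, here is a minimal sketch that picks the conjunction from the first letter of the following item. The function name `welsh_and` and the vowel set are assumptions made for illustration; the sketch ignores accented vowels, mutations, and lexical exceptions, and it is not how CLDR or any list formatter handles Welsh.

```python
# Assumption: a simplified Welsh vowel set (unaccented letters only).
WELSH_VOWELS = set("aeiouwy")


def welsh_and(left: str, right: str) -> str:
    """Join two items with "ac" before a vowel and "a" before a consonant."""
    conjunction = "ac" if right[:1].lower() in WELSH_VOWELS else "a"
    return f"{left} {conjunction} {right}"


if __name__ == "__main__":
    print(welsh_and("bara", "menyn"))
    print(welsh_and("afal", "oren"))
```

A fixed pattern such as `{0} ac {1}` or `{0} a {1}` can only encode one of the two branches, which is the trade-off the parenthesised `a(c)` form tries to paper over.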

For Luxembourgish, only "an" is listed for "and" in (English) Wiktionary.


{0}, a(c) {1} ·cy·
{0}, agus {1} ·ga·
{0}, akked {1} ·kab·
{0}, at {1} ·fil·
{0}, dan {1} ·id·
{0}, ne-{1} ·zu·
{0}, እና {1} ·am·
{0}, ଓ {1} ·or·
{0}, ಮತ್ತು {1} ·kn·
{0}, සහ {1} ·si·
{0}, ᎠᎴ {1} ·chr·

{0}, atau {1} ·id· ·ms·
{0}, ella {1} ·fo·
{0}, neɣ {1} ·kab·
{0}, nó {1} ·ga·
{0}, o {1} ·fil·
{0}, or {1} ·all·others·
{0}, pe {1} ·br·
{0}, pē {1} ·to·
{0}, yaxud {1} ·az·
{0}, же {1} ·ky·
{0}, не болмаса {1} ·kk·
{0}, эсвэл {1} ·mn·
{0}, או {1} ·he·
{0}, يا {1} ·sd·
{0}, किंवा {1} ·mr·
{0}, वा {1} ·ne·
{0}, বা {1} ·bn·
{0}, અથવા {1} ·gu·
{0}, ಅಥವಾ {1} ·kn·
{0}, അല്ലെങ്കിൽ {1} ·ml·
{0}, හෝ {1} ·si·
{0}, ᎠᎴᏱᎩ {1} ·chr·
{0}، یا {1} ·fa· ·ur·
{0}、または{1} ·ja·

{0}, kple {1} ·ee·
{0}, u {1} ·mt·
{0}, සහ {1} ·si·

These seem to have copied the English-only idea of putting a comma before the "and"/"or" that precedes the last item of a list with three or more elements. I do think it is an English-only idea, whatever the merits or demerits of doing so, and it is not universally applied in English either. If you want to argue for the merits of punctuating this way, it should be done to various language committees, not "sneaked in" via CLDR.


{0}, {1} ·rm· ·all·others·

{0}, or {1} ·all·others·

As I have pointed out before, the "all others" in charts is NOT helpful; indeed it is anti-helpful. If it means "inherited from root without confirmation" (which I guess it might mean), then say so. This is a charts issue so far, but it hides errors. I'm sure that "or" is basically English, and for most other locales it is an error to use "or" without translation. If "all others" really means "inherited from root", then note that root should not contain any English. In the case of "and", use "&" in root, and for "or", it is probably best to use "|" in root (using ∧ and ∨ would be possible, but not well recognisable to most readers). While English is very much a "global" language, it is best kept out of root even so.


{0}{1} ·ja· (for "2" and "end"; as well as start&middle)

But Japanese has a bunch of words that mean "and" (including そして, 及び and more). Not sure which is the best to use, but surely at least one of them is suitable.

English Wiktionary has "Japanese: 然して (そして, soshite)" for the "used at the end of a list to indicate the last item" case.


{0}, {1} എന്നിവ ·ml· (for start and middle)

This one seems strange. In use it will pile up a bunch of എന്നിവ after ALL elements of the list, like

a, b, c, d, e, ..... എന്നിവ എന്നിവ എന്നിവ എന്നിവ എന്നിവ...

like "polish notation" for arithmetic. I'm sure that is not intended here.


Change History

comment:1 follow-up: ↓ 7 Changed 8 weeks ago by mark

If you want to argue for the merits of punctuating this way, it should be done to various language committees, not "sneaked in" via CLDR.

These are the results of vetters' review and entry of data. It is not "snuck in" by the CLDR-TC.

Now, it may be worth adding some text to the instructions for doing the list format to alert them to the issue, but that should be in a separate ticket. (And ad hominem remarks are not productive.)

"all others"

This means that, when the data is deployed using the standard inheritance model, this is the resulting value. Now, any language with modern coverage would have an explicit value for this, meaning a native speaker did confirm it. Languages without modern coverage may just get the inherited value.

As for Japanese and Malayalam, I'll forward to translators.

comment:2 follow-up: ↓ 5 Changed 8 weeks ago by mark

For Malayalam, they wouldn't "pile up". See the example text in http://st.unicode.org/cldr-apps/v#/ml/Displaying_Lists/784271c9a4efa235

What you get for [a, b, c, d] is the following:

a, b X, c X, Y d

where X is എന്നിവ
and Y is അല്ലെങ്കിൽ

I'll still forward it, but the result is not the pile-up that you are concerned with (and that I'd be concerned with also!).
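The result quoted above can be reproduced by applying the patterns from left to right, with the text built so far substituted for {0}. The following is only a sketch of that reading: the function name `compose_left` is hypothetical, X and Y are the placeholders used in this comment, and nothing here is taken from the Survey Tool's actual code.

```python
def compose_left(items, start, middle, end):
    """Apply list patterns left to right; assumes at least three items.

    The text accumulated so far is always substituted for {0}.
    """
    result = start.format(items[0], items[1])
    for item in items[2:-1]:
        result = middle.format(result, item)
    return end.format(result, items[-1])


if __name__ == "__main__":
    # X stands for എന്നിവ and Y for അല്ലെങ്കിൽ, as in the comment above.
    print(compose_left(["a", "b", "c", "d"],
                       start="{0}, {1} X",
                       middle="{0}, {1} X",
                       end="{0}, Y {1}"))
```

With these patterns the call yields `a, b X, c X, Y d`: X appears once per start/middle step rather than piling up at the end.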

comment:3 Changed 8 weeks ago by yoshito

{0}{1} ·ja· (for "2" and "end"; as well as start&middle)
But Japanese has a bunch of words that means "and" (including そして, 及び and more). Not sure which is the best to use, but surely at least one of them is suitable.

There are several different ways in Japanese, but I think "{0}{1}" would be the best option for this purpose. Other expressions depend on context. The current expression is fairly neutral and fits well in any context.

English Wiktionary has "Japanese: 然して (そして, soshite)" for the "used at the end of a list to indicate the last item" case.

I've never seen the expression "然して". The hiragana form, そして, might be used, though.

comment:4 Changed 8 weeks ago by yoshito

For Japanese, we may use the following expressions when we need to embed and/or semantics in a list.

AND - XXX と YYY と ZZZ
OR - XXX か YYY か ZZZ

We don't use "XXX, YYY と ZZZ" "XXX, YYY か ZZZ".
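As a small illustration of the form described above, the particle is repeated between every pair of items rather than appearing only before the last one. This sketch uses a hypothetical helper name, `join_with_particle`, and is equivalent to using a single pattern such as `{0} と {1}` at every position; it is not a statement about what the CLDR data for ·ja· should contain.

```python
def join_with_particle(items, particle):
    """Place the same particle between every pair of adjacent items."""
    return f" {particle} ".join(items)


if __name__ == "__main__":
    print(join_with_particle(["XXX", "YYY", "ZZZ"], "と"))  # AND list
    print(join_with_particle(["XXX", "YYY", "ZZZ"], "か"))  # OR list
```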

comment:5 in reply to: ↑ 2 Changed 8 weeks ago by 002-ml_0003@…

Replying to mark:

For Malayalam, they wouldn't "pile up". See the example text in http://st.unicode.org/cldr-apps/v#/ml/Displaying_Lists/784271c9a4efa235

What you get for [a, b, c, d] is the following:

a, b X, c X, Y d

where X is എന്നിവ
and Y is അല്ലെങ്കിൽ

I'll still forward it, but the result is not the pile-up that you are concerned with (and that I'd be concerned with also!).

For Malayalam:

There should be only one എന്നിവ in such a context. We can use only one എന്നിവ and only one അല്ലെങ്കിൽ.

സ്വിറ്റ്സർലാൻഡ്, ജപ്പാൻ എന്നിവ, ഈജിപ്ത് എന്നിവ, അല്ലെങ്കിൽ കാനഡ
should be replaced with:
സ്വിറ്റ്സർലാൻഡ്, ജപ്പാൻ, ഈജിപ്ത് എന്നിവ, അല്ലെങ്കിൽ കാനഡ
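For illustration only, here is one hypothetical set of patterns that produces the string requested above: plain commas for start and middle, with എന്നിവ moved into the end pattern. The pattern values and the helper `compose_left` are assumptions made for this sketch, not vetted CLDR data and not a proposal from the translator.

```python
def compose_left(items, start, middle, end):
    """Apply list patterns left to right; assumes at least three items."""
    result = start.format(items[0], items[1])
    for item in items[2:-1]:
        result = middle.format(result, item)
    return end.format(result, items[-1])


# Hypothetical patterns: എന്നിവ occurs only once, in the end pattern.
START = "{0}, {1}"
MIDDLE = "{0}, {1}"
END = "{0} എന്നിവ, അല്ലെങ്കിൽ {1}"

if __name__ == "__main__":
    countries = ["സ്വിറ്റ്സർലാൻഡ്", "ജപ്പാൻ", "ഈജിപ്ത്", "കാനഡ"]
    print(compose_left(countries, START, MIDDLE, END))
```

Because start and middle carry no trailing word here, the output is the same whichever direction the patterns are composed in.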

comment:6 Changed 7 weeks ago by kent.karlsson14@…

Lists are "normally" interpreted as right associative (as in Lisp and many other programming languages that have "native" lists).

So in this case:

({0}, {1} Z)[a,b,c,d] -> a, ({0}, {1} Z)[b,c,d] Z -> a, b, ({0}, {1} Z)[c,d] Z Z -> a, b, c, d Z Z Z

You seem to have interpreted them as left associative. That is very odd from a "general" programmer's point of view, given the experience from several programming languages that have "native" lists.

Still, the result, even when the lists are (oddly!) interpreted as left associative, is a bit strange. It is probably not the intended result. See comment 5.
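The two readings discussed in this comment can be compared side by side. The sketch below uses hypothetical function names (`compose_right`, `compose_left`) and the single pattern `{0}, {1} Z` from the derivation above; it makes no claim about which reading any particular implementation uses.

```python
def compose_right(items, start, middle, end):
    """Right-associative reading: the text built so far goes into {1}."""
    result = end.format(items[-2], items[-1])
    for item in reversed(items[1:-2]):
        result = middle.format(item, result)
    return start.format(items[0], result)


def compose_left(items, start, middle, end):
    """Left-associative reading: the text built so far goes into {0}."""
    result = start.format(items[0], items[1])
    for item in items[2:-1]:
        result = middle.format(result, item)
    return end.format(result, items[-1])


if __name__ == "__main__":
    pattern = "{0}, {1} Z"  # same pattern for start, middle and end
    items = ["a", "b", "c", "d"]  # both functions assume three or more items
    print(compose_right(items, pattern, pattern, pattern))
    print(compose_left(items, pattern, pattern, pattern))
```

The right-associative call gives `a, b, c, d Z Z Z`, matching the derivation above, while the left-associative call gives `a, b Z, c Z, d Z`.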

comment:7 in reply to: ↑ 1 Changed 7 weeks ago by kent.karlsson14@…

Replying to mark:

These are the results of vetters' review and entry of data. It is not "snuck in" by the CLDR-TC.

(I did not write "-TC".)

So, did you consult the appropriate language committees or similar bodies? Or even just prod vetters specially? If not, my statement still stands, since this is a point where it is extra easy to make a voting mistake.

comment:8 Changed 7 weeks ago by kent.karlsson14@…

A priori, with just commas, the list is ambiguous. One needs to say that this is an "and" list somehow.

(Perhaps an aside, but relevant: It is not uncommon for "or" lists to be written without a word indicating "or"; instead there is a preamble, e.g. "select at least one and at most 4 items from the list below". These are often also written without commas but with "extras" such as check boxes, though they could also be a simple comma list. An "and" list can likewise be written with just commas (or bullets...), but with a preamble, e.g. "all of the following are needed:".)

comment:9 Changed 8 days ago by mark

  • Milestone changed from UNSCH to to-assess

comment:10 Changed 5 days ago by mark

  • Status changed from new to closed
  • Resolution set to wontfix
  • Component changed from unknown to other

This is an issue for the translators to consider in the survey tool. Note that people can have a "preamble" to the list or a "post-amble".
