CLDR Ticket #9680 (closed data: fixed)

Opened 10 months ago

Last modified 4 months ago

Add three most widely known languages by country in EU per Eurostat

Reported by: federicoleva@…
Owned by: rick
Component: supplemental
Data Locale:
Phase: dsub
Review: pedberg
Weeks:
Data Xpath:
Xref:

Description

In addition to http://unicode.org/cldr/trac/ticket/9114 for minority languages, and going beyond http://unicode.org/cldr/trac/ticket/7102 which was just about Italian, it would be nice to ensure that all European countries have data at least on the most important foreign languages spoken.

The "usual" Special Eurobarometer 386 provides a table (p. 21, d48f) of the three most widely known foreign languages in each EU country, based on interviewees' self-reported ability to hold a conversation with a native speaker. There is also a table on the ability to follow the news on radio or TV in English, French, German, Spanish or Russian (a selection similar to Eurydice 2012), but that seems less useful for CLDR.

I've also emailed DG COMM to ask whether detailed data is available (the relevant person should be Ian Barber; does anyone in the CLDR TC have contacts?).

Attachments

ebs386-d48t.png (78.1 KB) - added by federicoleva@… 10 months ago.
Table from ebs 386

Change History

Changed 10 months ago by federicoleva@…

Table from ebs 386

comment:1 Changed 10 months ago by federicoleva@…

Figures from EBS 386 that CLDR currently lacks or differs from substantially (the figures for the other languages are often identical, suggesting the data is consistent overall):

BE: de 22 %
BG: ru 23 %, de 8 %
CZ: sk 16 %, de 15 %
DK: de 47 %, sv 13 %
EE: ru 56 %, en 50 %, fi 21 %
GR: fr 9 %, de 5 %
FR: es 13 %, de 5 %
CY: fr 7 %, el 5 %
LV: en 46 %
LT: ru 80 %, de 14 %
LU: en 56 %
IE: ga 22 %, fr 17 %
LT: de 14 %
HU: de 18 %, fr 3 %
MT: it 56 %, fr 11 %
AT: fr 11 %, it 9 %
PL: de 19 %, ru 18 %
PT: fr 15 %, es 10 %
RO: fr 17 %, es 10 %
SI: hr 61 %, de 42 %
SK: cz 47 %, de 22 %
FI: sv 44 %, de 18 %
GB: fr 19 %, de 6 %
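
For anyone transcribing these rows, here is a minimal sketch of how the "CC: xx NN %" lines could be parsed and rendered in the shape of CLDR's languagePopulation elements (type and populationPercent attributes). The function names parse_rows and to_language_population are hypothetical, only three sample rows are included, and the output is illustrative; it is not the commit made for this ticket.

```python
import re

# Three sample rows copied from the table above; the others have the same shape.
ROWS = """\
BE: de 22 %
EE: ru 56 %, en 50 %, fi 21 %
MT: it 56 %, fr 11 %
"""

# One "language percent" entry, e.g. "ru 56 %".
ENTRY = re.compile(r"([a-z]{2,3}) (\d+) %")


def parse_rows(text):
    """Return {territory: {language: percent}} from 'CC: xx NN %, ...' lines."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        territory, _, rest = line.partition(":")
        # setdefault merges rows when a territory code appears more than once.
        langs = table.setdefault(territory.strip(), {})
        for lang, pct in ENTRY.findall(rest):
            langs[lang] = int(pct)
    return table


def to_language_population(table):
    """Render each figure as a languagePopulation element (illustrative only)."""
    lines = []
    for territory, langs in table.items():
        for lang, pct in langs.items():
            lines.append(
                f'{territory}: <languagePopulation type="{lang}" '
                f'populationPercent="{pct}"/>'
            )
    return lines


if __name__ == "__main__":
    for line in to_language_population(parse_rows(ROWS)):
        print(line)
```

Because rows are merged per territory, a code listed twice (as LT is above) collapses into a single entry rather than producing duplicate elements.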

comment:2 Changed 10 months ago by waldir.pimenta@…

Hopefully this is the right place to comment. Regarding Portugal, the CLDR data definitely seems lacking. French and Spanish are reasonably well understood (due to historical migration patterns[1] and linguistic/geographic proximity, respectively), and Galician even more so than Spanish, because it is closer to Portuguese. The 0.1% for Spanish and Galician seems inaccurate even as an educated guess, and the lack of French is puzzling at best.

[1] https://en.wikipedia.org/wiki/Portuguese_people#Portuguese_diaspora

comment:3 Changed 10 months ago by mormegil@…

I can confirm CZ–SK (and vice versa, SK–CS) are obviously missing, since Czech and Slovak are basically mutually intelligible. So even the 16%/47% figures are somewhat suspicious, but I guess that depends on the exact threshold you use for being “able to have a conversation”.

comment:4 Changed 10 months ago by mark

  • Owner changed from anybody to rick
  • Priority changed from assess to medium
  • Type changed from unknown to data
  • Status changed from new to accepted
  • Milestone changed from UNSCH to 30

comment:5 Changed 9 months ago by rick

Updated from the table in comment 1, noting a couple of possible transcription errors from the PNG.

comment:6 Changed 9 months ago by rick

  • Status changed from accepted to reviewing
  • Review set to pedberg

comment:7 Changed 9 months ago by federicoleva@…

Thanks, and sorry for the "cz" typo and any others. The commit looks good. I note that two lines were not added/updated: de in CZ (it's more than pl) and de in GR.

comment:8 Changed 9 months ago by rick

Thanks Federico. Updated with those two items, which were missed.

comment:9 Changed 9 months ago by rick

CY needs to be fixed, as Greek (el) is nearly universal. Set to 95%.

comment:10 Changed 9 months ago by pedberg

  • Status changed from reviewing to closed
  • Resolution set to fixed

comment:11 Changed 4 months ago by federicoleva@…

> I've also emailed DG COMM to see if detailed data is available (the relevant person should be Ian Barber, does anyone in CLDR TC have contacts?).

It seems that detailed data has been released at http://languageknowledge.eu/about .
