CLDR Ticket #9680 (closed: fixed)

Opened 2 years ago

Last modified 21 months ago

Add three most widely known languages by country in EU per Eurostat

Reported by: federicoleva@…
Owned by: rick
Component: other-supplemental
Data Locale:
Phase: dsub
Review: pedberg
Weeks:
Data Xpath:


In addition to http://unicode.org/cldr/trac/ticket/9114 for minority languages, and going beyond http://unicode.org/cldr/trac/ticket/7102 which was just about Italian, it would be nice to ensure that all European countries have data at least on the most important foreign languages spoken.

The "usual" Special Eurobarometer 386 provides a table (p. 21, d48f) of the three most widely known foreign languages in each EU country, based on respondents' self-reported ability to hold a conversation with a native speaker. There's also a table on the ability to follow the news on radio or TV in English, French, German, Spanish or Russian (a selection similar to Eurydice 2012), but that feels less useful for CLDR.

I've also emailed DG COMM to see if detailed data is available (the relevant person should be Ian Barber, does anyone in CLDR TC have contacts?).


ebs386-d48t.png (78.1 KB) - added by federicoleva@… 2 years ago.
Table from ebs 386

Change History

Changed 2 years ago by federicoleva@…

Table from ebs 386

comment:1 Changed 2 years ago by federicoleva@…

Numbers from ebs 386 which CLDR currently lacks or is very far from (while figures for the others are often identical, suggesting the data is consistent overall):

BE: de 22 %
BG: ru 23 %, de 8 %
CZ: sk 16 %, de 15 %
DK: de 47 %, sv 13 %
EE: ru 56 %, en 50 %, fi 21 %
GR: fr 9 %, de 5 %
FR: es 13 %, de 5 %
CY: fr 7 %, el 5 %
LV: en 46 %
LT: ru 80 %, de 14 %
LU: en 56 %
IE: ga 22 %, fr 17 %
HU: de 18 %, fr 3 %
MT: it 56 %, fr 11 %
AT: fr 11 %, it 9 %
PL: de 19 %, ru 18 %
PT: fr 15 %, es 10 %
RO: fr 17 %, es 10 %
SI: hr 61 %, de 42 %
SK: cz 47 %, de 22 %
FI: sv 44 %, de 18 %
GB: fr 19 %, de 6 %
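
For anyone who wants to compare these figures against the supplemental data, here is a minimal Python sketch that parses lines in the layout used above and prints them as `languagePopulation` elements. It assumes the `type` and `populationPercent` attributes used in CLDR's territory data; the function names and the three sample lines are only illustrative, and this is not the script or commit actually used for this ticket.

```python
import re

# A few lines copied from the list above, in the same "CC: lang NN %" layout.
SAMPLE = """\
BE: de 22 %
EE: ru 56 %, en 50 %, fi 21 %
MT: it 56 %, fr 11 %
"""

# Matches one "lang NN %" entry, e.g. "ru 56 %".
ENTRY = re.compile(r"([a-z]{2,3})\s+(\d+)\s*%")


def parse_figures(text):
    """Return {territory: {language: percent}} from 'CC: lang NN %, ...' lines."""
    figures = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        territory, rest = line.split(":", 1)
        langs = figures.setdefault(territory.strip(), {})
        for lang, pct in ENTRY.findall(rest):
            langs[lang] = int(pct)
    return figures


def to_language_population(figures):
    """Render one <languagePopulation> line per (territory, language) pair."""
    out = []
    for territory, langs in figures.items():
        for lang, pct in langs.items():
            out.append(
                f'{territory}: <languagePopulation type="{lang}" '
                f'populationPercent="{pct}"/>'
            )
    return out


if __name__ == "__main__":
    for line in to_language_population(parse_figures(SAMPLE)):
        print(line)
```

Lines that repeat a territory are merged into one entry, so a country listed twice does not produce duplicate elements. Language codes are passed through unchanged, so any typo in the input would need to be corrected by hand first.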

comment:2 Changed 2 years ago by waldir.pimenta@…

Hopefully this is the right place to comment: regarding Portugal, the CLDR data definitely seems lacking. French and Spanish are reasonably understood (due to historical migration patterns[1] and linguistic/geographic proximity, respectively), and Galician even more so than Spanish, due to it being closer to Portuguese. The 0.1% for Spanish and Galician seems definitely inaccurate even as an educated guess, and the lack of French puzzling at best.

[1] https://en.wikipedia.org/wiki/Portuguese_people#Portuguese_diaspora

comment:3 Changed 2 years ago by mormegil@…

I can confirm CZ–SK (and vice versa, SK–CS) are obviously missing, since Czech and Slovak are basically mutually intelligible. So even the 16%/47% figures are somewhat suspicious, but I guess that depends on what exact threshold for “to be able to have a conversation” you use.

comment:4 Changed 2 years ago by mark

  • Owner changed from anybody to rick
  • Priority changed from assess to medium
  • type changed from unknown to data
  • Status changed from new to accepted
  • Milestone changed from UNSCH to 30

comment:5 Changed 2 years ago by rick

Updated from the table in comment 1, noting a couple of possible transcription errors from the PNG.

comment:6 Changed 2 years ago by rick

  • Status changed from accepted to reviewing
  • Review set to pedberg

comment:7 Changed 2 years ago by federicoleva@…

Thanks, and sorry for the "cz" typo and any others. The commit looks good. I note two lines were not added/updated: de in CZ (it's more than pl) and de in GR.

comment:8 Changed 2 years ago by rick

Thanks Federico. Updated with those two items, which were missed.

comment:9 Changed 2 years ago by rick

CY needs to be fixed, as Greek (el) is nearly universal. Set to 95%.

comment:10 Changed 2 years ago by pedberg

  • Status changed from reviewing to closed
  • Resolution set to fixed

comment:11 Changed 21 months ago by federicoleva@…

> I've also emailed DG COMM to see if detailed data is available (the relevant person should be Ian Barber, does anyone in CLDR TC have contacts?).

It seems that the detailed data has been released at http://languageknowledge.eu/about .

