I know, I know: the "top 100" languages list is utter nonsense and surely
does not fit the public relations needs of The Unicode Consortium.
However, as some people took the time to send me corrections and advice, I
tried to integrate them into the list, just for our amusement.
* John Cowan > "Azerbaijan has switched to Latin."
[I moved it]
* Joerg Knappen > Sunda uses Latin; Oromo uses Ethiopic.
[I moved them]
* Roozbeh Pournader > "Sindhi is written in Arabic script."
[I moved it]
* Thomas Chan > "... Other than Mandarin Chinese and Yue Chinese, the
other "Chinese" ones don't really have developed writing traditions, so the
question is sort of academic..."
[See next]
* John Cowan and I > similar concern for Italian dialects.
[I collapsed most "dialects" under the entry of the "national language"
spoken in the area, assuming that speakers of these languages would use the
"national language" in writing (especially on computers)]
* Kent Karlsson > "... That does not even cover all of the official
languages of the EU! So that "top 100" statement would be highly
UNimpressive..."
[EU languages are not more important than others; moreover many other
languages are missing. Some of these languages (e.g. Hebrew) are relevant
for Unicode because they use a special script, or are tricky, or are "often
used" on computers, so I sort of added them without estimates]
* Janko Stamenovic > split Serbo-Croatian into Serbian (rough estimate:
8 to 10 million) and Croatian.
[The divorce is done: 10 million to Serbian and the rest to Croatian]
Here are the revised statement and the new list (ordered by writing systems;
the numbers show an estimate of the people speaking each language, in
millions).
"Unicode supports the top 100 languages. Unicode also supports all the
official languages used in the EU and many other languages, some of which
require unique writing systems."
*** Latinate alphabet
332 SPANISH
322 ENGLISH
170 PORTUGUESE
98 GERMAN
76 JAVANESE
72 FRENCH
68 VIETNAMESE
59 TURKISH
46 ITALIAN
44 POLISH
31 AZERBAIJANI
27 SUNDA
26 ROMANIAN
24 HAUSA
20 DUTCH
20 YORUBA
18 MALAY (also written in Arabic)
17 INDONESIAN
17 IGBO
17 TAGALOG
15 HUNGARIAN
12 CZECH
11 CROATIAN
9 MALAGASY
9 RWANDA
9 SOMALI
9 ZULU
9 SWEDISH
8 NIGERIAN FULFULDE
7 HAITIAN CREOLE FRENCH
(all other official languages in the EU)
*** Greek alphabet
12 GREEK
*** Cyrillic alphabet
170 RUSSIAN
41 UKRAINIAN
18 NORTHERN UZBEK
10 BELARUSAN
10 SERBIAN (also written in Latinate)
9 BULGARIAN
8 TATAR
8 KAZAKH
7 UYGHUR
*** Armenian alphabet
(ARMENIAN)
*** Hebrew alphabet
(HEBREW)
(YIDDISH)
*** Arabic alphabet
175 ARABIC (all dialects)
58 URDU
31 FARSI
30 WESTERN PANJABI
20 SINDHI
18 PASHTO
*** Thaana alphabet
(MALDIVIAN)
*** Devanagari alphabet
182 HINDI
65 MARATHI
16 NEPALI
*** Bengali alphabet
189 BENGALI
14 ASSAMESE
*** Gujarati alphabet
44 GUJARATI
*** Gurmukhi alphabet
26 EASTERN PANJABI
*** Oriya alphabet
31 ORIYA
*** Tamil alphabet
63 TAMIL
*** Telugu alphabet
66 TELUGU
*** Kannada alphabet
34 KANNADA
*** Malayalam alphabet
34 MALAYALAM
*** Sinhala alphabet
13 SINHALA
*** Thai alphabet
35 THAI
*** Lao alphabet
(LAO)
*** Myanmar alphabet
22 BURMESE
*** Georgian alphabet
(GEORGIAN)
*** Hangul script
75 KOREAN (also uses CJK ideographs, a.k.a. hanja)
*** Ethiopic script
17 AMHARIC
9 OROMO
*** Cherokee script
(CHEROKEE)
*** Canadian syllabic script
(INUIT)
*** Khmer alphabet
7 KHMER
*** Mongolian alphabet
(MONGOLIAN)
*** Braille patterns
(many languages worldwide)
*** Kana script
125 JAPANESE (also uses CJK ideographs, a.k.a. kanji)
*** CJK ideographs (a.k.a. hanzi, kanji, hanja)
885 MANDARIN CHINESE
66 YUE CHINESE
282 (other Chinese dialects)
*** Yi script
(YI)
*** Unknown (unwritten?)
25 BHOJPURI
24 MAITHILI
21 AWADHI
15 SARAIKI
15 CEBUANO
14 CHITTAGONIAN
14 MADURA
13 HARYANVI
12 MARWARI
12 MAGAHI
11 CHHATTISGARHI
10 DECCAN
8 ILOCANO
7 SHONA
7 KURMANJI
7 HILIGAYNON
7 AKAN
THE END
Ciao.
Marco