UDHR in Unicode: 400 translations in text form!
eric.muller at efele.net
Mon Jun 29 08:49:22 CDT 2015
On 6/28/2015 12:20 PM, Philippe Verdy wrote:
> Note: The marker icons showing languages in the Leaflet component
> (over the OSM map) are not working (broken links)
Fixed, I believe.
> Also the locations assigned of some international languages is strange:
> Esperanto ... Picard ... Standard French
The locations for those come from http://glottolog.org. Unless they are
obviously wrong, I'd prefer to keep them aligned with that source.
> But in fact I would have placed those international languages
> somewhere in the middle of an ocean, just aligned vertically in a list
> along a meridian (across the Atlantic or Pacific for example)
A few are already in Antarctica. I'll move Esperanto and Interlingua there.
> Some languages do have an ISO 639-3 code. E.g.
> - Tetum, official in Timor-Leste, is currently "coded" as "010"
> (mapped to "und" in ISO 639-3), it should be "tet".
In general, identification of the language of the translations is not
trivial. I have learned not to trust just the names provided with the
translations.
For this one, there is another translation, [tet], which most likely is
tet/Tetun. [010] looks like a fairly different language and it is not
clear to me that it is Tetun. I'd rather have some informed
recommendation before assigning a language to [010]. It does not help
that the source site does not seem accessible right now.
> - Forro (Saotomense) is a Portuguese-based creole in Sao Tome,
> currently "coded" as "007" (mapped to "und"), it should use "cri".
The OHCHR site warns "not to confuse Crioulo Santomense with Santomense
(a variety and dialect of Portuguese in São Tomé and Príncipe)". Again,
I'd prefer some informed recommendation.
> - Kimbundu should also use "kmb" and not "009"
> - Umbundo (Umbundu) should also use "umb" and not "011"
According to the Ethnologue, Kimbundu and Umbundu are each used both as
a language name and as a family name. Given that I don't really trust the
sources of those names, I'd prefer some informed recommendation.