From: Donald Z. Osborn (dzo@bisharat.net)
Date: Tue Apr 14 2009 - 09:23:42 CDT
Thanks to all for the feedback on this topic. It sounds like the
choice of utf-8 or not is mainly one of policy (or lack of same) and
not of technical constraints?
It is interesting on this point to contrast BBC with VOA,* which has
all of its language pages in utf-8.
On the other hand, while BBC uses the lang attribute in its page markup
to indicate the main language of each page, VOA pages are apparently all
lang="en".
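For anyone who wants to check this sort of thing on a saved page, here is
a minimal sketch in Python (standard library only) that reports the lang
attribute on the html element and whatever charset the page declares in a
meta tag. The class and function names are my own invention, and the
sample markup is made up for illustration, not copied from BBC or VOA. It
only shows what a page declares, not what language the text is really in.

```python
from html.parser import HTMLParser


class LangCharsetParser(HTMLParser):
    """Collect the lang attribute of <html> and any declared charset."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "html" and self.lang is None:
            self.lang = attrs.get("lang")
        elif tag == "meta" and self.charset is None:
            if "charset" in attrs:
                # Short form: <meta charset="utf-8">
                self.charset = attrs["charset"]
            elif (attrs.get("http-equiv") or "").lower() == "content-type":
                # Long form: content="text/html; charset=utf-8"
                content = attrs.get("content") or ""
                for part in content.split(";"):
                    key, _, value = part.strip().partition("=")
                    if key.lower() == "charset":
                        self.charset = value.strip()


def declared_lang_and_charset(html_text):
    """Return (lang, charset) as declared in the markup, or None for each."""
    parser = LangCharsetParser()
    parser.feed(html_text)
    return parser.lang, parser.charset


# Hypothetical sample markup: a page declared as English and utf-8.
sample = (
    '<html lang="en"><head>'
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    "</head><body>...</body></html>"
)
print(declared_lang_and_charset(sample))
```

A page whose text is in Hausa but which gives lang="en" here would be
the kind of mismatch described above.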
Like BBC, VOA ASCIIfies Hausa Boko orthography. It also has no text in
Amharic or Tigrinya (among non-Latin scripts), only audio from an
English language "Horn" page.
Like BBC, it groups the similar languages Kinyarwanda and Kirundi on a
single page (with text in one, the other, both, or something
in between). It would be interesting to know what exactly is the
language of the text content of that page. BBC codes its page "rw"
(for Kinyarwanda), not "rn" (for Kirundi), even though both languages
share it. But as already noted, VOA incorrectly uses lang="en"
everywhere.
* http://www.voa.gov (click on Languages) or
http://www.voanews.com/english/screen_map.cfm