Re: VOA- utf-8, lang="en" (Re: BBC.co.uk languages ...)

From: Mark Davis (mark.edward.davis@gmail.com)
Date: Tue Apr 14 2009 - 12:14:17 CDT

    FYI, at Google we essentially ignore the language declared in a web page,
    because it is too often missing or wrong to be useful.

    Mark

    On Tue, Apr 14, 2009 at 07:23, Donald Z. Osborn <dzo@bisharat.net> wrote:

    > Thanks to all for the feedback on this topic. It sounds like the choice of
    > utf-8 or not is mainly a matter of policy (or the lack of one) rather than
    > of technical constraints?
    >
    > It is interesting on this point to contrast the BBC with VOA,* which has
    > all of its language pages in utf-8.
    >
    > On the other hand, while the BBC uses the lang= attribute in its page
    > markup to indicate the main language of each page, VOA pages are
    > apparently all lang="en".
    >
    > Like the BBC, VOA ASCIIfies the Hausa Boko orthography. It also has no text
    > in Amharic or Tigrinya (among non-Latin scripts), only audio from an
    > English-language "Horn" page.
    >
    > Like the BBC, it groups the similar languages Kinyarwanda and Kirundi on a
    > single page (with text in one, the other, both, or something in between). It
    > would be interesting to know exactly what language the text content of that
    > page is in. The BBC codes its page "rw" (for Kinyarwanda), not "rn" (for
    > Kirundi), even though both languages share it. But as already noted, VOA
    > incorrectly uses lang="en" everywhere.
    >
    >
    > * http://www.voa.gov (click on Languages) or
    > http://www.voanews.com/english/screen_map.cfm
    >


