Reality check - non-Unicode in Guinea-GTZ documents 2005

From: Don Osborn (dzo@bisharat.net)
Date: Wed Aug 16 2006 - 07:29:46 CDT

    Looking at these three PDF documents produced by two Guinean agencies and GTZ in 2005:
    http://www.srp-guinee.org/download/glossaire-pes-maninka-3c-08-02.05.pdf
    http://www.srp-guinee.org/download/glossaire-pes-pular-3c-08-02-05.pdf
    http://www.srp-guinee.org/download/glossaire-pes-soso-3c-08-02-05.pdf
    (backlink is http://www.srp-guinee.org/bibliotheque.htm )
    ... one notes that they were not produced with a Unicode font; evidently more than one font was used, each with a nonstandard encoding for extended characters. The result is irregular in appearance, but more importantly the text is stored in an 8-bit encoding, so the codes in the file do not identify the extended characters that appear on the page. This of course encumbers any effort to search the documents for text containing extended characters or to copy text from them into other documents.
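
    To make the search and copy problem concrete, here is a minimal Python sketch. The code table in it is invented for illustration and is not the actual layout of the fonts used in these PDFs, which I have not examined; the letters chosen are simply examples of extended Latin characters. The names LEGACY_TO_UNICODE and legacy_to_unicode are likewise hypothetical. It assumes a font that draws extended letters in slots that normally belong to other Latin-1 characters.

```python
# Hypothetical 8-bit "font hack" table (an assumption, not taken from the
# fonts in the Guinea documents). Key: the character actually stored in the
# file. Value: the Unicode character the custom font displays in that slot.
LEGACY_TO_UNICODE = {
    "\u00f1": "\u0272",  # stored n-tilde, displayed as U+0272 (n with left hook)
    "\u00eb": "\u025b",  # stored e-diaeresis, displayed as U+025B (open e)
    "\u00f6": "\u0254",  # stored o-diaeresis, displayed as U+0254 (open o)
    "\u00fe": "\u0253",  # stored thorn, displayed as U+0253 (b with hook)
    "\u00f0": "\u0257",  # stored eth, displayed as U+0257 (d with hook)
}

_TABLE = str.maketrans(LEGACY_TO_UNICODE)


def legacy_to_unicode(text: str) -> str:
    """Replace each legacy slot character with the Unicode character it stands for."""
    return text.translate(_TABLE)


# What is stored in the file versus what the reader sees on the page.
stored = "\u00f1\u00eb"       # the codes in the document
displayed = "\u0272\u025b"    # what the custom font shows

# Searching for the visible text fails, because the file holds other codes.
found_before = displayed in stored

# After conversion with the (assumed) table, the same search succeeds.
found_after = displayed in legacy_to_unicode(stored)
```

    With a Unicode font from the start, no such table is needed: the stored codes and the displayed characters already agree, so search and copy work without any conversion step.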

    The sad thing is that it could easily have been done "right" from the start with a Unicode font. There are no complicating issues of diacritics for tone marking involved, so it's very straightforward. That an international collaboration at this time should be resorting to this kind of workaround in a project of some importance speaks to how far we have yet to go in disseminating basic information about how Unicode works, why to use Unicode fonts, ... and perhaps the existence of Unicode itself.

    Although it is encouraging to see such attention to African languages in a development project, it's a bit discouraging that knowledge of Unicode did not get disseminated among partners in such an information-for-development effort funded by an international organization like GTZ - which has access to technical information in general and perhaps uses Unicode in other contexts (departments?) already anyway. Indeed, such international development organizations could be a little proactive in the matter.

    Realistically, though, this is another case of a "divide" - in addition to the oft-cited linguist-technician divide, it is clear that development experts also frequently are not up on what internationalization has achieved with regard to basic multilingual computing. So the development agency and local partners contract with local linguistic experts in various languages, and no one among them realizes that the process and end product could be so much improved through use of Unicode fonts that are readily available - and possibly already installed on the computers they are working on!

    At this point I am aware of discussion of a couple of possible workshops in West Africa on African language computing, which would among other things explain more about Unicode. It seems however that we still have a lot of work to do with the international and donor organizations as well - I'm sure GTZ is not the only one. (Again, they and the Guinean agencies involved do deserve credit for at least making the effort. Although it would be aggravating in a way, I wouldn't mind at this point coming across similar problems from projects funded by USAID, DfID, the French, the Chinese, UNDP, WB, etc. - the Canadian IDRC is already a step or two ahead on this, from having engaged various ICT4D and localization issues more or less directly for several years.)

    Don Osborn
    Bisharat.net
    PanAfrican Localisation project


