Re: issues storing ZWSP in docs, files and databases

From: Doug Ewell (dewell@roadrunner.com)
Date: Sat Aug 25 2007 - 18:55:58 CDT

    Ngwe Tun wrote:

    > We have to use ZWSP for word breaking in our language, so we need
    > to use ZWSP for line-breaking purposes too. Every Burmese word might
    > be followed by ZWSP, added either automatically or by an operator.
    >
    > Please let me have one last clarification. Do we need to store ZWSP in
    > documents, files and databases for the purpose of word
    > segmentation/breaking? Or is it possible to add it automatically in
    > some other way?

    Burmese text will either have ZWSP between words, which means electronic
    processes can automatically determine word boundaries, or it will not,
    which means they cannot. Unicode does not tell you that you must use
    ZWSP in Burmese text, only that "if word boundary indications are
    desired" then ZWSP is the right character for the job.
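
    To illustrate the first case, here is a minimal sketch of how a
    process might recover word boundaries once ZWSP (U+200B) is stored in
    the text. The function name split_words and the sample string are my
    own illustrations, not anything defined by Unicode.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE

def split_words(text):
    """Return the words of `text`, treating each ZWSP as a word boundary."""
    # Drop empty pieces caused by leading, trailing, or doubled ZWSP.
    return [word for word in text.split(ZWSP) if word]

# Illustrative sample: two Burmese words stored with a ZWSP between them.
sample = "မြန်မာ" + ZWSP + "စာ"
words = split_words(sample)
```

    Without the stored ZWSP, the same call would return the whole string
    as a single "word", which is the second case described above.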

    A program could probably be written to add ZWSP to existing Burmese
    text. Such a program would almost certainly be dictionary-based and
    would need to allow a human to review the text and fix any possible
    errors or ambiguities.
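
    As a rough sketch of what such a program might look like, the code
    below uses greedy longest-match lookup against a word list. That is
    one simple dictionary-based technique, not necessarily the best one.
    The name insert_zwsp, the tiny lexicon, and the choice to report
    unmatched spans for human review are all assumptions of mine. It also
    assumes the input contains no ZWSP yet and works on code points, so a
    real tool would need to respect syllable boundaries.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE

def insert_zwsp(text, lexicon):
    """Segment `text` by greedy longest match against `lexicon`.

    Returns (segmented_text, unknown_spans). Each unknown span is a
    (start, end) offset pair into the original text that matched no
    dictionary word and should be shown to a human reviewer.
    """
    max_len = max(map(len, lexicon), default=1)
    pieces, unknown = [], []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in lexicon:
                pieces.append(text[i:i + n])
                i += n
                break
        else:
            # No dictionary word starts here: extend or open an unknown run.
            if unknown and unknown[-1][1] == i:
                unknown[-1] = (unknown[-1][0], i + 1)
                pieces[-1] += text[i]
            else:
                unknown.append((i, i + 1))
                pieces.append(text[i])
            i += 1
    return ZWSP.join(pieces), unknown

# Hypothetical two-word lexicon, for illustration only.
lexicon = {"မြန်မာ", "စာ"}
segmented, to_review = insert_zwsp("မြန်မာစာ", lexicon)
```

    Greedy matching will sometimes pick the wrong boundary where two
    segmentations are both valid, which is why the human review step
    still matters even when to_review comes back empty.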

    --
    Doug Ewell · Fullerton, California, USA · RFC 4645 · UTN #14
    http://users.adelphia.net/~dewell/
    http://www1.ietf.org/html.charters/ltru-charter.html
    http://www.alvestrand.no/mailman/listinfo/ietf-languages
    

