Re: Zero Width Word Boundary

From: Atif Gulzar (atif.gulzar@gmail.com)
Date: Fri Jan 30 2009 - 01:14:48 CST

    > According to Section 11.1 on Thai in TUS 5.0 (p. 376), and Section 16.2 on
    > layout controls (p. 535), U+200B ZERO WIDTH SPACE is the right character for
    > marking word boundaries in languages like Thai which don't use visible
    > spaces between words. I don't see why this would be different for Lao.

    Lao script is close to Thai, but it has a different script block (U+0E80
    to U+0EFF) and different language processing rules. Unlike Thai, Lao text
    can be broken at the syllable level at line breaks.

    http://www.panl10n.net/english/final%20reports/pdf%20files/Laos/LAO06.pdf
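
    To make the point concrete, here is a minimal Python sketch of how text
    marked up with U+200B might be split, and how a character can be tested
    against the Lao block range given above. The helper names and the sample
    string are my own illustration, not taken from any library or from the
    report. Note that the code cannot tell whether a given U+200B was meant
    as a word boundary or a syllable boundary; that depends on the convention
    used by whoever prepared the text.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def is_lao(ch):
    """Return True if ch lies in the Lao block, U+0E80..U+0EFF."""
    return 0x0E80 <= ord(ch) <= 0x0EFF


def split_on_zwsp(text):
    """Split text at every U+200B and drop empty pieces."""
    return [piece for piece in text.split(ZWSP) if piece]


# Hypothetical sample: the Lao for "Lao language", with a U+200B inserted
# between its two parts (escapes used so the invisible character is explicit).
sample = "\u0e9e\u0eb2\u0eaa\u0eb2" + ZWSP + "\u0ea5\u0eb2\u0ea7"

segments = split_on_zwsp(sample)
for segment in segments:
    # Print each segment's code points and whether all of them are Lao.
    codepoints = " ".join(f"U+{ord(c):04X}" for c in segment)
    print(codepoints, all(is_lao(c) for c in segment))
```

    The split itself is the same for Thai and Lao; what differs is what the
    resulting pieces represent.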

    --
    Best Regards,
    Atif Gulzar
    I ◘◘◘◘ Unicode, ɹɐzlnƃ ɟıʇɐ
    On Fri, Jan 30, 2009 at 11:59 AM, Doug Ewell <doug@ewellic.org> wrote:
    > ɹɐzlnƃ ɟıʇɐ <atif dot gulzar at gmail dot com> wrote:
    >
    >> I have checked and could not find any Unicode character for a word separator
    >> (a zero width space as WORD separator). This character/code is needed for
    >> languages where space is not used as a word separator. The available zero
    >> width characters are unable to address this issue, e.g.
    >>
    >> U+200B Zero Width Space: This character is intended for line break control
    >> (In Lao language lines can be broken at syllable levels, Lao uses U+200B to
    >> mark syllable boundaries).
    >> ...
    >
    > According to Section 11.1 on Thai in TUS 5.0 (p. 376), and Section 16.2 on
    > layout controls (p. 535), U+200B ZERO WIDTH SPACE is the right character for
    > marking word boundaries in languages like Thai which don't use visible
    > spaces between words.  I don't see why this would be different for Lao.
    >
    > --
    > Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
    > http://www.ewellic.org
    > http://www1.ietf.org/html.charters/ltru-charter.html
    > http://www.alvestrand.no/mailman/listinfo/ietf-languages
    >
    >
    

