Re: Indic grapheme clusters

From: Mark Davis (mark.davis@icu-project.org)
Date: Fri Jul 14 2006 - 09:00:33 CDT

    The default grapheme cluster boundaries are defined in UAX 29, Text
    Boundaries <http://www.unicode.org/reports/tr29/>.

    As to your specific question, conjuncts like that are not included in the
    default grapheme clusters, which are fairly narrowly defined because there
    was disagreement about whether they should include Indic-style clusters
    (and if so, for which scripts). If you look at the very earliest draft
    proposal, they were included at that time. However, it is certainly
    possible to define tailored grapheme clusters that do include them. I
    think at one point Apple was proposing adding them to CLDR.
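
To make the difference concrete, here is a rough Python sketch, not taken from UAX 29 or CLDR. It assumes that "extending" characters can be approximated by the general categories Mn and Me (the real rules use the Grapheme_Cluster_Break property from GraphemeBreakProperty.txt), and the tailoring shown is only one hypothetical way to keep a conjunct together. The function names are made up for illustration.

```python
import unicodedata

def is_extend(ch):
    # Assumption: approximate Grapheme_Extend with nonspacing (Mn) and
    # enclosing (Me) marks; the real property is defined in the UCD.
    return unicodedata.category(ch) in ("Mn", "Me")

def default_like_clusters(text):
    # Narrow rule: only extending marks attach to the preceding character.
    clusters = []
    for ch in text:
        if clusters and is_extend(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

def tailored_clusters(text):
    # Hypothetical tailoring: additionally keep spacing marks (Mc) with
    # their base, and do not break between a virama (canonical combining
    # class 9) and a following letter.
    clusters = []
    for ch in text:
        prev = clusters[-1][-1] if clusters else ""
        after_virama = bool(prev) and unicodedata.combining(prev) == 9
        joins = (
            is_extend(ch)
            or unicodedata.category(ch) == "Mc"
            or (after_virama and unicodedata.category(ch) == "Lo")
        )
        if clusters and joins:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# KA + VIRAMA + KA + VOWEL SIGN AA, the sequence from the question.
conjunct = "\u0915\u094D\u0915\u093E"
print([len(c) for c in default_like_clusters(conjunct)])
print([len(c) for c in tailored_clusters(conjunct)])
```

Under these assumptions the narrow rule splits the sequence into three pieces (KA plus VIRAMA, then KA, then the vowel sign), while the tailored rule keeps all four code points in one cluster.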

    Mark

    (If you're trying to find stuff, one place to try first is
    http://www.unicode.org/faq/specifications.html. It uses the more informal
    term "user characters", though, and should be updated for 5.0.)

    On 7/14/06, Richard Ishida <ishida@w3.org> wrote:
    >
    > I've been trying to find out whether a simple Indic conjunct such as
    >
    >
    > 0915: क DEVANAGARI LETTER KA
    > 094D: ् DEVANAGARI SIGN VIRAMA
    > 0915: क DEVANAGARI LETTER KA
    > 093E: ा DEVANAGARI VOWEL SIGN AA
    >
    > is a single "default grapheme cluster" or not.
    >
    > I've seen text that says that it is, but I'm really struggling to figure
    > out how the standard tells you that. I've looked at
    > http://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries and the
    > http://www.unicode.org/Public/UNIDATA/auxiliary/GraphemeBreakProperty.txt
    > file. I can't even find explanations of the format of the
    > GraphemeBreakProperty.txt file.
    >
    >
    > Can someone help?
    >
    > RI
    > ============
    > Richard Ishida
    > Internationalization Lead
    > W3C (World Wide Web Consortium)
    >
    > http://www.w3.org/People/Ishida/
    > http://www.w3.org/International/
    > http://people.w3.org/rishida/blog/
    > http://www.flickr.com/photos/ishida/
    >
    >
    >
    >


