Re: Myanmar script, Pali language and other unencoded conjuncts or punctuations

From: Antoine Leca (Antoine10646@leca-marti.org)
Date: Tue Jan 04 2005 - 14:03:28 CST

    Philippe Verdy wrote:
    > >> > I can easily find fonts for the Myanmar/Pali script,
    > >> > (none of them mapped to Unicode),
    > >>
    > >> Look for MyaZedi (http://www.myazedi.com/downloads/).
    >
    > The Myazedi website is now... empty: a page with graphics and
    > no active links...

    ... which is the very reason why I gave you the link directly to the
    download page.

    > And Pali characters are not present in it!

    What are "Pali characters"?

    > I know that some new characters are in the Unicode character
    > pipe. But the list of conjuncts is documented nowhere.

    What are you referring to?
    I see nothing in http://www.unicode.org/alloc/Pipeline.html

    I only know about N2827 (http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2827.pdf),
    but my guess was that it had not been considered for approval. And anyway
    these are not new characters in any sense: the set of eleven glides formed
    by combining the 4 medials has been well known for years.
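
    As a rough illustration of where the figure of eleven can come from, here
    is a minimal sketch. It is not taken from N2827. It assumes the four
    medials are ya, ra, wa and ha, and that ya and ra never occur together in
    the same glide; the names and that exclusion rule are assumptions made for
    the example only.

```python
from itertools import product

# Hypothetical labels for the four medials; None means "medial absent".
# Assumption: medial ya and medial ra are mutually exclusive, so they
# share a single slot.
YA_OR_RA = (None, "ya", "ra")
WA = (None, "wa")
HA = (None, "ha")


def medial_combinations():
    """Return every non-empty combination of medials allowed by the assumption."""
    combos = []
    for choice in product(YA_OR_RA, WA, HA):
        present = tuple(m for m in choice if m is not None)
        if present:  # skip the case where no medial is present at all
            combos.append(present)
    return combos


if __name__ == "__main__":
    combos = medial_combinations()
    for combo in combos:
        print("+".join(combo))
    print(len(combos))  # 3 * 2 * 2 - 1 = 11
```

    Under those assumptions the count is 3 x 2 x 2 slot choices minus the
    empty one, which gives eleven.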

    > Also the rules related to the usage of "kinzi"

    Are the words of TUS 10.3 unclear here?
    Also is http://www.mcf.org.mm/unicode/rendering/kinzi.htm ambiguous?

    > or how this impacts the general encoding of other "normal"
    > clusters (with or without the kinzi).

    How can you feel it is a problem?

    > If there are some other similar variants of kinzi in
    > other Myanmar-script-based languages,

    I guess you mean, some kind of "repha", don't you?

    > it would be interesting to know that early,

    Well, Myanmar has been encoded in Unicode/10646 for 10 years now. So
    "early" here is a /very/ relative concept.

    > Unicode just says
    > for now that this "kinzi" behavior is similar to the behavior of the
    > Devanagari RA, but my experience with it shows that it is
    > much more tricky to handle.

    In what way? I did not experience such problems, but I may easily miss
    something here.

    OTOH I believe Nagari RA is _very_ tricky. So "much more tricky", « c'est
    plus blanc que blanc, c'est nouveau comme couleur, cela vient de sortir...
    ;-) » [ For non-Frenchies: this is from "La publicité", a well-known sketch
    by the French humorist Coluche; it is a satire of French ads for washing
    powder, and it translates to "it is whiter than white; that is a new
    colour, it has just come out". ]

    > The complexity of the Myanmar script is not sufficiently documented
    > by Unicode,

    What do you mean here?

    Oriya is about as complex as Myanmar is, although the complexities are not
    in the same things. Yet Myanmar got 4 pages, and Oriya about 12 lines (I am
    too lazy to check the actual figures). And we can find many such examples.
    Of course one could write a treatise about each one of the complex scripts,
    with lists of conjuncts, dictionaries, etc. However, doing so would result
    in a Unicode Standard of 10,000+ pages, and it would then be impossible to
    achieve the minimum level of overall quality.

    I agree with you that there are shortcuts in the Myanmar description,
    particularly about "pathological" sequences. But this subject is not
    specific to Myanmar, it is common to all Indic scripts; and there is
    already a group working on that (and similar issues).

    > and unfortunately, the relevant and accurate resources about
    > it are quite hard to find on the web

    I do not believe scholars should _only_ consult the web. In fact, I consider
    it would be ill-advised to do so; and furthermore such behaviour might lead
    to bashing from the native scholars (and I have already had my quota of it.)

    > (there may exist sources in local libraries,

    Perhaps public libraries... ?

    > free communications with Myanmar, the country, are too
    > severely controlled by its government,

    This is probably off topic here, but I fail to see how control of
    communications could allow a government to prevent the script of its
    citizens from becoming known to Westerners (or Easterners for that matter).
    As far as I know, the PRC also uses some form of control of
    communications, and I believe PRC scholars are very influential in the IRG,
    and hence very influential in Unicode.

    Antoine


