Re: Script of U+0951 .. U+0954

From: Mark Davis (mark.davis@jtcsv.com)
Date: Thu Dec 05 2002 - 16:33:23 EST

    > with MS people (and not only me, but also Pothana's designer), MS answered
    > that the Unicode standard seemed to imply that these accents apply to
    > Devanagari script only.

    That is incorrect; all non-spacing marks should inherit the script of their
    base character. We need to make this clear in
    http://www.unicode.org/unicode/reports/tr24/ for Unicode 4.0.
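
    As a rough illustration of the inheritance rule (a minimal sketch, not
    how Uniscribe or any shipping engine works): the script table below is a
    hypothetical two-block stand-in for Scripts.txt, and the names
    block_script and resolve_scripts are made up for this example. Only
    unicodedata.category is a real library call.

```python
import unicodedata

# Hypothetical, deliberately tiny script table keyed by code point range.
# A real implementation would load Scripts.txt from the Unicode Character Database.
SCRIPT_RANGES = [
    (0x0900, 0x097F, "Devanagari"),
    (0x0C00, 0x0C7F, "Telugu"),
]


def block_script(ch):
    """Return the script assigned by the toy table, or "Common" if none matches."""
    cp = ord(ch)
    for lo, hi, name in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return name
    return "Common"


def resolve_scripts(text):
    """Pair each character with a script, letting non-spacing marks (Mn)
    inherit the script of the closest preceding non-Mn character."""
    resolved = []
    base_script = None
    for ch in text:
        if unicodedata.category(ch) == "Mn" and base_script is not None:
            # Non-spacing mark: take the script of its base character.
            script = base_script
        else:
            # Base character, or a mark with no base before it: use the table.
            script = block_script(ch)
            base_script = script
        resolved.append((ch, script))
    return resolved


# Telugu KA followed by U+0951 (Devanagari stress sign udatta).
for ch, script in resolve_scripts("\u0C15\u0951"):
    print(f"U+{ord(ch):04X} {script}")
```

    Under these assumptions, U+0951 after a Telugu base is treated as
    Telugu, so the whole cluster stays in a single script run.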

    Mark
    __________________________________
    http://www.macchiato.com
    ► “Eppur si muove” ◄

    ----- Original Message -----
    From: "Antoine LECA" <Antoine10646@leca-marti.org>
    To: <unicode@unicode.org>
    Sent: Thursday, December 05, 2002 11:13
    Subject: Re: Script of U+0951 .. U+0954

    > Peter Constable wrote:
    > >
    > > There is a potential concern in Uniscribe/OpenType: substitution and
    > > positioning rules in OT are organised hierarchically by script then by
    > > individual writing system / typographic groups (the label used is
    > > languages, but the intent is really groups of writing systems that share
    > > common typographic behaviours). Thus, a rule that handles positioning of a
    > > glyph for 0950 (or whatever) relative to some member of some class of
    > > glyphs must be entered somewhere under some particular script. Now, there
    > > is nothing that prohibits a font developer from creating multiple
    > > positioning rules for 0950 with different classes of base glyphs and to
    > > have a different one placed in the hierarchy under several different
    > > scripts.
    >
    > Fully agreed so far.
    >
    >
    > > But there may yet be an issue on the Uniscribe side: given a
    > > string of characters, which it will begin by mapping into a string of
    > > initial glyphs, it has to decide which script tag(s) to apply to portions
    > > of the string. What I don't know is whether it generally assumes combining
    > > marks belong to a specific script, or whether it allows combining marks to
    > > inherit their script from the base characters with which they combine.
    >
    > Look: in current Uniscribe, leading ZWJ and ZWNJ are discarded (i.e., with
    > input U+200B U+093E, you still get the circle meaning "incorrect combining",
    > even if this is perfectly correct Unicode as far as I understand).
    > So clearly, they have a problem with "backtracking" when the script is
    > not determined by the first character in the stream. I can understand that.
    > OTOH, when ZWJ or ZWNJ come second or later in conjuncts, they are properly
    > handled, in every script where it is relevant. What I would like to see is
    > that the Indic accents be handled in the same way. And when I spoke about that
    > with MS people (and not only me, but also Pothana's designer), MS answered
    > that the Unicode standard seemed to imply that these accents apply to
    > Devanagari script only.
    > It looks to me like this Scripts.txt just confirms the MS point of view.
    > If this is as intended, that is fine, but that means that a bunch of new
    > characters (with little or no added value) will have to be added to some new
    > revision of Unicode.
    >
    > By the way, the situation is similar with the dandas (U+0964 and U+0965):
    > they only appear in the Devanagari and Myanmar blocks, but are used for many
    > other (all?) South-Asian scripts as well. Worse, they are often used, so
    > there is already a lot of material encoded with these code points.
    > Luckily, dandas do not need special handling from complex script engines,
    > so it does not matter if Uniscribe decides they are Devanagari or script-less
    > (except perhaps for the selection of the font).
    >
    >
    > Antoine
    >
    >
    >


