RE: Representative glyphs for combining kannada signs

From: Kent Karlsson (kent.karlsson14@comhem.se)
Date: Tue Mar 28 2006 - 15:59:33 CST

    Antoine Leca wrote:
    > > Yes, and they already are. U+0308 COMBINING DIAERESIS vs. U+030B
    > > COMBINING DOUBLE ACUTE. There is no "umlaut" character...
    >
    > I did use Umlaut to clearly (at least I thought) denote the characteristic
    > German *feature*, NOT the codepoints.

    For typeset modern German text, DIAERESIS is consistently used (though
    most often via precomposed letters).
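
    To illustrate the point about precomposed letters, here is a minimal
    Python sketch using only the standard unicodedata module. It shows that
    "u" followed by U+0308 composes under NFC to the single precomposed
    letter U+00FC, and that U+030B is a separate character. The variable
    names are for illustration only.

```python
import unicodedata

# "u" followed by U+0308, i.e. the decomposed spelling of u-umlaut.
decomposed = "u\u0308"

# NFC composes the pair into the precomposed letter U+00FC.
precomposed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(precomposed))
print(unicodedata.name(precomposed))

# The two combining marks under discussion are distinct characters.
print(unicodedata.name("\u0308"))
print(unicodedata.name("\u030b"))
```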

    > > And m² is not at all the same as m2.
    >
    > I guess no, although I am not completely sure (particularly
    > since I expect
    > the second to read "m<SUP>2</SUP>" instead,

    No. While that is a good approach in the general case (for arbitrary
    powers in *math* expressions), I think it is a bad idea for the SI unit powers.
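
    As a rough illustration of why m² and m2 are not the same at the
    character level, the sketch below (Python, standard unicodedata module,
    variable names hypothetical) compares canonical and compatibility
    normalization. Canonical normalization (NFC) keeps U+00B2 SUPERSCRIPT
    TWO; only the lossy compatibility form (NFKC) folds it to a plain "2".

```python
import unicodedata

square_metre = "m\u00b2"   # m followed by U+00B2 SUPERSCRIPT TWO
plain = "m2"               # m followed by an ordinary digit two

# Canonical normalization leaves the superscript character alone.
nfc = unicodedata.normalize("NFC", square_metre)

# Compatibility normalization folds the superscript to a plain digit,
# discarding the distinction.
nfkc = unicodedata.normalize("NFKC", square_metre)

print(nfc == square_metre)
print(nfc == plain)
print(nfkc == plain)
```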

    > >> So, if the original encoder does NOT make a distinction in
    > >> meaning between the two forms, why would Unicode require
    > >> him to encode this difference at codepoint level?
    > >
    > > How do you know if the "original encoder" makes the difference or
    > > not?
    >
    > Because *I* am the original encoder, in this stanza. :-)

    So you only read your own texts. Interesting... ;-)

    > Because my feeling (in fact, my interpretation of the Unicode
    > and ISCII
    > description) is that the Indic codepoints are abstract
    > characters, not those
    > elements which combine in defined ways to produce some
    > glyphic intermediate
    > elements, which only remains to be actually drawn by the
    > font, as it seems you are thinking.

    I do not see why characters in Indic scripts should be more "abstract"
    than for other scripts.

    > I base that view, first on the fact that the virama concept
    > forces a need
    > for some abstraction layer (reordering, combination,
    > so-called backstore,
    > etc.) which is absent even from Thai, and even more from
    > Western scripts;
    > and secondly because of the underlying nature of the
    > Brahmi-derived scripts,
    > with the sounds associated, the sandhi phenomena, etc.

    The "sounds associated" are completely and totally irrelevant.
    Unicode encodes scripts, not sounds.

    > when the author
    > is supposed to add some precision; this is much like the
    > character styles
    > used in Western typography (rendered as HTML spanning styles,
    > for example).

    That does not apply to different spellings. I would not expect any
    kind of style span (HTML or otherwise) to say "display 'š' as 'sh'".
    Nor do I expect any acceptable font to have an "sh" glyph for "š".

    > > I have a really hard time understanding why apparent spelling changes
    > > should be mediated by font changes for Indic scripts. It is not
    > > done that way for any other script
    >
    > Huh?

    See reply above.

    > If I want a rounded 'a' in Latin, I am required to
    > select a font with
    > such a design. Similar for a z or a J with descender, or a
    > low-striked q. I
    > do not expect to be forced to use the "alternative"
    > codepoints, that have
    > been added for special purposes, like U+0251 or U+0292, for
    > an illustrative
    > use where I do NOT want to add specific meaning.
    >
    > The difference here is that you are saying changing a

    Will you please stop putting words in my mouth!

    > z-shaped 'a' to a
    > rounded one (etc.) is *not* a spelling change, while writing
    > the i matra in
    > one or other place *is*. My wild guess is that some Indians may see it
    > exactly reversed...

    Some characters do have overlapping glyph shapes. However:

    *You* are saying that there are two "camps" (your word) for at least one
    of the Indic scripts as to how to display some letters. That sounds very
    much like a difference worthy of more than a font change. Likewise for
    the changes in Indic writing that are referred to as "old orthography" vs.
    "new orthography"; they are even CALLED spelling changes, so why not treat
    them as such?

    > It is certainly such a difference (not purely aesthetic, I mean). See
    > attached image.

    I think that difference may be worthy of at least a ZWJ/ZWNJ...
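
    For what a ZWJ/ZWNJ distinction looks like in the text itself, here is
    a small Python sketch. The Kannada sequence (KA, VIRAMA, SSA) is only an
    assumed example, not taken from the attached image, and how each
    sequence is displayed depends on the font and rendering engine. The
    sketch only shows that the three sequences remain distinct in the text
    and that normalization keeps the joiners.

```python
import unicodedata

KA, VIRAMA, SSA = "\u0c95", "\u0ccd", "\u0cb7"   # Kannada letters (assumed example)
ZWNJ, ZWJ = "\u200c", "\u200d"

# Three spellings of the same consonant cluster, differing only in the joiner.
variants = {
    "plain": KA + VIRAMA + SSA,
    "zwnj": KA + VIRAMA + ZWNJ + SSA,
    "zwj": KA + VIRAMA + ZWJ + SSA,
}

for label, text in variants.items():
    codepoints = " ".join(f"U+{ord(ch):04X}" for ch in text)
    # NFC leaves the joiners in place, so the distinction survives normalization.
    stable = unicodedata.normalize("NFC", text) == text
    print(label, codepoints, stable)

# Both joiners are format characters (general category Cf).
print(unicodedata.category(ZWNJ), unicodedata.category(ZWJ))
```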

    > Should emphasis be recorded as different Unicode codepoints?
    > My reading was it should not...

    No, and I did not say that.

    ...
    > The best I can find is the acknowledgement (in the Indic OpenType
    > specifications) that there is a need to distinguish two
    > genuinely different
    > "styles" in Uniscribe and related, one named "old style"
    > encoded MAL as
    > "language system", the other "reformed" encoded MLR.

    That does not seem (to me) to be anywhere near the ideal way of
    dealing with this.

                    /kent k


