Re: GR and letter case Was: Gwoyeu Romatzyh marking the optional neutral tone

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue Jul 14 2009 - 13:52:08 CDT

    Asmus,

    > > Typographically, it might sit just slightly too deep. However, being a
    > > spacing character, I wonder whether this is not just a glyph/font issue.
    > For some characters the "it's just slightly off" really means it's a
    > different character.

    In this case it does not, however.

    You and everybody else are arguing character encoding based on
    a well-known book that is a typographical one-off -- and a notoriously
    bad example of typography, at that. This book was typeset by a
    printer (U.C. Berkeley Press, 1968) which couldn't even insert
    typeset Chinese characters -- they were gapped, and then written in by hand.

    > The character we are trying to analyze is clearly a subscript. In the
          ^^^^^^^^^
          
    The glyph(s), rather. The identity of it as a character is
    precisely in question.

    > samples it harmonizes with the subscripted double prime (or double
    > vertical bar?) for the tertiary stress.

    Not always, because the examples vary in their typography.

    But the examples in that case help make Szabolcs' case, because
    the use of U+02C8 MODIFIER LETTER VERTICAL LINE and
    U+02CC MODIFIER LETTER LOW VERTICAL LINE for primary stress
    and secondary stress, respectively, is a well-known IPA convention
    that Y.R. Chao would have been familiar with. If you look
    at *that* set of characters on that page, then the low ring
    should indeed be construed as the U+02F3 MODIFIER LETTER LOW RING
    (not IPA, but IPA-inspired).

    The other examples are a typographer's hack for the same thing,
    using a subscript letter "o" from the font.

    Incidentally, the use of the period in Y.R. Chao's system
    as an indication of neutral tone on a following syllable
    is also arguably related to the IPA system of suprasegmentals,
    because in IPA, U+002E is used to indicate syllable breaks.

    >
    > In the samples
    > (http://www.stud.uni-karlsruhe.de/~uyhc/files/images/p1070002.preview.jpg)
    > it does not fully harmonize with the period used for neutral stress -
    > they appear to not have the same center, as one would perhaps expect.
    > However, the period is ordinarily positioned so that it aligns at the
    > top of these subscripts, which gives some consistency in appearance.

    This is way overanalyzed. This kind of argumentation from
    one-off (bad) typography is not a good precedent for
    getting more characters of dubious semantics and appearance
    added to the standard -- they only add to the confusion about
    which is the "correct" character to use in representing texts
    like this.

    > Not having actual samples of material using the low ring, I can only go
    > by its appearance in the charts, and it's quite a bit lower there, lower
    > than a subscripted character.
    >
    > As long as that chart glyph is not an aberration, I would very much
    > hesitate to forcibly unify these.

    And I think you would be wrong. Szabolcs is correct.

    For representing this particular convention in Y.R. Chao's
    text (which isn't even a part of the Gwoyeu Romatzyh
    romanization that sees any wide use -- because it isn't
    a direct indication of pronunciation, but rather an editorial
    shorthand for saying that an alternative pronunciation with
    the tone neutralized also occurs), the best choice is:

    U+02F3 MODIFIER LETTER LOW RING

    For people who want to argue text appearance and alternative
    origin theories you also have available:

    U+006F LATIN SMALL LETTER O (styled subscript)

    U+2092 LATIN SUBSCRIPT SMALL LETTER O

    U+2080 SUBSCRIPT ZERO

    U+FF61 HALFWIDTH IDEOGRAPHIC FULL STOP

    U+3002 IDEOGRAPHIC FULL STOP
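
    To see how these candidates differ as characters rather than as
    glyphs, here is a minimal sketch, assuming Python's standard
    unicodedata module. The helper name describe() and the choice of
    NFKC as the comparison are illustrative assumptions only; the
    output depends on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Candidate code points listed above for the low-ring mark.
CANDIDATES = [0x02F3, 0x006F, 0x2092, 0x2080, 0xFF61, 0x3002]

def describe(cp):
    """Return the code point label, character name, and NFKC form.

    describe() is a hypothetical helper for illustration only.
    """
    ch = chr(cp)
    nfkc = unicodedata.normalize("NFKC", ch)
    return {
        "code": f"U+{cp:04X}",
        "name": unicodedata.name(ch),
        "nfkc": [f"U+{ord(c):04X}" for c in nfkc],
    }

for cp in CANDIDATES:
    info = describe(cp)
    print(info["code"], info["name"], "NFKC ->", " ".join(info["nfkc"]))
```

    Under compatibility normalization the subscript characters fold
    to their plain counterparts (U+2092 to U+006F, U+2080 to U+0030,
    U+FF61 to U+3002), whereas U+02F3 has no decomposition and
    survives unchanged.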

    So argue away and do what you will -- but the *last* thing the
    Unicode Standard needs is *another* low ring character encoded
    based on this evidence and this usage, to further muddy the waters.

    --Ken


