From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Jan 05 2004 - 20:37:30 EST
[Doing a little cut and pasting here to coalesce the context...]
> Peter Kirk wrote,
> >
> > I note an incorrect glyph for U+0185 in Code2000 and in Arial Unicode
> > MS; this looks like b with no serif at the bottom but should be much
> > shorter, like ь, the Cyrillic soft sign.
>
James Kass responded:
> ... With regards to U+0185, could it be
> said that the informative glyph in TUS 2.0, 3.0 and 4.0 is a bit
> misleading, or does that glyph represent a variance from the
> text(s) with which you're familiar?
>
> This page uses a scan from THE LANGUAGES OF THE WORLD
> as its Chuang example:
> http://www.worldlanguage.com/Languages/Chuang.htm
>
> No sample text, no lower case illustration:
> http://www.alphabets-world.com/chuang.html
>
> If the informative glyph in TUS *is* misleading, I'll be happy
> to make appropriate changes here.
Peter Kirk responded:
> Yes, you are right, and using a very British hyperbole [recte: litotes].
> The TUS 4.0
> glyph is quite simply incorrect. That is, it is incorrect for the
> Azerbaijani, Khakass and Nogai letter, and it does not make a proper
> distinction from the otherwise almost identical "b". The glyph should
> have the same height as most lower case letters. ... That is, shorter
> than the reference glyph in TUS 4.0. This reference
> glyph needs to be changed. I would suggest a form identical to U+044C.
Before we go charging off to fix all the fonts, we first need
to have clarity regarding which characters are intended for what
here.
Michael Everson has asserted that U+0184/U+0185 *are* the intended
characters for the Pan-Turkic Latin alphabetic use of the Cyrillic
soft sign letter. This is at odds with the history of the Unicode
Standard and with Michael's own prior assertion in:
http://www.evertype.com/standards/iso10646/pdf/turkmen.pdf
"Latin <soft sign> [is] not encoded in the UCS, complicating
things like monolingual multiscript ordering since the current
UCS expects Cyrillic <soft sign> to do double duty." [2000-06-02]
That earlier statement by Michael correctly reflects the intent
of the standard, I believe. It also correctly reflects Michael's
observation earlier today:
> In Pan-Turkic, though, it looks just like CYRILLIC SOFT SIGN in all
> the sources I have seen. For lots of languages.
And the Unicode solution for that, to date, has been that since
it "looks just like" the CYRILLIC SOFT SIGN in all the sources,
by gum, it *is* the CYRILLIC SOFT SIGN.
[Now don't pile on all at once regarding mixed scripts for
alphabets and rehearsing for the umpteenth time the arguments
about Kurdish Q/W. We've heard all that, and there are
abiding philosophical differences in the committees regarding
when letters borrowed from one script into another become
nativized into that script and require separate encoding.
That is all for another thread. What I am telling you all
here is what the *intent* of the standard has been regarding
this *particular* pair of letters, since 1991.]
The upshot of that is that the glyphs for U+0184/U+0185 are
not to be determined by Azeri/Khakass/Nogai typography, but
by Zhuang typography, for which they were encoded. The
glyphs for U+042C/U+044C are correct for representing the
soft sign in the Pan-Turkic alphabet because, well, they
*are* the soft sign.
Now, let's review the intent for Zhuang (aka Chuang) orthography.
Based on sources such as Katzner (cited in this thread and
available on the web) and Nakanishi, the 5 Zhuang tone
letters were encoded in Unicode as:
Tone 2: U+01A7/U+01A8 (reversed s)
Tone 3: U+0417/U+0437 (Cyrillic ze)
Tone 4: U+0427/U+0447 (Cyrillic che)
Tone 5: U+01BC/U+01BD (roughly 5-shaped letter)
Tone 6: U+0184/U+0185 (similar to soft-sign, but not identical)
Everyone recognizes that the tone letters were mnemonically
based on 2, 3, 4, 5, 6, as well, but there was no point in
actually *using* the digits, as the tone letters are actually
shaped differently and their usage would interfere with the
use of normal digits in Zhuang text.
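[As a quick cross-check of the list above, here is a minimal Python
sketch that looks up the character names for those ten code points.
It assumes only the standard unicodedata module; the names
ZHUANG_TONE_LETTERS, describe and tone_table are hypothetical,
invented for this illustration, and the names printed depend on the
Unicode data shipped with your Python.]

```python
import unicodedata

# Code points as listed in the message above: (capital, small) per tone.
# The dictionary name is hypothetical, chosen for this illustration.
ZHUANG_TONE_LETTERS = {
    2: (0x01A7, 0x01A8),
    3: (0x0417, 0x0437),
    4: (0x0427, 0x0447),
    5: (0x01BC, 0x01BD),
    6: (0x0184, 0x0185),
}


def describe(cp):
    """Return 'U+XXXX NAME' for a code point, using the character database."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


def tone_table():
    """Map each tone number to the descriptions of its capital and small letters."""
    return {
        tone: (describe(upper), describe(lower))
        for tone, (upper, lower) in ZHUANG_TONE_LETTERS.items()
    }


if __name__ == "__main__":
    for tone, (upper, lower) in tone_table().items():
        print(f"Tone {tone}: {upper} / {lower}")
    # For comparison: the Cyrillic soft sign discussed in this thread.
    print(describe(0x044C))
```

[The character names make the split visible: tones 2, 5 and 6 come
back as LATIN ... TONE letters, while tones 3 and 4 come back as
CYRILLIC ZE and CHE, and U+044C comes back as the CYRILLIC soft sign
rather than anything named for tone six.]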
The Unicode shapes and tone letter identities for Zhuang are
roughly consonant with those also shown at:
http://www.alphabets-world.com/chuang.html
except that the glyph for Tone 4 there is much less che-like in
shape, but still not actually a "4". Running text citations,
as in Katzner, clearly show Cyrillic ze and che in use for those
tones. The debatable edge case was always for tone 6, where you
could argue that the Zhuang citations were merely an "off" shape
for a Cyrillic soft sign that happened to be used in the text.
But as for tones 2 and 5, the more conservative approach taken
at the time, in 1990, was to simply identify Zhuang tone 6 as
a distinct form, not identical to the soft sign, and so it
was separately encoded at U+0184/U+0185.
Note that there are more modern representations of Zhuang that
dispense with the special tone letters altogether and
substitute ordinary Latin letters, in a Pinyin-like
simplification. See:
http://www.liuzhou.co.uk/liuzhou/language.htm
with a sign showing the substitution of Latin J, H, Z, X, W(?)
for the 5 Zhuang tone letters.
This may reflect an official attempt to establish a new
Latin orthography for Zhuang. See:
http://www.infomekong.com/zhuang_secondary.htm
"The language was not written down until the government
made an attempt in the early 1950's, but they chose to use a
Russian script [sic] and it was never accepted by the
people. A new Latin script was devised in 1986 and the government
through the Minorities Language Commission has encouraged Zhuang
to learn this."
For more background on the political context of Zhuang
orthography development, see:
http://brj.asu.edu/v2512/articles/art8.html
In particular, the about-face by the central government
regarding minority community policies in the late 50's
impacted the history of the Zhuang orthography's use:
"In the middle 1960s, the new Zhuang, Lisu, and Lahu
written languages were withdrawn from the few schools
where they had survived the promotion of Chinese in the
late 1950s."
I presume that the 1986 orthography is what is shown in the Liuzhou
sign noted above.
So in any case we may be talking about the encoding of the
tone letters for a Latin/Cyrillic hybrid orthography
that failed in the late 1950's
and early 1960's in China. It is unclear to me whether the
revival of the use of written Zhuang in the 1980's is based
on the original Zhuang forms or a revision of them without
the Cyrillic-based additions and tone letters.
Perhaps someone on the list who knows more about the actual
history of orthographic reform in the Zhuang Autonomous Region
of Guangxi could chime in with more details.
--Ken