about http://www.w3.org/TR/ruby/

From: Yung-Fong Tang (ftang@netscape.com)
Date: Fri Jan 21 2000 - 15:08:56 EST


One comment about http://www.w3.org/TR/ruby/

     In traditional Chinese, "Bopomofo" ruby can appear along the
     right side of the ruby base, even in horizontal layout.

     [Example of Bopomofo ruby, with Bopomofo on the right side of the base texts in horizontal layout]

     Figure 1.1.8: "Bopomofo" ruby in traditional Chinese (ruby text
     shown in blue for clarity) in horizontal layout

     Note that Bopomofo tone marks (in the above example shown in
     red for clarity) appear in a separate column (along the right
     side of the Bopomofo ruby) and therefore might be seen as
     "ruby on ruby". However, they are encoded as combining
     characters that are simply part of the ruby text.

I don't think this information is accurate. Let's look at the Big5
mapping ( ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/OTHER/BIG5.TXT ):

0xA3BB 0x02D9 # DOT ABOVE (Mandarin Chinese light tone)
0xA3BC 0x02C9 # MODIFIER LETTER MACRON (Mandarin Chinese first tone)
0xA3BD 0x02CA # MODIFIER LETTER ACUTE ACCENT (Mandarin Chinese second tone)
0xA3BE 0x02C7 # CARON (Mandarin Chinese third tone)
0xA3BF 0x02CB # MODIFIER LETTER GRAVE ACCENT (Mandarin Chinese fourth tone)

Notice that it also comes with the following note:
# 2. There is an uncertainty in the mapping of the Big Five character
# 0xA3BC. This character occurs within the Big Five block of tone marks
# for bopomofo and is intended to be the tone mark for the first tone in
# Mandarin Chinese. We have selected the mapping U+02C9 MODIFIER LETTER
# MACRON (Mandarin Chinese first tone) to reflect this semantic.
# However, because bopomofo uses the absense of a tone mark to indicate
# the first Mandarin tone, most implementations of Big Five represent
# this character with a blank space, and so a mapping such as U+2003 EM SPACE
# might be preferred.

Also look at the Unicode database
( ftp://ftp.unicode.org/Public/3.0-Update/UnicodeData-3.0.0.txt ):

02C7;CARON;Sk;0;ON;;;;;N;MODIFIER LETTER HACEK;Mandarin Chinese third tone;;;
02C8;MODIFIER LETTER VERTICAL LINE;Sk;0;ON;;;;;N;;;;;
02C9;MODIFIER LETTER MACRON;Sk;0;ON;;;;;N;;Mandarin Chinese first tone;;;
02CA;MODIFIER LETTER ACUTE ACCENT;Sk;0;ON;;;;;N;MODIFIER LETTER ACUTE;Mandarin Chinese second tone;;;
02D9;DOT ABOVE;Sk;0;ON;<compat> 0020 0307;;;;N;SPACING DOT ABOVE;Mandarin Chinese light tone;;;

Here are some problems with http://www.w3.org/TR/ruby/:
1. http://www.w3.org/TR/ruby/ claims that "Bopomofo tone marks ... are
encoded as combining characters". However, this is not correct according
to the current mapping. None of the characters listed above is a combining
character: each has general category Sk and combining class 0.
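
A quick way to check this claim is sketched below in Python, using the
standard unicodedata module. Note the assumptions: the module reflects
whatever Unicode version ships with the Python build, not Unicode 3.0, so
the general category it reports for some of these characters may differ
from the Sk shown above; and the helper name is_combining is only for
illustration. What matters here is whether the category is a mark (M*)
and whether the combining class is nonzero.

```python
import unicodedata

# Code points that BIG5.TXT assigns to the Big5 tone marks 0xA3BB-0xA3BF.
TONE_MARKS = [0x02D9, 0x02C9, 0x02CA, 0x02C7, 0x02CB]


def is_combining(cp):
    """Return True if the code point has a mark category (Mn, Mc or Me)."""
    return unicodedata.category(chr(cp)).startswith("M")


for cp in TONE_MARKS:
    ch = chr(cp)
    # Print name, general category, canonical combining class and verdict.
    print("U+%04X %-32s category=%s ccc=%d combining=%s" % (
        cp,
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
        is_combining(cp),
    ))
```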

One may argue that this is a mapping table problem, and that these tone
marks should instead be mapped from Big5 to the following Unicode
characters:

302A;IDEOGRAPHIC LEVEL TONE MARK;Mn;218;NSM;;;;;N;;;;;
302B;IDEOGRAPHIC RISING TONE MARK;Mn;228;NSM;;;;;N;;;;;
302C;IDEOGRAPHIC DEPARTING TONE MARK;Mn;232;NSM;;;;;N;;;;;
302D;IDEOGRAPHIC ENTERING TONE MARK;Mn;222;NSM;;;;;N;;;;;
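
For comparison, the four records above really are combining marks. The
sketch below (Python, with a hypothetical helper parse_record) splits each
record on ";" and reads the third and fourth fields, which in the
UnicodeData format are the general category and the canonical combining
class. It then cross-checks the combining class against Python's
unicodedata module, which I assume agrees with these records for the four
characters.

```python
import unicodedata

# The four UnicodeData records quoted in this message.
RECORDS = """\
302A;IDEOGRAPHIC LEVEL TONE MARK;Mn;218;NSM;;;;;N;;;;;
302B;IDEOGRAPHIC RISING TONE MARK;Mn;228;NSM;;;;;N;;;;;
302C;IDEOGRAPHIC DEPARTING TONE MARK;Mn;232;NSM;;;;;N;;;;;
302D;IDEOGRAPHIC ENTERING TONE MARK;Mn;222;NSM;;;;;N;;;;;
"""


def parse_record(line):
    """Return (code point, name, general category, combining class)."""
    fields = line.split(";")
    return int(fields[0], 16), fields[1], fields[2], int(fields[3])


parsed = [parse_record(line) for line in RECORDS.splitlines() if line]

for cp, name, category, ccc in parsed:
    # Mn = nonspacing mark; a nonzero class means it combines with a base.
    agrees = unicodedata.combining(chr(cp)) == ccc
    print("U+%04X %-32s category=%s ccc=%d matches_python=%s" % (
        cp, name, category, ccc, agrees))
```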

However, there is no clear explanation of these characters in the
Unicode 2.0 standard. Also, the glyphs currently used to print these
Unicode characters are very different from the Bopomofo tone marks. Two
of them even appear at the left corner of the base character. It is
definitely not practical to use them as Bopomofo tone marks. However,
that is an issue within the Unicode standard, not the ruby spec.

2. The ruby spec does not show one special but common case, the Mandarin
Chinese light tone mark [0xA3BB in Big5]. Most of the printed material I
have seen that uses this light tone mark prints it on top of the first
Bopomofo symbol, instead of at the right side of the middle Bopomofo
symbol where the other tone marks are located.

I think this paragraph oversimplifies the Bopomofo tone mark layout
issue. The Bopomofo tone mark is definitely not a ruby on ruby, but a
glyph positioning issue. However, it is not a combining mark either, in
terms of the definitions in the current Unicode mapping and database.




