hentaigana

From: Ben Monroe (bendono@attbi.com)
Date: Mon Apr 08 2002 - 11:24:25 EDT


From: "Lars Marius Garshol" <larsga@garshol.priv.no>

> * ろ. 〇〇〇〇 ろ. 〇〇〇
> |
> | How about something having to do with hentaigana?
>
> Hentaigana? What are they? I tried Google, but couldn't really work it
> out.

変体仮名 [hentaigana]
U+5902 U+4F53 U+4EEE U+540D

Modern hiragana and katakana derive from certain styles of writing Chinese
hanzi.
http://okayama.cool.ne.jp/monjo/hentaigana.htm has a chart of some of these
characters and shows which hanzi they come from. The chart is not very
complete, but it should suffice to get the point across. For example,
U+3042 (あ) comes from U+5B89 (安) in modern Japanese. But, until 1900, it
could have been written a number of different ways depending on which of the
following hanzi was used: U+963F, U+611B, U+4E9C, U+60AA (阿, 愛, 亜, 悪).
Another example would be U+306F (は) which comes from U+6CE2 (波) in modern
Japanese. In the past, other options came from U+8005, U+76E4, U+534A,
U+9817, U+516B, U+8449, U+5A46, U+82B3, U+7FBD, U+7834 (者, 盤, 半, 頗, 八,
葉, 婆, 芳, 羽, 破).
(On the same website, you can quiz yourself here:
http://okayama.cool.ne.jp/monjo/rensyu/mokuji.htm )
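
For anyone who wants to see the characters behind those code points, here is a
minimal Python sketch. It uses only the code points listed above; the names
KANA_SOURCES, label, and describe are my own, made up for illustration.

```python
# Source hanzi for two kana, using only the code points given in this message.
# KANA_SOURCES, label, and describe are hypothetical names for illustration.
KANA_SOURCES = {
    0x3042: {
        "modern": 0x5B89,
        "older": [0x963F, 0x611B, 0x4E9C, 0x60AA],
    },
    0x306F: {
        "modern": 0x6CE2,
        "older": [0x8005, 0x76E4, 0x534A, 0x9817, 0x516B,
                  0x8449, 0x5A46, 0x82B3, 0x7FBD, 0x7834],
    },
}


def label(cp):
    """Format a code point as 'U+XXXX (character)'."""
    return f"U+{cp:04X} ({chr(cp)})"


def describe(kana_cp):
    """List the modern source hanzi and the older alternatives for one kana."""
    entry = KANA_SOURCES[kana_cp]
    older = ", ".join(label(cp) for cp in entry["older"])
    return f"{label(kana_cp)} <- {label(entry['modern'])}; older sources: {older}"


for kana in KANA_SOURCES:
    print(describe(kana))
```

Each printed line pairs one modern kana with the hanzi it derives from today
and the older alternatives mentioned above.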

http://homepage2.nifty.com/Gat_Tin/kanji/kana.htm shows some pictures of
hentaigana used in modern Japan. The paragraph at the beginning roughly
reads: "Kana that have not been encoded into modern, public character sets.
They were used in elementary textbooks until MORI Arinori of the Ministry
of Education started the 'system of screening school textbooks' in Meiji 19
[1886]. There was a problem in distinguishing them from cursive kanji, but
they differed from kanji in that dakuten and handakuten [voicing marks; like "
and a circle] could be used. (etc)"

Reading old text in hentaigana is not very much fun. Many of the characters
can easily be mistaken for others that look very similar. If anyone wants
some scans of text written this way, let me know; I have some books around
with pictures of the original texts. By the way, nearly all books that
reproduce the original texts of old documents print them in modern kana,
since very few people can actually read hentaigana (or maybe because they
can't be typed).

I'm rather new to Unicode. What is the rationale for excluding hentaigana
(at least, I didn't notice them)? Granted, they are not in any of the major
Japanese encoding systems that I know of (except TRON, plane 6 or 12; I don't
remember offhand), but classical texts used these characters for nearly a
thousand years. Is the argument that U+3042 (あ) is the same character
whether it derives from U+963F, U+611B, U+4E9C, or U+60AA (阿, 愛, 亜, 悪)
and thus deserves the same code point? And thus it is up to the font-maker
to supply these different forms? On a single page, what corresponds to U+3042
may be written with forms deriving from U+963F, U+60AA, or others side by
side (often chosen by how the brush would flow and connect with the next
character). So, even if a font supplied one such old
form for U+3042, how are the others represented at the same time without a
different, unique code point? Use several dozen (if not more) different fonts? Now
that would be a nightmare... I understand that Unicode defines abstract
characters, not glyphs. But, I can hardly see how each hentaigana can be
viewed as the same abstract character corresponding to the modern
orthography. That would be like saying that, for example, U+6C49 (汉) and
U+6F22 (漢) are the same character, and likewise for every other original -->
simplified hanzi/kanji pair (which are, of course, given unique code points).
I must be missing the "myth" being referred to. Would someone explain it to me?

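To make the worry above concrete, here is a rough Python sketch. The
"manuscript" list is entirely hypothetical: I invented the pairing of each kana
with the hanzi its written shape derives from, just to show what a plain
one-code-point-per-kana encoding keeps and what it drops.

```python
# Hypothetical record of two kana on one page: (kana code point, source hanzi).
# The pairing is invented for illustration and is not taken from a real text.
manuscript = [(0x3042, 0x963F), (0x3042, 0x60AA)]

# A plain-text encoding keeps only the kana code point for each entry.
encoded = "".join(chr(kana) for kana, _source in manuscript)

# Two different written shapes on the page, but only one code point in the text.
distinct_shapes = len({source for _kana, source in manuscript})
distinct_code_points = len(set(encoded))
print(encoded, distinct_shapes, distinct_code_points)

# By contrast, the simplified and traditional hanzi mentioned above
# each have their own code point.
print(ord("汉") == 0x6C49, ord("漢") == 0x6F22, "汉" != "漢")
```

Under these assumptions, the information about which form was written survives
only if it is carried somewhere outside the plain text, which is the font
question asked above.
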
Ben Monroe



This archive was generated by hypermail 2.1.2 : Mon Apr 08 2002 - 12:20:07 EDT