Re: hentaigana

From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Apr 08 2002 - 15:59:53 EDT


Lars Marius Garshol continued:

> This is helpful, but immediately raises a number of new questions.
> Earlier you said hentaigana were older forms of hiragana + katakana,
> but now it seems they developed at the same time. Or did this
> simplification simply lead to a set of characters that was later
> divided into hiragana, katakana, and the rest (loosely called
> hentaigana)?

You're still making category errors here.

The distinction to make is between standard kana (which includes
both hiragana and katakana), as established by the Meiji education
reform, and abnormal form kana (hentaigana), which are all the other
kana forms deprecated by the Meiji education reform.

There are hentai hiragana forms -- those shown on
http://okayama.cool.ne.jp/monjo/hentaigana.htm
are all hiragana forms, by the way. And there are also hentai
katakana forms -- piecewise extractions of kanji characters that
were used as katakana, but which were also deprecated and not among the
standard set of katakana used today.

Note that even in the standard collection of hiragana encoded
in Unicode today (and in JIS) there are a couple of old kana forms
(U+3090 WI and U+3091 WE) that were dropped in the post-WWII reforms,
because they represent sounds that are no longer distinct in modern
Japanese. (They are now pronounced just 'i' and 'e', respectively.)
So they have dropped from modern orthography in favor of the
hiragana for I and E, and could be considered another, later
instance of hentaigana, even though they are not generally named
as such.
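
As a rough illustration of that point, here is a small Python sketch
that looks up the two code points with the standard unicodedata module
and folds them to the modern I and E. The table OBSOLETE_TO_MODERN and
the modernize() helper are hypothetical names made up for this example;
nothing like them is defined by Unicode or JIS.

```python
import unicodedata

# Hypothetical folding table (an assumption for illustration only):
# obsolete hiragana -> the modern hiragana that replaced it.
OBSOLETE_TO_MODERN = {
    "\u3090": "\u3044",  # HIRAGANA LETTER WI -> HIRAGANA LETTER I
    "\u3091": "\u3048",  # HIRAGANA LETTER WE -> HIRAGANA LETTER E
}


def modernize(text: str) -> str:
    """Replace WI and WE with I and E; leave every other character alone."""
    return "".join(OBSOLETE_TO_MODERN.get(ch, ch) for ch in text)


# Print each pair with its code point and official character name.
for old, new in OBSOLETE_TO_MODERN.items():
    print(f"U+{ord(old):04X} {unicodedata.name(old)} -> "
          f"U+{ord(new):04X} {unicodedata.name(new)}")
```

A fold like this loses information, of course, which is exactly why
the old forms are still encoded as separate characters.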

>
> Are there different hentaigana with the same sound?

Yes.

> Would you describe
> the hentaigana as a syllabary, or is it too chaotic for that?

It is not chaotic -- it just has lots of separate symbols for
the same sound. Yes, it is a syllabary.

Go back to the online chart that Ben pointed us to, and I'll walk
you through it.

Just look at the top row in the chart. That has a yellow cell on
the left hand side with HIRAGANA A in it (U+3042 in Unicode, so
you can look and verify). Then there are three entries across,
with a hiragana form (larger, above), and a kanji form (smaller,
in parentheses, below). The kanji are the original kanji alternatives
used for representing this A sound -- they are man'yoogana, as I
indicated before -- Chinese characters used just to represent the
sound. In modern Chinese, the three Chinese characters are pronounced
an1, a1 ~ e1, and e4, from left to right. In modern
Japanese, they are pronounced an, a, aku, respectively. (To really
understand the details, of course, you'd have to go back to
Middle Chinese and Old Japanese pronunciations.) In any case,
the three Chinese characters got picked up and used purely
phonetically to represent the A sound in Japanese. Then, in
written form, all three got drastically simplified in cursive
styles. Those drastic simplifications became conventionalized,
eventually, as kana, *distinct* from kanji.
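
For the "look and verify" step, a few lines of Python are enough. The
sketch below checks the name of U+3042 and then models the top row of
the chart as data: one sound, one standard kana, several source kanji.
The ROW_A structure is hypothetical, and the three kanji in it are
inferred from the readings given above (an, a, aku) rather than copied
from the chart, so treat them as an assumption.

```python
import unicodedata

# Verify the yellow cell: U+3042 should be HIRAGANA LETTER A.
a = chr(0x3042)
print(f"U+{ord(a):04X}", unicodedata.name(a))

# Hypothetical model of the chart's top row. The kanji are an assumption
# inferred from the modern Japanese readings an / a / aku.
ROW_A = {
    "sound": "a",
    "standard_kana": a,
    "source_kanji": ["安", "阿", "悪"],
}

# One sound, several historical source characters.
for kanji in ROW_A["source_kanji"]:
    print(f"{ROW_A['sound']}: {kanji} (U+{ord(kanji):04X})")
```

Note that the cursive kana forms themselves do not appear in the data;
only their kanji sources do, since those are what can be written here.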

What the Meiji education reform did was look at listings like
this and pick the second column forms as *the* hiragana --
basically dispensing with all the other alternates as superfluous
and confusing.

By the way, the blue boxes in this chart are highlighted as
"frequent appearance" characters. The legend basically says you
can't read Heian period materials without knowing these forms,
both in their use as kana and as cursive forms of the kanji, so
"memorize them!"

This chart doesn't explain the similarly complex history for
how various representations for katakana syllables developed
from kanji pieces, and which ones of those got fixed during
the Meiji education reform. But there are charts of those, as well.

> * Lars Marius Garshol
> |
> | - are they just a different style of writing some other set of
> | characters (say hiragana), or characters in their own right?
>
> * Kenneth Whistler
> |
> | Again, that begs the question about encoding. It depends on how you
> | approach it.
>
> Well, let's put it this way, then: "in your personal opinion, would
> hentaigana be unified with the kana already in Unicode"?

Maybe. I don't know. Many of them are just cursive forms for
the Han characters -- and we certainly don't want to start down
the road of encoding separate characters for each cursive representation
of a Han character. That way lies madness.

But since the hentaigana are essentially a closed set of alternative
forms for kana, there are several options. They could simply be
enumerated as characters, if a usage case is made for them. Or they
could be treated as formal variants of the existing kana. Or they
could simply be treated as glyphs, to be handled by fonts.

>
> | As to whether hentaigana should be added to Unicode, and if so, how
> | many and in what relation to the existing kana -- I consider that an
> | open issue until someone actually goes to the effort to make a
> | proposal in sufficient detail to enable a technical discussion to
> | start.
>
> That's fair enough, but does this mean that it's really difficult to
> tell what the right solution would be, because the relationship
> between the characters is so close, or are you just being procedurally
> correct?

The case needs to be made. Not every historical form of every
character needs to be *encoded* as a character. The hentaigana
are an interesting edge case, since they saw usage not all that
long ago in Japan, but in a way that was prestandard for the
Japanese orthography. So yes, it is a technical issue -- not uncommon
for dealing with historical forms of writing systems -- and not just
a matter of me being procedurally correct.

--Ken


