Re: Transliteration

From: Mark Davis (mark@macchiato.com)
Date: Wed Aug 01 2001 - 13:20:44 EDT


Yes, you could use backup in that way if you wanted. In that case, though,
it doesn't buy you much. Where it is more useful is the "kyo", "gyo", ...
case.

For those not familiar with Japanese, there are a large number of cases that
follow the same pattern: "kyo" maps to a full-size hiragana for "ki" (き)
followed by a small hiragana for "yo" (ょ). This can be done with a small
number of rules with the following pattern.

First, the ASCII punctuation mark "~" is used to mark forms that should
never occur in isolation. This is a general convention within the ICU rules
in any event.
  '~yu' > ゅ;
  '~ye' > ぇ;
  '~yo' > ょ;

Secondly, any syllables that use this pattern are broken into the first
hiragana, followed by letters which will form the small hiragana.

  by > び|'~y';
  ch > ち|'~y';
  dj > ぢ|'~y';
  gy > ぎ|'~y';
  j > じ|'~y';
  ky > き|'~y';
  my > み|'~y';
  ny > に|'~y';
  py > ぴ|'~y';
  ry > り|'~y';
  sh > し|'~y';

With these rules, "kyo" is first transformed into "き~yo", with the cursor
left at the "|" position, that is, just before the "~y". Since the "~y" is
then revisited together with the following "o", the '~yo' rule applies and
produces the desired final result "きょ". Thus a small number of rules
(3 + 11 = 14) provides for a large number of cases. If every combination
were written out as its own rule instead, it would require 3 × 11 = 33
rules.
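
For anyone who wants to see the revisiting step in action, here is a minimal
sketch in Python. It is not ICU's implementation: the names SMALL, LEAD, RULES
and transliterate are made up for illustration, and the engine only handles
the 14 rules quoted in this message. Each rule is a triple of the source
text, the output placed before the cursor, and the output placed after the
cursor (the part that gets revisited).

```python
# Hypothetical miniature rule engine, for illustration only (not ICU code).
# Small-kana rules: the "~" forms never occur in isolation.
SMALL = {"~yu": "ゅ", "~ye": "ぇ", "~yo": "ょ"}

# Leading-syllable rules: emit the first hiragana, then "~y" after the cursor.
LEAD = {"by": "び", "ch": "ち", "dj": "ぢ", "gy": "ぎ", "j": "じ",
        "ky": "き", "my": "み", "ny": "に", "py": "ぴ", "ry": "り", "sh": "し"}

# Each rule is (source, output before cursor, output after cursor).
RULES = [(src, kana, "") for src, kana in SMALL.items()]
RULES += [(src, kana, "~y") for src, kana in LEAD.items()]


def transliterate(text):
    """Apply the first matching rule at the cursor, left to right."""
    pos = 0
    while pos < len(text):
        for src, before, after in RULES:
            if text.startswith(src, pos):
                # Replace the matched source with before + after ...
                text = text[:pos] + before + after + text[pos + len(src):]
                # ... but advance the cursor only past `before`, so that
                # `after` ("~y") is revisited along with what follows it.
                pos += len(before)
                break
        else:
            pos += 1  # no rule matched here: leave the character as it is
    return text


print(transliterate("kyo"))   # kyo -> き~yo -> きょ
print(len(RULES))             # 3 + 11 = 14 rules
print(len(SMALL) * len(LEAD)) # 3 * 11 = 33 rules if fully spelled out
```

The one design point that matters is the line that advances the cursor by
len(before) rather than by the length of the whole replacement; that is what
plays the role of "|" in the rules above.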

Mark

—————

πάντων μέτρον ἄνθρωπος — Πρωταγόρας
[http://www.macchiato.com]

----- Original Message -----
From: "てんどうりゅうじ" <11@onna.com>
To: <unicode@unicode.org>
Sent: Wednesday, August 01, 2001 07:52
Subject: RE: Transliteration

> I just saw the slides. That cursor-backup looks very tricky.
>
> So for someone doing the kana-to-Hepburn, you might have this: (here, "o^"
> means o-with-circumflex)
>
> (bakayarou disclaimer: I make lots of errors)
>
>
> こ→k|お
> そ→s|お
> と→t|お
> .....
> おう→o^
> おお→o^
> お→o
>
> Is that it?
>
>
> じゅういっちゃん (Juuitchan)
> Well, I guess what you say is true,
> I could never be the right kind of girl for you,
> I could never be your woman
> - White Town
>
>
> --- Original Message ---
> From: Mark Davis <mark@macchiato.com>;
> To: Unicore <unicore@unicode.org>; Unicode <unicode@unicode.org>;
> Cc:
> Date: 01/08/01 12:52
> Subject: Transliteration
>
> >On http://www.macchiato.com/slides/transliteration_in_icu.ppt, I have
> >slides for my conference talk on transliteration. For those people having an
> >interest in transliteration, I would appreciate any feedback.
> >
> >Mark
> >
> >P.S. The slides are in PowerPoint. If someone is interested and can only
> >read an HTML version, I can generate it. BTW, Microsoft does not make it
> >obvious how to get the speaker notes -- you can only get them in full screen
> >mode: you have to right-click and choose "Full Screen", then right-click
> >again and choose "Speaker Notes".
> >
> >—————
> >πάντων μέτρον ἄνθρωπος — Πρωταγόρας
> >[http://www.macchiato.com]
> >
> >
>


