Re: [unicode] kJapaneseOn and kJapaneseKun Use What Romanization Standard?

From: Asmus Freytag (
Date: Mon Jan 25 2010 - 17:00:23 CST

    On 1/25/2010 2:10 PM, Christoph Päper wrote:
    > John H. Jenkins:
    >> The readings fields are intended to be primarily "what you would see
    >> if you looked this up in a dictionary," and secondarily "what you
    >> would type to input this character by itself."
    >> The fields for Mandarin, Cantonese, Korean, and Vietnamese, however,
    >> all use the transcription system preferred by native speakers.
    > Taking the close relationship of Unicode and ISO 10646 into account,
    > one would expect, naively perhaps, those transcription systems to be
    > selected from available ISO romanization standards.
    That's the kind of reasoning that gives ISO standards a bad name. If an
    ISO standard embodies a superior solution, it might make sense to use
    it, but that would have to be the case. Solely going by "brand
    preference" is probably not a good selection criterion.

    I am sure there are many other ISO standards for information
    presentation and data formats that the Unihan database could have
    followed, but didn't. In some sense, that's not ideal, because the
    format is rather ad-hoc. On the other hand, it freed the original
    authors/editors to focus on the contents and data collection.
    > In the case of Japanese this would mean 3602 (loose or strict
    > variant). This standard and similar ones, on the other hand, do not
    > employ 10646 either, but provide graphic character representations
    > only. (Admittedly, most of the ones I read have not been updated since
    > the turn of the century.)
    What one does expect is that the Unihan data do not see a wholesale
    replacement, such as replacing ASCII data with kana data. Even a
    wholesale correction, replacing the romanization scheme from one version
    of the database to the next, could present problems for anyone who has
    written scripts or programs to utilize the data.

    Adding a new kana-based set of readings, on the other hand, would not
    cause compatibility problems. That should be the route to pursue.
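
    To illustrate the compatibility point, here is a minimal sketch of the
    kind of script that consumes the data. It assumes the usual
    tab-separated layout (code point, field name, value); the sample lines
    are illustrative, and the field name kJapaneseOnKana is purely
    hypothetical, standing in for whatever a kana-based field might be
    called.

```python
from collections import defaultdict

# Illustrative sample in a tab-separated layout: code point, field, value.
# kJapaneseOnKana is a HYPOTHETICAL added field, not an existing property.
SAMPLE = """\
# comment lines start with '#'
U+4E00\tkJapaneseOn\tICHI ITSU
U+4E00\tkJapaneseKun\tHITOTSU HITOTABI HAJIME
U+4E00\tkJapaneseOnKana\tイチ イツ
"""


def load_readings(text, wanted):
    """Collect space-separated readings for the requested field names only."""
    readings = defaultdict(dict)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        code_point, field, value = line.split("\t", 2)
        if field in wanted:
            readings[code_point][field] = value.split()
    return dict(readings)


# An existing script that asks only for the romanized fields sees the same
# data as before; the added field is simply ignored.
old_view = load_readings(SAMPLE, {"kJapaneseOn", "kJapaneseKun"})
print(old_view["U+4E00"]["kJapaneseOn"])  # ['ICHI', 'ITSU']
```

    A script that selects fields by name, as above, is unaffected by an
    added field. If instead the contents of kJapaneseOn itself changed to
    kana or to a different romanization, anything that matches or compares
    those values would need to be revisited.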


    This archive was generated by hypermail 2.1.5 : Mon Jan 25 2010 - 17:02:16 CST