Re: Unicode-based Cyrillic-Latin transliteration table

From: DougEwell2@cs.com
Date: Wed May 30 2001 - 11:54:08 EDT


In a message dated 2001-05-30 2:27:24 Pacific Daylight Time, keld@dkuug.dk
writes:

>> After reading these definitions, I see that I was mistaken in my
>> terminology. Since I am looking for conversions that are geared toward
>> English speakers, and explicitly favor solutions like "ch" for U+0427
>> instead of "tj" or "tsch," the proper name for what I am trying to do
>> is "transcription."
>
> I was on record advocating the transliteration term. Just for information,
> where do you see the differences here? Why is it transcription?

I am looking at "Scope and Purpose of ISO/TC46/SC2" located at
http://www.elot.gr/tc46sc2/purpose.html, the page to which Peter Constable
referred me. That page has the following definitions:

"The two basic methods of conversion of a system of writing are
transliteration and transcription.

"Transliteration is the process which consists of representing the characters
of an alphabetical or syllabic system of writing by the characters of a
conversion alphabet, this being the easiest way to ensure the complete and
unambiguous reversibility of the conversion alphabet in the converted system.

"Transcription is the process whereby the sounds of a given language are
noted by the system of signs of a conversion language. A transcription
system is of necessity based on the orthographical conventions of the
conversion language. Transcription is not strictly reversible."

In my original message I stated that one of my goals was to convert Cyrillic
text to "a 7-bit ASCII representation that an English speaker can reasonably
sound out." The spellings "ch," "tj," and "tsch" represent roughly the same
sound to speakers of English, Dutch, and German respectively. In choosing
"ch" in a case like this, I am explicitly choosing the orthographical
conventions of the English language.

Likewise, I stated that "round-tripping is not a goal; U+0428 + U+0427 and
U+0429 would both be expected to map to 'SHCH'." This fails the TC46/SC2
definition of transliteration, since that definition makes reversibility the
primary goal.
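
To make the reversibility failure concrete, here is a minimal Python sketch.
The table TRANSCRIBE and the function transcribe are hypothetical names
invented for this illustration, and the table covers only the three letters
mentioned above, not a complete scheme.

```python
# Hypothetical, partial English-oriented transcription table (uppercase only).
# Only the three letters discussed in the message are included.
TRANSCRIBE = {
    "\u0427": "CH",    # CYRILLIC CAPITAL LETTER CHE
    "\u0428": "SH",    # CYRILLIC CAPITAL LETTER SHA
    "\u0429": "SHCH",  # CYRILLIC CAPITAL LETTER SHCHA
}


def transcribe(text: str) -> str:
    """Replace each mapped Cyrillic letter; pass anything else through unchanged."""
    return "".join(TRANSCRIBE.get(ch, ch) for ch in text)


two_letters = transcribe("\u0428\u0427")  # SHA followed by CHE
one_letter = transcribe("\u0429")         # SHCHA alone

# Two different inputs produce the same output, so no inverse function
# can tell which Cyrillic spelling was the original.
assert two_letters == one_letter == "SHCH"
```

Because two distinct inputs collapse to one output, the mapping is not
injective, which is exactly what rules it out as a transliteration in the
TC46/SC2 sense.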

Finally, I wrote, "What I do want is something that generates a usable
pronunciation without using digits or letters like Q for no purpose other
than uniqueness." In
writing this, I had in mind an early computer-generated Russian-English
transliteration in a book I have somewhere, which used digits to represent
certain Russian sounds. I believe "Khrushchëv" came out something like
"Xru31v," with the X representing "kh" (fair enough) and 3 representing
"shch" and 1 representing "yo" ("ë"). These last two cases are what I am
trying to avoid. But a truly reversible transform requires arcane
substitutions such as this, or other tricks like combining overties (used in
the ALA scheme).
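
By contrast, a one-character-per-letter scheme of the kind recalled above
does round-trip. The sketch below is only a guess at such a scheme: the six
table entries are reconstructed from the single remembered example "Xru31v"
and are assumptions, as are the names REVERSIBLE, INVERSE, encode, and
decode.

```python
# Hypothetical one-to-one table reconstructed from the example "Xru31v".
# It is not taken from any published scheme.
REVERSIBLE = {
    "\u0425": "X",  # CYRILLIC CAPITAL LETTER HA    ("kh")
    "\u0440": "r",  # CYRILLIC SMALL LETTER ER
    "\u0443": "u",  # CYRILLIC SMALL LETTER U
    "\u0449": "3",  # CYRILLIC SMALL LETTER SHCHA   ("shch")
    "\u0451": "1",  # CYRILLIC SMALL LETTER IO      ("yo")
    "\u0432": "v",  # CYRILLIC SMALL LETTER VE
}

# Every Cyrillic letter maps to a distinct single character,
# so the table can simply be flipped to obtain the inverse.
INVERSE = {latin: cyr for cyr, latin in REVERSIBLE.items()}


def encode(text: str) -> str:
    """Cyrillic to Latin; raises KeyError on letters outside the toy table."""
    return "".join(REVERSIBLE[ch] for ch in text)


def decode(text: str) -> str:
    """Latin back to Cyrillic, using the flipped table."""
    return "".join(INVERSE[ch] for ch in text)


name = "\u0425\u0440\u0443\u0449\u0451\u0432"  # the surname in Cyrillic
assert encode(name) == "Xru31v"
assert decode(encode(name)) == name
```

The round trip works only because each output symbol is unique, which is why
such schemes end up spending digits or otherwise unused letters on sounds
that English spells with several letters.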

Note that I drew up my requirements for the project before I had any
awareness of the distinction made by TC46/SC2 between "transliteration" and
"transcription." Having seen the two terms defined, I concluded that my true
requirement is for transcription.

These definitions are the work of TC46/SC2 and I respect them. However, it
should be noted that numerous examples of the same loose use of the term
"transliteration" that I made earlier can be found in abridged dictionaries
and on the Web.

-Doug Ewell
 Fullerton, California



This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:18:17 EDT