From: Jon Babcock (jon@kanji.com)
Date: Sat Mar 08 2003 - 09:03:35 EST
Yung-Fong Tang wrote:
>
>
> Ram Viswanadha wrote:
>
>> There is also some information at
>> http://oss.software.ibm.com/icu/docs/papers/binary_ordered_compression_for_unicode.html#Test_Results
>>
>> Not sure if this is what you are looking for.
>
> Thanks. Not really. I am not looking into the ratio caused by encoding, but
> rather the ratio caused by the language itself. For example, in order to
> communicate the idea "I want to eat chicken for dinner tonight", French
> and German using the same encoding may use different numbers of characters
> to communicate the same "IDEA".
"Efficiency" here depends on the translation and varies
widely. (See the example below.) That's why the practical experience
of professional translators will probably provide the best
answer. I have already mentioned what is, in my experience, the
range for contemporary Japanese-English and Chinese-English.
These ratios are important to JE and CE translators because we
usually get paid by the English word. But it usually takes more
work to use fewer words. So, if we don't want to be penalized for
using concise English, we try to charge by the character count
in the Chinese or Japanese source text. To quote a rate to our
clients, we must calculate what the "efficiency ratio" -- to
coin a term here -- is for our translations in this particular
field.
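The arithmetic behind such a quote can be sketched roughly as follows. This is only an illustration: the function names and every figure in it (rate, word count, character count) are made-up placeholders, not numbers from my own work or from anyone's actual rate sheet.

```python
def words_per_character(english_words, source_characters):
    """English words produced per source character in past jobs."""
    if source_characters <= 0:
        raise ValueError("source_characters must be positive")
    return english_words / source_characters


def per_character_rate(rate_per_word, english_words, source_characters):
    """Convert a per-English-word rate into an equivalent per-source-character rate."""
    return rate_per_word * words_per_character(english_words, source_characters)


# Placeholder figures only (hypothetical, chosen to keep the arithmetic easy):
# a past job of 1000 source characters that came out as 450 English words,
# billed at 0.10 currency units per English word.
sample_rate_per_word = 0.10
sample_english_words = 450
sample_source_characters = 1000

quote = per_character_rate(
    sample_rate_per_word, sample_english_words, sample_source_characters
)
print(f"{quote:.4f} per source character")
```

In practice the ratio would be measured over many past jobs in the same field, since a single job can be far from typical.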
If you want to calculate this ratio yourself, I agree with your
idea of using Bible translations, although the number of proper
names may skew the results compared, for example, to technical
translations. But it would be a good place to start.
One example, from thousands, found on yesterday's honyaku ML:
イメージ合成写真です --> 'simulated photograph' or 'the
photograph shown is for illustration only', i.e., from 20 to 45
characters in English, the target language. Decide how many
bytes you're going to use to encode the Japanese and the English
strings here, and you'll get the "efficiency ratio" in this case.
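For anyone who wants to try this on the example above, here is a minimal sketch in Python. The function names are my own invention, and the three encodings are just common choices picked for illustration. The ratio is taken as target bytes divided by source bytes, which is one reasonable reading of the term, not a standard definition.

```python
import unicodedata


def encoded_size(text, encoding):
    """Number of bytes `text` occupies in `encoding`, after NFC normalization."""
    return len(unicodedata.normalize("NFC", text).encode(encoding))


def efficiency_ratio(source, target, source_encoding, target_encoding):
    """Target bytes divided by source bytes for one translation pair."""
    return encoded_size(target, target_encoding) / encoded_size(source, source_encoding)


source = "イメージ合成写真です"
targets = [
    "simulated photograph",
    "the photograph shown is for illustration only",
]

# Assumption: both strings are stored in the same encoding.
for target in targets:
    for encoding in ("utf-8", "utf-16-le", "shift_jis"):
        src_bytes = encoded_size(source, encoding)
        tgt_bytes = encoded_size(target, encoding)
        ratio = efficiency_ratio(source, target, encoding, encoding)
        print(f"{encoding:10} {src_bytes:3} -> {tgt_bytes:3} bytes  ratio {ratio:.2f}  {target!r}")
```

The same pair gives quite different ratios depending on the encoding, because the Japanese characters take three bytes each in UTF-8 but two in UTF-16 or Shift_JIS, while the ASCII letters take one byte in UTF-8 and Shift_JIS but two in UTF-16.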
Jon
-- Jon Babcock <jon@kanji.com>