> -----Original Message-----
> From: Marco.Cimarosti@icl.com [SMTP:Marco.Cimarosti@icl.com]
> Sent: Wednesday, November 24, 1999 12:37 PM
> To: Unicode List
> Subject: RE: Multilingual Documents [was: HTML forms and UTF-8]
[Hohberger, Clive P.] <snip>
> The fact is that "multilingual documents" have never been a problem, as
> far as all the involved languages share the same character set. The real
> problem is with *multi-script documents*, and I guess that this shrinks
> the ratio even more.
>
[Hohberger, Clive P.] <snip>
A classic "multi-scripting" problem arises with Japanese documents,
particularly technical papers. As I'm sure everyone knows, aside from
the Chinese characters (Kanji), Japanese writing also uses two phonetic
syllabaries: Hiragana for Japanese-language words and Katakana for
foreign words (words of non-Japanese origin). In addition, Latin and
Greek characters are often used in equations and for technical
terminology, and English phrases written in Latin characters (Romaji)
are often embedded in the middle of Japanese text.
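
To make the mix of scripts concrete, here is a minimal sketch in Python 3
that tallies the characters of a string by script. It assumes that the
first word of each character's Unicode name ("CJK", "HIRAGANA",
"KATAKANA", "LATIN", "GREEK") is a good enough label. That is only a
rough heuristic, not the real Unicode Script property, and the helper
name script_tally and the sample string are made up for illustration.

```python
import unicodedata
from collections import Counter


def script_tally(text):
    """Count characters per script label (hypothetical helper).

    The label is the first word of the character's Unicode name,
    which is a heuristic and not the official Script property.
    """
    tally = Counter()
    for ch in text:
        if ch.isspace():
            continue  # whitespace belongs to no particular script
        name = unicodedata.name(ch, "UNKNOWN")
        tally[name.split()[0]] += 1
    return tally


# Illustrative sample: Kanji, Hiragana, Katakana, Latin and Greek together.
sample = "漢字とカタカナ and π"
for label, count in sorted(script_tally(sample).items()):
    print(label, count)
```

Even a short technical sentence like the sample touches five scripts,
which is the situation any Japanese character set has to cope with.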

The Japanese Industrial Standard JIS X 0208 (last revised 1997) attempts
to accommodate this by including the basic Latin, Greek and Cyrillic
alphabets within its character set. This (and the Shift-JIS
pseudo-transformation) works fine as long as there are, for example, no
glyph variants such as accented Latin characters. But trying to embed a
French, Swedish, Czech or Vietnamese phrase is a nightmare in JIS X 0208
and Shift-JIS systems... when it can be done at all.
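
The limitation is easy to reproduce. The following is a minimal sketch,
assuming Python 3 and its built-in "shift_jis" codec as a stand-in for
the JIS X 0208 repertoire. The helper name unencodable and the sample
string are hypothetical, chosen only for illustration.

```python
def unencodable(text, codec):
    """Return the characters of text that codec cannot represent
    (hypothetical helper for illustration)."""
    missing = []
    for ch in text:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


# Kanji, basic Latin, Greek and Cyrillic, plus one accented Latin letter.
sample = "日本語 abc αβγ Где café"

print("shift_jis cannot encode:", unencodable(sample, "shift_jis"))
print("utf-8 cannot encode:", unencodable(sample, "utf-8"))
```

Encoding character by character is slow but keeps the sketch simple. It
reports exactly which characters fall outside the codec's repertoire
instead of stopping at the first failure.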

Basically it's like me trying to embed Kanji in an English document...
I usually do it by converting the text to a graphic and embedding the
graphic.

One of the challenges will be to take full advantage of Unicode's
multi-script capabilities in Japanese word processing systems.
Clive