RE: Korean line breaking rules : Unicode 3.0 (p. 124)

From: Marco.Cimarosti@icl.com
Date: Fri Mar 24 2000 - 05:10:49 EST


Jungshik Shin wrote:
> In Latin text, for instance, lines may be broken at spaces and syllabic
> boundaries with hyphen. On the other hand, in traditional/classical
> Chinese text (where space is rarely used as a delimiter), lines break
> at any ideograph boundaries. Yet another example is provided by modern
> Korean text where lines may break at space as well as at syllabic
> boundaries, which is almost identical to rules for Latin text (except
> that syllabic boundaries can be rather easily determined in Korean
> text compared with Latin text). PLUS something about Thai text....

I would recommend changing "Latin text" to "text written in Latin script",
to avoid confusion between the Latin script and the Latin language.

Even better, I'd change it to "text written in Western scripts", because the
same line-breaking principles also apply to Cyrillic and Greek.

Moreover, I would be careful with that reference to "syllabic boundaries":
it does not hold for many languages, where the meaning and/or structure of
words also contributes to the hyphenation rules.

        Example 1: English. The two syllables in "parking" are "par/king",
from a phonetic point of view. However, I believe that this word should be
hyphenated as "park-ing", or perhaps not hyphenated at all. Generally
speaking, English hyphenation is quite unpredictable (as is English spelling)
and should be implemented with a dictionary approach.
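
To make the "dictionary approach" concrete, here is a minimal sketch in
Python. The table HYPHENATION and the function break_points are hypothetical
names of mine, and the single entry is just the "parking" example above; a
real implementation would load a full hyphenation dictionary for the language.

```python
# Hypothetical hyphenation dictionary: word -> form with allowed hyphens.
# Only the "parking" entry comes from the example in this message.
HYPHENATION = {
    "parking": "park-ing",
}


def break_points(word):
    """Return the indices in `word` before which a hyphen may be inserted."""
    pattern = HYPHENATION.get(word.lower())
    if pattern is None:
        return []  # unknown word: safer not to hyphenate at all
    points = []
    index = 0
    for ch in pattern:
        if ch == "-":
            points.append(index)  # a break is allowed before word[index]
        else:
            index += 1
    return points


print(break_points("parking"))  # [4], i.e. "park" + "ing"
```

Unknown words simply yield no break points, which matches the idea that a
word "may not be hyphenated at all" when nothing reliable is known about it.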

        Example 2: Italian. The basic rule is to follow syllable boundaries,
but there is an exception with "s" followed by a consonant. E.g. the
syllables in my surname are "Ci/ma/ros/ti", but the hyphenation locations
are "Ci-ma-ro-sti".
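
As a rough sketch of that exception, assuming the phonetic syllables are
already known: the function below (hyphenation_chunks, a name I made up)
moves a syllable-final "s" to the following chunk when that chunk starts with
a consonant. Skipping a following "s" is my own assumption, so that a double
"s" is still split between the two letters; this is not a complete statement
of the Italian rules.

```python
VOWELS = set("aeiouàèéìòù")


def hyphenation_chunks(syllables):
    """Turn phonetic syllables into hyphenation chunks (Italian-style sketch).

    A final "s" is moved to the next chunk when that chunk begins with a
    consonant other than "s".
    """
    chunks = list(syllables)
    for i in range(len(chunks) - 1):
        current, following = chunks[i], chunks[i + 1]
        if (
            len(current) > 1
            and following
            and current[-1].lower() == "s"
            and following[0].lower() not in VOWELS
            and following[0].lower() != "s"
        ):
            chunks[i] = current[:-1]
            chunks[i + 1] = current[-1] + following
    return chunks


print("-".join(hyphenation_chunks(["Ci", "ma", "ros", "ti"])))  # Ci-ma-ro-sti
```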

        Example 3: German. Hyphenating a word may change its spelling, in
some cases. E.g., "Straße" and "Ecke" are hyphenated as "Stras-se" and
"Ek-ke", respectively.

So, I would change the phrase "may be broken at spaces and syllabic
boundaries with hyphen." into "may be broken at spaces, or in the body of
words with a hyphen. The locations inside a word where hyphens may be applied
often correspond to syllable boundaries, but precise hyphenation rules are
strictly language-dependent."
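
The proposed wording could be sketched as a line breaker that handles spaces
itself and delegates everything inside a word to a language-dependent
callback. This is only an illustration under my own assumptions: wrap,
split_word and split_english are hypothetical names, and the toy callback
knows a single word.

```python
def wrap(text, width, split_word):
    """Greedy line breaking at spaces, or inside words via `split_word`.

    `split_word(word, room)` is the language-dependent part: it returns
    (head, tail), where head ends with a hyphen and is at most `room`
    characters long, or None if the word cannot be broken to fit.
    """
    lines, current = [], ""
    for word in text.split():
        while True:
            candidate = word if not current else current + " " + word
            if len(candidate) <= width:
                current = candidate
                break
            # Space left on this line, after the separating blank if any.
            room = width - len(current) - (1 if current else 0)
            parts = split_word(word, room) if room > 1 else None
            if parts:
                head, tail = parts
                lines.append(head if not current else current + " " + head)
                current, word = "", tail  # continue with the rest of the word
            elif current:
                lines.append(current)  # break at the space instead
                current = ""
            else:
                lines.append(word)  # unbreakable and too long: let it overflow
                break
    if current:
        lines.append(current)
    return lines


def split_english(word, room):
    # Toy dictionary-based callback; only "parking" is known.
    table = {"parking": ("park-", "ing")}
    parts = table.get(word)
    if parts and len(parts[0]) <= room:
        return parts
    return None


print(wrap("no parking here", 8, split_english))  # ['no park-', 'ing here']
```

Keeping the callback separate is the point: the space-breaking logic is the
same for every Western script, while the in-word rules change per language.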

Ciao.
        Marco


