From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri May 14 2004 - 07:25:14 CDT
From: "Andrew C. West" <andrewcwest@alumni.princeton.edu>
(...)
> As has been stated time and time again, mixing vertical and horizontal textual
> orientation in the same document is beyond the scope of a plain text standard,
> and rendering mixed horizontal/vertical text is certainly beyond the ability of
> any plain text editor that I know of. Markup is the appropriate way to deal with
> mixed horizontal/vertical/diagonal/circular/spiral text (Artemis Fowl has a
> constructed "script" with spiral textual orientation), not dozens of new
> directional control characters.
What about boustrophedon? Isn't there also some vertical boustrophedon layout,
i.e. TTB and BTT alternated on each vertical row?
My opinion is that, whatever the directionality used, it does not matter. Bidi
character properties are only useful when handling local changes of
directionality within the same document, so that they require reordering when
rendering mixed scripts, before the final main directionality is applied (this
final main directionality could use horizontal/vertical rows or whatever, and
the direction of rows can be constant or alternated; it does not matter and this
is out of scope of Unicode encoding).
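To illustrate that the row direction is a pure layout decision, here is a
minimal sketch in Python. The function name layout_rows and its parameters are
hypothetical, and the sketch ignores glyph mirroring and any bidi reordering
within a row. It only shows that the same encoded string can be laid out in
unidirectional or boustrophedon rows without touching the encoding.

```python
def layout_rows(text, width, boustrophedon=False):
    """Split text (in logical order) into rows of at most `width` characters.

    With boustrophedon=True, every second row is reversed and right-aligned,
    so reading continues from the edge where the previous row ended.
    """
    rows = [text[i:i + width] for i in range(0, len(text), width)]
    if boustrophedon:
        rows = [
            row[::-1].rjust(width) if n % 2 else row
            for n, row in enumerate(rows)
        ]
    return rows


logical = "ABCDEFGHIJK"  # the encoded text is the same in both layouts
for row in layout_rows(logical, 3):
    print(row)
print()
for row in layout_rows(logical, 3, boustrophedon=True):
    print(row)
```

The input string is never modified; only the presentation of the rows changes.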
LRO/LRM/BDE controls and so on are to be used to override the main direction of
characters belonging to the same script, when they are used in contexts where
the main direction must be escaped. BiDi character properties are there to avoid
using these controls when they are not necessary. If something is not specified
in BiDi properties, then the characters will be laid out according to the (out
of Unicode scope) document directionality.
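As a small illustration of these per-character properties, the sketch below
uses Python's standard unicodedata module to print the Bidi_Class of a few
characters, including some explicit controls. The helper name bidi_classes and
the sample string are only assumptions chosen for the example.

```python
import unicodedata


def bidi_classes(text):
    """Return (code point, Bidi_Class) pairs for each character of text."""
    return [(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch)) for ch in text]


# Latin A, Hebrew ALEF, Arabic ALEF, SPACE, DIGIT ONE,
# then the controls LRO (U+202D), PDF (U+202C) and LRM (U+200E).
sample = "A\u05D0\u0627 1\u202D\u202C\u200E"
for code_point, bidi_class in bidi_classes(sample):
    print(code_point, bidi_class)
```

Ordinary letters carry a strong class of their own, which is why no control is
needed as long as each script runs in its usual direction.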
Maybe this should be clarified in the Unicode spec, so that these controls and
properties are defined in terms of "character direction" (the second "row
direction" will not be encoded, allowing boustrophedon or unidirectional
layouts), instead of just "left" and "right".
The well-known exception to this directionality model is Hangul, whose clusters
adopt a local horizontal/vertical arrangement for rendering their composite
jamos in the same syllable. If leading and trailing consonants had not been encoded
separately, one would need to encode a special punctuation to mark syllable
boundaries. (This punctuation would not necessarily have a visible glyph, it
could be a thin space or an arrangement of the text layout, in the traditional
Hangul squares).
In Hangul, syllable breaks are marked by the layout, but not word breaks; in
Latin/Greek/Cyrillic/Hebrew this is the reverse, and I consider SPACE as a
punctuation; either word breaks or syllable breaks are needed to make the text
readable, i.e. less ambiguous, to reflect the speech and common semantics of
words where these breaks are often heard and needed too.
If this Hangul layout were better understood, and implemented as a layout
feature, one could easily see that Hangul is extremely simple and regular, and
has very few letters. (For example, SSANGSIOS is currently encoded distinctly
from SIOS,SIOS, although the two are semantically identical and should be
rendered identically, unless one is a trailing consonant and the other a
leading one, in which case their separation is marked either in the layout by a
cluster boundary, or by an explicit punctuation which could as well be a thin
space character or a small dot mark.)
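The following Python sketch shows the current situation. It decomposes a
precomposed syllable into its conjoining jamos with the arithmetic given in the
Unicode Standard, and then checks that SSANGSIOS (U+110A) and the sequence
SIOS,SIOS (U+1109 U+1109) stay distinct under normalization. The function name
decompose is hypothetical, and the function assumes its argument is a
precomposed syllable in the range U+AC00..U+D7A3.

```python
import unicodedata

# Constants of the Hangul syllable block arithmetic.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28


def decompose(syllable):
    """Split one precomposed Hangul syllable into its conjoining jamos."""
    index = ord(syllable) - S_BASE
    lead = L_BASE + index // (V_COUNT * T_COUNT)
    vowel = V_BASE + (index % (V_COUNT * T_COUNT)) // T_COUNT
    trail = T_BASE + index % T_COUNT
    jamos = [chr(lead), chr(vowel)]
    if trail != T_BASE:  # T_BASE itself means "no trailing consonant"
        jamos.append(chr(trail))
    return jamos


for jamo in decompose("\uC300"):  # a syllable whose leading consonant is SSANGSIOS
    print(f"U+{ord(jamo):04X}", unicodedata.name(jamo))

# The double letter and the sequence of two single letters are not equivalent.
ssangsios = "\u110A"
sios_sios = "\u1109\u1109"
print(unicodedata.normalize("NFD", ssangsios) == sios_sios)
print(unicodedata.normalize("NFKD", ssangsios) == sios_sios)
```

Both comparisons come out false, which is the distinction in the encoding
described above: no normalization form unifies the two spellings.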
The current encoding of Hangul ignores this feature, and makes handling Hangul
unnecessarily complicated, when it could all be handled as a strict encoding of
a "horizontal" row of text, with a special layout to compose squares.
Square-layout does not seem mandatory in Hangul, and Koreans can also read text
rendered with uniform halfwidth and unidirectional jamos, making it a true
alphabet. Vertical presentation is also common for this, and readers who can
already read text horizontally or vertically would have little trouble reading
a boustrophedon layout, or fancier layouts such as spiral or circular ones,
provided that glyph orientation is kept recognizable.
I see the square layout only as the preferred layout for Koreans, as it fits well
with Han characters and with its long strong tradition for presentation. Han
ideographs also have a square layout of strokes. But they are a bit more complex
because they use many featured ligatures, so that strokes take some contextual
shapes depending on surrounding strokes and the number of strokes in the square
(these make the ideographs more readable, by a more uniform distribution of
blackness and stroke widths within the square, or by enhancing the symmetries and
parallelisms). By contrast, the limited set of letters (strokes) in Hangul
and the absence of overlays make the rendering task much easier within squares.