From: Mark E. Shoulson (mark@kli.org)
Date: Fri May 14 2004 - 13:25:50 CDT
E. Keown wrote:
> Elaine Keown
> Tucson
>
>Dear Peter,
>
>
>
>>>>*plain text* standard is the bidirectional
>>>>algorithm, which sorts out how a (horizontal)
>>>>*line* of text is laid out when text of opposite
>>>>directions
>>>>
>>>>
>
>In the 'old' Unicode 3.0 there was a one-line note on
>doing boustrophedon near the bidi material.
>Boustrophedon is needed not 'just' in Archaic Greek,
>but also in some periods of Egyptian and in some early
>Semitic stuff.
>
>For a small percentage of early Semitics stuff, it
>would be convenient to be able to automatically
>reverse the direction in a database, so the retrieval
>algorithm could look at 'both directions.'
>
That shouldn't be a problem, not even an issue. Remember, no matter
which direction the text runs on the page, Unicode text is stored in
logical order, not visual order. So a huge text that happens to be
rendered boustrophedon is still stored as a sequence of characters in
reading order. So you don't need to "reverse" the direction of anything
when you're searching. If you're looking for "herman", the letters will
be in exactly that order no matter which line of the text they wound up on.
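
To make that concrete, here is a small Python sketch. The sample
sentence, the 14-character line width, and the helper name found() are
all made up for illustration; nothing here comes from Unicode itself.
It just contrasts searching the text as stored with searching a naive
transcription of what the eye sees on a boustrophedon page.

```python
# Text as stored: a single stream of characters in logical (reading) order.
stored = "the scribe herman wrote this"

# Hypothetical visual-order transcription of the same text, broken into
# 14-character lines with the second line running right to left.
visual_lines = ["the scribe her", "siht etorw nam"]


def found(word, text):
    """Plain substring search; no direction handling of any kind."""
    return word in text


# The word is contiguous in the stored text, even though on the "page"
# it straddles a line break and a change of direction.
print(found("herman", stored))

# The same search against the visual transcription misses it.
print(found("herman", "".join(visual_lines)))
```

The first search succeeds and the second fails, which is the whole
argument for keeping logical order in the database and leaving line
direction to the renderer.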
>Is there a larger 'boustrophedon' note in Unicode 4.0?
> Is there any interest in expanding the bidi algorithm
>to definitely cover all possible RTL - LTR
>boustropheda (plural?) ?
>
Boustrophedon is probably outside the scope of unmarked Unicode, which
is not as bad as it sounds. So far as a computer is concerned, text is
a stream of characters, in logical reading order. There's none of this
silly "lines" business or reversing of directions, even if some of the
characters are newline characters; layout doesn't mean anything in terms
of how the data is stored. It's only when the data is *rendered* on a
screen or on paper that the bidi algorithm takes over and dictates where
to put the various glyphs. The bidi algorithm is enough of a headache as
it stands, just trying to deal with RTL and LTR scripts and their
possible coexistence on a single line. Boustrophedon is far too complex
for it. Probably what you'd do is have some higher-level markup tag
saying "Begin boustrophedon here..." which your renderer would know to
interpret properly: as it breaks the text into lines, it reverses every
other one, and so on. You'd have stuff like "<boust></boust>" tags or
something equivalent. The same goes for all the possible variants
of boustrophedon, and whatever other exotic directions turn up.
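
A rough Python sketch of what such a renderer might do once it has hit
the (hypothetical) boustrophedon markup. The function name
render_boustrophedon() and the fixed-width line breaking are my own
assumptions, not part of any standard. It also only reverses the order
of the characters on alternate lines; it makes no attempt to mirror or
rotate individual glyphs.

```python
def render_boustrophedon(text, width):
    """Break logical-order text into fixed-width lines and reverse
    every other line for display.

    Assumptions: one character per display cell, and naive breaking at
    exactly `width` characters. Reversing code point by code point would
    misplace combining marks, so a real renderer would have to work on
    whole grapheme clusters instead.
    """
    # Line breaking happens on the stored text, in reading order.
    lines = [text[i:i + width] for i in range(0, len(text), width)]

    rendered = []
    for n, line in enumerate(lines):
        if n % 2 == 0:
            # Even-numbered lines run left to right, flush left.
            rendered.append(line.ljust(width))
        else:
            # Odd-numbered lines run right to left, flush right.
            rendered.append(line[::-1].rjust(width))
    return rendered


for row in render_boustrophedon("the scribe herman wrote this", 14):
    print(row)
```

The stored string is never touched; only the list of display rows is
reversed, so searching and sorting keep working on the original.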
>The discussion so far on the list doesn't appear to me
>to cover every possibility....my impression is that
>there are probably sub-varieties of boustrophedon and
>of the vertical material....sometimes individual
>characters get re-aligned, turned a certain number of
>degrees, and maybe sometimes they don't.
>
That's okay. Things like that are outside of plain Unicode's
capabilities. Other standards (XML stuff, whatever) need to be
developed to handle them.
~mark