From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Jan 09 2008 - 17:54:27 CST
Waleed Oransa wrote:
> It's very important that the Unicode standard encode
> the original direction in the Bidi text. (...)
> The missing of a standard way to encode the directionality
> of the text (...)
I tend to disagree with those statements. Unicode already offers the proper
encoding for allowing all this, using Bidi embedding controls. They are
enough for the intended purpose.
If you mean a way to encode something that should apply to the WHOLE text,
without limitation, then you'll limit the usability of the text, for example
in quotations with mixed scripts.
BiDi embedding controls solve the problem cleanly. But it's still up to the
authors to use them when and where needed. The only alternative to this
solution would be to use them before each character, and this would be a
serious problem requiring all existing texts to be re-encoded; the equivalent
"solution" would be to re-encode ALL the characters with mirroring or
dependent directionality. But the caveat would be that it would double the
encoding of all the existing texts, creating new forms of "equivalences"
that could have serious side effects if they were applied systematically.
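
To illustrate the single-embedding approach, here is a minimal Python sketch. The helper name embed() and the sample sentence are only assumptions for the example; the code points U+202A (LRE), U+202B (RLE) and U+202C (PDF) are the standard embedding controls.

```python
# Standard Bidi embedding controls.
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING


def embed(text, rtl):
    """Wrap text in one embedding (hypothetical helper, for illustration)."""
    return (RLE if rtl else LRE) + text + PDF


# A Hebrew quotation inside an otherwise left-to-right sentence.
quote = "\u05E9\u05DC\u05D5\u05DD"  # four Hebrew letters
sentence = "He said " + embed(quote + "!", rtl=True) + " and left."

# Only two control characters were added, whatever the length of the quotation.
print(len(sentence) - len("He said " + quote + "!" + " and left."))
```

The cost is two extra characters per embedded run, not one per character, and the rest of the sentence is left untouched.
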
I've not encountered any application where the simple addition of a single
embedding control was not enough to specify the correct ordering and
presentation of text, provided that the application had the minimum needed
to support the existing BiDi algorithm (i.e. it needs to accept the presence
of these controls, and not discard them or treat them as unknown characters
displayed with a "character missing" glyph). The incompatible applications
are in any case those designed only for basic Latin that were never
internationalized properly, or that did not use any of the many common i18n
libraries that have long been available and are integrated in almost all
development tools or runtime platforms. In most cases, even the simplest
applications can be recompiled without significant change, just by relinking
them with updated libraries so that they get support for BiDi embedding
controls.
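
As a rough sketch of what "accepting the presence of these controls" means in practice, an application can recognize them from the character database instead of treating them as unknown characters. This example uses Python's standard unicodedata module; the function name find_bidi_controls() is an assumption made for the illustration.

```python
import unicodedata

# Bidi classes of the explicit embedding and override controls and their terminator.
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF"}


def find_bidi_controls(text):
    """Return (index, character name) for each explicit Bidi control in text."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES
    ]


sample = "a\u202Bb\u202C"
for index, name in find_bidi_controls(sample):
    # These are format characters (general category "Cf"): they must be kept
    # in the text and passed to the Bidi algorithm, but never drawn as glyphs.
    print(index, name, unicodedata.category(sample[index]))
```
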
The main issue that is more complicated to handle in applications is the
layout of the GUI; however, this is not related directly to the encoding
of text, but to user preferences. The text displayed in GUI elements
should work properly even if they are not in a GUI with RTL layout: they
appear as paragraphs within the layout, but the paragraphs are shown
correctly, even if they are not right-aligned (right alignment of the margin
is often possible in the application, including for LTR scripts, as a
presentation style option, if there's no global setting that can define this
style by default for the whole GUI layout). But even in this case, this is
NOT a problem of text encoding, and it's completely out of scope of Unicode
conformance rules.