RE: Unicode and right-to-left

From: Jonathan Rosenne (rosenne@qsm.co.il)
Date: Thu Jun 10 1999 - 07:01:29 EDT


I would have said that regarding bidi a VT100 emulator should emulate the
Hebrew and Arabic VT100. However, there were several different
implementations, all of which I consider obsolete.

I suggest that for Hebrew one should emulate the 8-bit Hebrew VT200. It came
before 6429. It is a "visual" implementation with several escape sequences.
Applications usually used a "form manager" to interface with the screen and
it handled all the necessary inversions.
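
As a rough illustration of the kind of inversion such a form manager
performed, here is a minimal Python sketch. It is not the actual VT200 form
manager code; the names is_rtl and logical_to_visual are made up for this
example. It assumes a left-to-right base direction and ignores digits,
punctuation and mirroring, which a real implementation has to deal with.

```python
import unicodedata


def is_rtl(ch):
    # "R" (e.g. Hebrew) and "AL" (Arabic letters) are the strong
    # right-to-left classes in the Unicode character database.
    return unicodedata.bidirectional(ch) in ("R", "AL")


def logical_to_visual(line):
    """Reverse each right-to-left run so that a terminal which only
    writes left to right displays it in reading order.

    Hypothetical helper: simplified, left-to-right base direction only.
    """
    out = []
    i = 0
    n = len(line)
    while i < n:
        if not is_rtl(line[i]):
            out.append(line[i])
            i += 1
            continue
        # Extend the run over RTL letters and over spaces that sit
        # between RTL letters; trailing spaces stay outside the run.
        j = i
        end = i
        while j < n and (is_rtl(line[j]) or line[j] == " "):
            if is_rtl(line[j]):
                end = j + 1
            j += 1
        out.append(line[i:end][::-1])
        i = end
    return "".join(out)


if __name__ == "__main__":
    # Logical order in, visual order out.
    print(logical_to_visual("abc \u05d0\u05d1\u05d2 def"))
```

Programs such as cat and ls would of course not call anything like this,
which is the weakness of doing the inversion in the application.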

I understand that the Arabic VT was "logical", so it would be different.

I would not recommend implementing 6429, since the Unicode implicit
algorithm is now prevalent. 6429 does mention implicit algorithms as an
option, but does not specify them. However, some Unix vendors had
implemented the 6429 explicit controls.
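
To make "implicit" concrete: the Unicode algorithm derives direction from a
class assigned to each character rather than from escape sequences in the
stream. A small Python sketch using the standard unicodedata module follows;
the helper name bidi_classes is an assumption for illustration, and it only
looks up the classes, it does not run the reordering algorithm itself.

```python
import unicodedata


def bidi_classes(text):
    """Return the Unicode bidirectional class of each character.

    Hypothetical helper: a lookup only, not the reordering algorithm.
    """
    return [unicodedata.bidirectional(ch) for ch in text]


if __name__ == "__main__":
    # Latin letter, Hebrew alef, European digit, space.
    print(bidi_classes("a\u05d01 "))  # ['L', 'R', 'EN', 'WS']
```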

Jony

> -----Original Message-----
> From: Markus Kuhn [mailto:Markus.Kuhn@cl.cam.ac.uk]
> Sent: Thursday, June 10, 1999 10:53 AM
> To: Unicode List
> Subject: Re: unicode and right-to-left
>
>
> Matan Ninio wrote on 1999-06-09 23:27 UTC:
> > With all the impressive work that seems to go on in Unicode and
> > multilingual support, I haven't heard anyone mention right-to-left
> > (bidi) support. If I remember the standard right, for full (or even
> > level 1) ISO 10646 support, bi-directional text should be supported.
>
> I don't think the ISO 10646-1 standard talks a lot about bi-directional
> support. In contrast to Unicode, ISO 10646-1 is only a character table
> without much added semantics.
>
> > I use Hebrew, and as such this aspect of Unicode is very important to
> > me (and many other people around me). Is there any work done on this
> > subject? Can I help this effort by testing or even writing code?
>
> As far as Unicode for X11 and Linux is concerned:
>
> We now have UTF-8 support in xterm, the VT100 emulator that is most
> commonly used to talk to non-GUI applications. Xterm treats ISO 10646-1
> characters with exactly the same simple VT100 semantics as it treats ISO
> 8859-1 characters. The only new thing that UTF-8 support brought so far
> is that you can now have thousands of characters as opposed to only 194
> with ISO 8859-X. This is what you need to use Latin, Greek, Cyrillic,
> Georgian, Armenian, Mathematics, IPA, etc. simultaneously in emails,
> software source code, etc. Xterm currently does not do any bi-di
> support, nor does it handle combining characters or other rendering of
> presentation forms, and it can currently only support mono-spaced fonts.
> This limits the usefulness of the new UTF-8 support to probably a bit
> more than half of the scripts supported by Unicode.
>
> I have not yet been in contact with anyone who had a very clear and
> sufficiently simple vision of what right-to-left support for xterm should
> look like. There are at least two competing standards. Both ISO 6429 and
> Unicode specify their own right-to-left mechanisms, and it is unclear to
> me which of these two standards is more appropriate in the context of an
> ISO 6429 subset compliant VT100 terminal emulator such as xterm. It is
> also not clear to me how the Unicode bidi algorithm, which I understand
> works on rendering a stream of characters during output, applies to a
> VT100 terminal with its numerous full-screen editing and cursor control
> capabilities. There is also the alternative option that the bidi support
> is done in the application software (which has to internally reverse
> Hebrew strings) and output on xterms is then done only in classical
> left-to-right mode. The advantage of this approach is that the author of
> the editor has more detailed control over every aspect of right-to-left
> support, the disadvantage is that right-to-left support will not be as
> pervasive as if it were done in the terminal emulator, because simple
> programs such as cat and ls are unlikely to be extended with bidi
> functionality.
>
> Only a very small fraction of Unix developers (namely those in Israel
> and Arabic countries) has a personal need for right-to-left support.
> Linux support is not market-driven, but itch-driven, that is whenever
> some developer is unhappy with something and feels the itch, then it
> will get fixed rather quickly.
>
> The best approach to get right-to-left support into Linux and X11 is to
> first of all educate a broad developer community about the vision of what
> this should look like in detail. For instance, should xterm be modified
> or will editors and other applications have to reverse the strings?
> Should ISO 6429 or Unicode bidi be used? How would this work in detail?
> The Unicode bidi standard was not written with VT100 terminal editing
> semantics in mind, and the bidi parts of ISO 6429 are not the most
> readable specification on the planet. Is either the Unicode or the ISO
> 6429 bidi semantics implemented in some existing VT100 terminal variant
> that is very popular among Unix users in Israel and that we simply
> should emulate? A well written paper or web page about these design
> issues would be a big step forward. And then you will have to contact
> numerous people and convince them that bidi support is an important and
> good thing. You will probably also have to provide significant patches
> for code yourself, because non-commercial users who are not users of
> Hebrew or Arabic themselves might be sympathetic to your cause, but
> probably will not feel enough of an itch to implement it all alone.
>
> ISO 6429 = ECMA-48 is freely available from
>
> http://www.ecma.ch/
> ftp://ftp.ecma.ch/ecma-st/e048-pdf.pdf
>
> The Unicode bidi algorithm is on
>
> http://www.unicode.org/unicode/reports/tr9/
>
> I don't know any literature on existing practice with bidi under Unix
> and VT100.
>
> Markus
>
> --
> Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
> Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
>
