Re: vim and Arabic/Farsi support

From: Ed Trager (ed.trager@gmail.com)
Date: Mon Dec 03 2007 - 12:52:02 CST

    Hi, Jeroen,

    When I last checked (which was probably 2 years ago), FreeBSD and
    related BSD operating systems still lacked an internationalization
    framework. The Citrus project seeks to address this serious
    shortcoming in these OSes:

        http://citrus.bsdclub.org/

    I don't know how close the Citrus framework is to being usable, nor
    whether anyone else is pursuing a similar effort.

    However, I do know that when I last checked (again, probably 2 years
    ago), much of vim's internationalized feature set did not work
    correctly in the standard 'C' locale of a BSD environment. My
    particular setup at that time was an OpenBSD machine, mlterm, and vim.

    Therefore my first suspicion would not be PuTTY but rather vim itself
    in the 'C' locale of FreeBSD.
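
    A rough way to see which locale a program such as vim would inherit
    on the FreeBSD side is to resolve the usual POSIX precedence of the
    environment variables (LC_ALL, then LC_CTYPE, then LANG). The sketch
    below is only an illustration: the function names are made up here,
    it reads variables only, and it does not verify that the named locale
    is actually installed or honored by the C library.

```python
import os

def effective_ctype_locale(env):
    """Resolve the LC_CTYPE locale name by POSIX precedence.

    LC_ALL wins over LC_CTYPE, which wins over LANG; unset or empty
    values are skipped, and "C" is the fallback.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return value
    return "C"

def looks_utf8(locale_name):
    """Heuristic: does the codeset part of the locale name say UTF-8?"""
    codeset = locale_name.partition(".")[2].partition("@")[0]
    return codeset.lower().replace("-", "") == "utf8"

# Report what the current shell environment would hand to a child process.
name = effective_ctype_locale(os.environ)
print(name, "(UTF-8)" if looks_utf8(name) else "(not UTF-8)")
```

    Running this in the same login shell from which vim is started shows
    whether a stray LC_ALL or LC_CTYPE is overriding LANG=en_US.UTF-8.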

    My solution on OpenBSD a few years ago was to use Yudit, which of
    course requires the X Window System. Yudit is not affected by the LANG
    or LC_* environment variables and works just fine on OpenBSD, FreeBSD,
    etc. Of course, Yudit is a very different kind of editor from vim ...

    In your case, the fact that you are using PuTTY on Windows may make it
    harder to diagnose the origin of the problem. If you have physical
    access to the FreeBSD machine or, failing that, if you could export
    the X11 display of the FreeBSD machine and access it from a Linux
    machine (with LANG=en_US.UTF-8 set), then you could experiment with
    vim and Yudit or other editors (like mined,
    http://towo.net/mined/mined.html) on FreeBSD.
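
    While experimenting, it may help to check exactly which code points
    arrive after each copy and paste. Below is a minimal Python sketch
    for that; it is not part of vim or PuTTY, and the helper names
    `describe` and `has_presentation_forms` are invented for this
    example. It lists the code points of the two strings from your
    message, flags characters from the Arabic Presentation Forms blocks,
    and folds the shaped string back with NFKC compatibility
    normalization.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
            for ch in text]

def has_presentation_forms(text):
    """True if any character is an Arabic presentation form.

    Covers Arabic Presentation Forms-A (U+FB50..U+FDFF) and the letter
    range of Forms-B (U+FE70..U+FEFC); U+FEFF (BOM) is left out on purpose.
    """
    return any(0xFB50 <= ord(ch) <= 0xFDFF or 0xFE70 <= ord(ch) <= 0xFEFC
               for ch in text)

# The word as typed, and the word as it came back through vim + PuTTY.
original = "\u062a\u063a\u06cc\u06cc\u0631"
pasted = "\ufe96\ufecf\u06cc\u06cc\ufead"

for label, s in (("original", original), ("pasted", pasted)):
    print(label, "contains presentation forms:", has_presentation_forms(s))
    for cp, name in describe(s):
        print("   ", cp, name)

# NFKC maps each presentation form to its base letter.
folded = unicodedata.normalize("NFKC", pasted)
print("NFKC(pasted) == original:", folded == original)
```

    For these two strings the last line reports True, which suggests the
    letters were not corrupted at random: something along the way replaced
    the base letters with their contextually shaped presentation forms.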

    -- Ed

    On Dec 3, 2007 11:13 AM, Jeroen Ruigrok van der Werven
    <asmodai@in-nomine.org> wrote:
    > Hi Ed,
    >
    > [Please let me know if people deem this outside of Unicode's mailing list
    > scope.]
    >
    > -On [20071203 16:44], Ed Trager (ed.trager@gmail.com) wrote:
    > > (1) Are you using vim with an RTL-capable terminal like mlterm?:
    > >
    > > http://mlterm.sourceforge.net/
    > >
    > > (2) Are you using vim in a UTF-8 locale?
    > >
    > > ~> LANG=en_US.UTF-8 mlterm &
    >
    > Actually I used it through PuTTY (using full Unicode translation) on a FreeBSD
    > machine using en_US.UTF-8 as well as gvim on Windows XP.
    > So in essence: FreeBSD + vim <> PuTTY on Windows XP
    >
    > The word in question is تغییر and when I copy it in Notepad or view the edited
    > file (in vim) with Firefox it shows up as it should, that is:
    >
    > U+062A ARABIC LETTER TEH
    > U+063A ARABIC LETTER GHAIN
    > U+06CC ARABIC LETTER FARSI YEH
    > U+06CC ARABIC LETTER FARSI YEH
    > U+0631 ARABIC LETTER REH
    >
    > Now, in vim, when I copy it through PuTTY and paste it in Notepad I get: ﺖﻏییﺭ
    > That is:
    >
    > U+FE96 ARABIC LETTER TEH FINAL FORM
    > U+FECF ARABIC LETTER GHAIN INITIAL FORM
    > U+06CC ARABIC LETTER FARSI YEH
    > U+06CC ARABIC LETTER FARSI YEH
    > U+FEAD ARABIC LETTER REH ISOLATED FORM
    >
    > I just noticed that when I copy from Firefox into gvim on Windows XP and then
    > copy from gvim to paste into Notepad it does the right thing.
    > So I guess that it might be PuTTY that's messing up something even though its
    > translation is set to UTF-8.
    >
    > I just wonder why these letters would get so strangely corrupted along the
    > way.
    >
    > --
    > Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
    > イェルーン ラウフロック ヴァン デル ウェルヴェン
    > http://www.in-nomine.org/ | http://www.rangaku.org/
    > Audi partem alteram...
    >


