Re: locale-*independent* vi editor supporting UTF-8

From: odonnell@zk3.dec.com
Date: Tue Dec 08 1998 - 12:43:06 EST


Whoops; I meant to send this to the list as well as to Otto.

------- Forwarded Message

Return-Path: odonnell
Received: from quarry.zk3.dec.com by mailhub2.zk3.dec.com (5.65v4.0/1.1.10.5/24Sep96-0323PM)
        id AA17269; Tue, 8 Dec 1998 11:28:51 -0500
Received: from localhost by quarry.zk3.dec.com; (5.65v4.0/1.1.8.2/16Jan95-0946AM)
        id AA17098; Tue, 8 Dec 1998 11:28:44 -0500
Message-Id: <199812081628.AA17098@quarry.zk3.dec.com>
To: Otto Stolz <Otto.Stolz@uni-konstanz.de>
Cc: odonnell
Subject: Re: locale-*independent* vi editor supporting UTF-8
In-Reply-To: Your message of "Mon, 07 Dec 1998 02:04:57 PST."
             <9812071007.AA04219@unicode.org>
Date: Tue, 08 Dec 1998 11:28:43 -0500
From: odonnell
X-Mts: smtp

   ...
> Now, locales and encodings are two different things. POSIX.2,
> which defines the contents and syntax of locales, does not say
> anything about how characters are encoded, so it's perfectly fine
> to use UTF-8 as the encoding for any locale. And then for vi to
> operate under any UTF-8 locale.
   
   . . .
   Vi, as any other program, has to know about this encoding, in order to
   perform correctly; a classical, 8-bit based, Vi implementation would not
   even get the cursor position right, with UTF-8 encoded data. At the very
   least, Vi will have to take the UTF-8 mechanism into account when counting
   characters and calculating cursor movements. Hence, I cannot understand how
> It's easy to have a vi that processes UTF-8-encoded data.

Of course, many existing implementations need to be rewritten to
accommodate UTF-8. However, many vi implementations have already
been internationalized. They use locale information to
determine how to move the cursor, count characters, etc. For
an internationalized vi, it's "easy" to process UTF-8 because it's
just another code set. There's nothing in the vi spec, UTF-8, or
the locale model that prevents vi from being able to process UTF-8.
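
As a rough illustration of the point above (a sketch only, not code from
any actual vi), here is how cursor movement changes once the editor is
told which code set the data is in. The function names char_offsets and
next_cursor are hypothetical, the encoding is passed in by hand rather
than taken from a real locale, and the sketch assumes a stateless code
set such as UTF-8 or ISO 8859-1 and ignores display width.

```python
def char_offsets(raw: bytes, encoding: str) -> list[int]:
    """Return the byte offset at which each character of raw starts."""
    offsets = []
    pos = 0
    for ch in raw.decode(encoding):
        offsets.append(pos)
        # Re-encode the single character to learn how many bytes it used.
        # Assumption: a stateless code set (no BOM, no shift states).
        pos += len(ch.encode(encoding))
    return offsets


def next_cursor(raw: bytes, byte_pos: int, encoding: str) -> int:
    """Move the cursor one character to the right of byte_pos."""
    for off in char_offsets(raw, encoding):
        if off > byte_pos:
            return off
    return len(raw)  # already on the last character: go to end of line


# "naive" with a diaeresis on the i, encoded as UTF-8 (6 bytes, 5 characters)
line = b"na\xc3\xafve"

print(char_offsets(line, "utf-8"))    # character starts under UTF-8
print(next_cursor(line, 2, "utf-8"))  # steps over the whole 2-byte character
print(next_cursor(line, 2, "latin-1"))  # 8-bit view lands inside that character
```

The same bytes give different cursor positions depending on the code set
the editor believes it is handling, which is why the editor needs that
information from somewhere, for example from the locale.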

Remember, the original question I was answering was whether there
was a locale-*independent* version of vi that handled UTF-8. I was
trying to make the point that vi cannot be independent of locale.
It would be hard (probably impossible) to have a locale-independent
vi.
   
   In order to process data in various encodings, such as ISO 8859-1, UTF-8,
   and Unicode (UTF-16), a program has to know about the encoding of the
   actual data. Hence, I cannot understand how a program, such as Vi, could
   work with a locale that does not cover the encoding.

I agree that some programs have to know how the data they're
processing is encoded. (Many don't have to know; they can
simply call i18n-sensitive functions, and those functions will
"do the right thing" for them.) Locales are one way to provide
the encoding information because they include language, territory,
and encoding. I didn't say vi could work with a locale that
doesn't cover the encoding; I don't know why you thought I did.
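
To make the "language, territory, and encoding" point concrete, here is a
small sketch that splits a locale name into its parts. It assumes the
common language[_territory][.codeset][@modifier] naming convention;
actual locale names are implementation-defined, and split_locale_name and
LocaleParts are hypothetical names used only for this illustration.

```python
from typing import NamedTuple, Optional


class LocaleParts(NamedTuple):
    language: str
    territory: Optional[str]
    codeset: Optional[str]
    modifier: Optional[str]


def split_locale_name(name: str) -> LocaleParts:
    """Split a name such as 'de_DE.UTF-8' into its components.

    Assumes the language[_territory][.codeset][@modifier] convention.
    Missing components are returned as None.
    """
    name, _, modifier = name.partition("@")
    name, _, codeset = name.partition(".")
    language, _, territory = name.partition("_")
    return LocaleParts(language, territory or None,
                       codeset or None, modifier or None)


print(split_locale_name("de_DE.UTF-8"))
print(split_locale_name("C"))
```

A name like "de_DE.UTF-8" carries the encoding explicitly, while a name
like "C" does not, so in that case the encoding has to come from the
locale definition itself rather than from the name.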

-----------------------
Sandra Martin O'Donnell
odonnell@zk3.dec.com

------- End of Forwarded Message


