Re: locale-independent vi editor supporting UTF-8

From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Mon Dec 07 1998 - 05:00:56 EST

Next message: Markus Kuhn: "Re: Unicode repertoire of X11 fonts"
Previous message: Erik van der Poel: "Re: Unicode repertoire of X11 fonts"
Maybe in reply to: Jungshik Shin: "locale-*independent* vi editor supporting UTF-8"
Next in thread: John Fieber: "Re: locale-*independent* vi editor supporting UTF-8"

On 1998-12-03 at 7:46, odonnell@zk3.dec.com wrote:
> According to the XPG4 vi man page, the current
> locale controls many aspects of vi's behavior, including the
> way strings are parsed into characters,
...
> Now, locales and encodings are two different things. POSIX.2,
> which defines the contents and syntax of locales, does not say
> anything about how characters are encoded, so it's perfectly fine
> to use UTF-8 as the encoding for any locale. And then for vi to
> operate under any UTF-8 locale.

UTF-8 (cf. <http://czyborra.com/utf/#UTF-8>) uses 1 through 3 bytes per BMP
character (1 through 4 bytes per Unicode character). In order to "parse
strings into characters", the processing program must undo the UTF-8
encoding. A program based on the 1-byte-amounts-to-one-character model
will not be able to sensibly handle UTF-8 encoded data.
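To illustrate the difference between bytes and characters, here is a minimal sketch in Python. It assumes well-formed UTF-8 input; the function name utf8_char_count and the sample string are made up for this illustration and are not taken from any Vi implementation. It counts characters by skipping continuation bytes (bit pattern 10xxxxxx), which is the least a byte-oriented program would have to do:

```python
def utf8_char_count(data: bytes) -> int:
    """Count characters in well-formed UTF-8 by counting every byte
    that is NOT a continuation byte (continuation bytes are 10xxxxxx)."""
    return sum(1 for b in data if (b & 0xC0) != 0x80)


# Hypothetical sample: 'ä' takes 2 bytes and '€' takes 3 bytes in UTF-8.
sample = "Käse €".encode("utf-8")

byte_count = len(sample)              # what a 1-byte-per-character program sees
char_count = utf8_char_count(sample)  # what the user sees on the screen

print(byte_count, char_count)
```

A program that uses byte_count as the line length would place the cursor three columns too far to the right at the end of this line.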

Vi, like any other program, has to know about this encoding in order to
perform correctly; a classical, 8-bit based Vi implementation would not
even get the cursor position right with UTF-8 encoded data. At the very
least, Vi will have to take the UTF-8 mechanism into account when counting
characters and calculating cursor movements. Hence, I cannot understand the
statement:
> It's easy to have a vi that processes UTF-8-encoded data.

In order to process data in various encodings, such as ISO 8859-1, UTF-8,
and Unicode (UTF-16), a program has to know the encoding of the
actual data. Hence, I cannot understand how a program such as Vi could
work with a locale that does not cover the encoding.
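As a small sketch of why the encoding must be known, the following Python lines decode the same two bytes under three different assumptions. The variable names are chosen for this illustration only, and big-endian byte order is assumed for UTF-16:

```python
# The two bytes that encode 'ä' (U+00E4) in UTF-8.
raw = "ä".encode("utf-8")             # b'\xc3\xa4'

as_utf8 = raw.decode("utf-8")         # one character: 'ä'
as_latin1 = raw.decode("iso-8859-1")  # two characters: 'Ã' and '¤'
as_utf16 = raw.decode("utf-16-be")    # one character: U+C3A4, a Hangul syllable

for name, text in (("UTF-8", as_utf8),
                   ("ISO 8859-1", as_latin1),
                   ("UTF-16", as_utf16)):
    print(name, len(text), [hex(ord(c)) for c in text])
```

The bytes are identical in all three cases; only the assumed encoding decides how many characters there are and which ones.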

Please explain.

Best wishes,
Otto Stolz


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:43 EDT
