Re: Linux & Unicode

From: John Cowan (cowan@locke.ccil.org)
Date: Fri Dec 04 1998 - 10:09:59 EST


Markus Kuhn wrote:

> - There is some UTF-8 support in glibc2 and more is to come,
> but it is not documented and nobody knows how to use this in his
> applications. The free online "Easy Guide to UTF-8 for Unix C
> Programmers" still has to be written!

What can be done about getting the uctype/uclib code into the
canonical Linux distributions? It is stronger than just a UTF-8
locale for ctype, because it recognizes Unicode-specific distinctions
such as punctuation vs. symbol, titlecase, and so on.
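
To illustrate the kind of distinction meant here, a minimal sketch
follows. It uses Python's standard unicodedata module as a stand-in;
it does not show the uctype/uclib API, and the classify() helper and
its label strings are made up for this example.

```python
import unicodedata

# Hypothetical labels keyed by the first letter of the Unicode
# general category (P* = punctuation, S* = symbol).
MAJOR_LABELS = {"P": "punctuation", "S": "symbol"}

def classify(ch):
    """Return a coarse label for one character, based on its
    Unicode general category (e.g. 'Po', 'Sm', 'Lt')."""
    cat = unicodedata.category(ch)
    if cat == "Lt":
        return "titlecase letter"
    if cat == "Lu":
        return "uppercase letter"
    if cat == "Ll":
        return "lowercase letter"
    return MAJOR_LABELS.get(cat[0], "other (" + cat + ")")

if __name__ == "__main__":
    # U+01C5 is a titlecase digraph: neither upper nor lower case.
    for ch in ["!", "+", "$", "A", "a", "\u01C5"]:
        print("U+%04X" % ord(ch), unicodedata.category(ch), classify(ch))
```

A ctype-style ispunct() lumps "!" and "+" together, while the
general categories keep them apart, and titlecase has no ctype
counterpart at all.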
 
> - The only major UTF-8 capable application at the moment is the Yudit
> editor.

Also Sam and Wily, both of which I use and which are far more
powerful.

> I have been very keen on switching everything on my system to UTF-8
> since 1994, but I haven't done it yet. I will do it as soon as xterm,
> vi, emacs, exim, gcc, bash, and readline are ready for operating in a
> pure UTF-8 environment. We are getting closer, but we are not yet there
> and we have not yet reached critical mass.

I have a back-burner project to make UTF-8 versions of some of the
GNU text tools, notably tr and wc. This takes more than just changing
the character width: for example, the classic implementation of
character sets in tr doesn't scale.
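
A rough sketch of both points, in Python rather than C and not taken
from the GNU sources: counting characters in UTF-8 means skipping
continuation bytes, and a tr-style mapping can be kept in a sparse
table instead of an array with one slot per possible character (which
is what I assume "the classic implementation" amounts to). The
function names here are invented for the example.

```python
def count_bytes_and_chars(data):
    """Return (byte count, character count) for UTF-8 encoded bytes.

    Every character contributes exactly one byte that is NOT of the
    form 10xxxxxx, so counting those bytes counts characters.
    Assumes the input is well-formed UTF-8.
    """
    chars = sum(1 for b in data if (b & 0xC0) != 0x80)
    return len(data), chars

def make_translator(src, dst):
    """Build a tr-like function mapping src[i] to dst[i].

    The table is a dict keyed by code point, so its size depends on
    the number of mapped characters, not on the size of Unicode.
    """
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same length")
    table = {ord(s): ord(d) for s, d in zip(src, dst)}
    return lambda text: text.translate(table)

if __name__ == "__main__":
    sample = "na\u00efve caf\u00e9".encode("utf-8")
    print(count_bytes_and_chars(sample))
    greek_to_latin = make_translator("\u03b1\u03b2\u03b3", "abg")
    print(greek_to_latin("\u03b1\u03b2\u03b3\u03b4"))
```

Characters absent from the table pass through unchanged, which
matches what tr does with characters outside its first set.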

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)


