Re: combining/fullwidth support for xterm

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Mon Aug 16 1999 - 22:19:17 EDT


Markus,

You should make clear in your function headers which version of the standard
the data apply to. The most up-to-date version at present is Unicode-3.0.0.beta.

I hope that you used the data files from the beta, not the text of the TRs,
to create your list. (While we are editing version 3.0, some of the TRs have
not been edited, but the data files have been.)

Second, characters with EastAsianWidth A may well also be wide in the given
application domain. A means that iswide returns either true or false,
depending on other context information (e.g. language or locale id, or
knowledge that the ultimate data source or destination is an EA legacy
character set).
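
To illustrate the point about class A, here is a minimal sketch, not your
actual code. The names iswide_ctx, ea_lookup and sample_table are
hypothetical, and the table is a tiny hand-picked sample rather than the
real data. The caller passes in the context decision instead of the function
guessing it.

```c
#include <stddef.h>

/* East Asian Width reduced to what matters for cell counting. */
enum ea_class { EA_NARROW, EA_WIDE, EA_AMBIGUOUS };

struct ea_range {
    unsigned long first, last;
    enum ea_class cls;
};

/* Illustrative sample only, NOT a complete table. Real entries must be
 * taken from the data files of a stated Unicode version. */
static const struct ea_range sample_table[] = {
    { 0x00B1, 0x00B1, EA_AMBIGUOUS },  /* PLUS-MINUS SIGN (class A) */
    { 0x3041, 0x3094, EA_WIDE },       /* Hiragana (class W) */
    { 0x4E00, 0x9FA5, EA_WIDE },       /* CJK ideographs (class W) */
    { 0xFF01, 0xFF5E, EA_WIDE }        /* Fullwidth ASCII (class F) */
};

/* Linear scan is enough for a sketch; anything not listed is narrow. */
static enum ea_class ea_lookup(unsigned long ucs)
{
    size_t i;
    for (i = 0; i < sizeof sample_table / sizeof sample_table[0]; i++) {
        if (ucs >= sample_table[i].first && ucs <= sample_table[i].last)
            return sample_table[i].cls;
    }
    return EA_NARROW;
}

/* ambiguous_is_wide is the caller's context decision, e.g. nonzero when
 * the data come from or go to an East Asian legacy character set. */
int iswide_ctx(unsigned long ucs, int ambiguous_is_wide)
{
    switch (ea_lookup(ucs)) {
    case EA_WIDE:
        return 1;
    case EA_AMBIGUOUS:
        return ambiguous_is_wide != 0;
    default:
        return 0;
    }
}
```

With this shape, iswide_ctx(0x00B1, 0) and iswide_ctx(0x00B1, 1) give
different answers for the same character, which is exactly what A implies.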

Finally, a small compiler that reads the data files and produces the source
code you showed would be much more useful, as it would allow people to
update from the Unicode database.
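
Something along these lines is what I have in mind. This is only a rough
sketch under stated assumptions: the input line format (hex code point or
"first..last" range, a semicolon, then the width class) is assumed for
illustration and should be checked against the actual data file, and the
names parse_line, generate and wide_table are hypothetical. It does not
merge adjacent ranges.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse one line of the ASSUMED form "XXXX;C" or "XXXX..YYYY;C".
 * Returns 1 and fills first/last/cls on success, 0 for comments,
 * blank lines, or anything that does not fit the assumed format. */
int parse_line(const char *line, unsigned long *first,
               unsigned long *last, char cls[4])
{
    char *end;
    size_t n = 0;

    if (line[0] == '#' || line[0] == '\n' || line[0] == '\0')
        return 0;
    *first = strtoul(line, &end, 16);
    if (end == line)
        return 0;
    *last = *first;
    if (end[0] == '.' && end[1] == '.') {
        const char *p = end + 2;
        *last = strtoul(p, &end, 16);
        if (end == p)
            return 0;
    }
    if (*end != ';')
        return 0;
    end++;
    /* Width classes are one or two letters (W, F, A, Na, ...). */
    while (n < 2 && isalpha((unsigned char)end[n])) {
        cls[n] = end[n];
        n++;
    }
    cls[n] = '\0';
    return n > 0;
}

/* Read data lines from in, write a C table of W and F ranges to out.
 * version is a caller-supplied label recorded in the generated header.
 * Returns the number of ranges written. */
int generate(FILE *in, FILE *out, const char *version)
{
    char line[256], cls[4];
    unsigned long first, last;
    int count = 0;

    fprintf(out, "/* Generated from Unicode data, version %s. */\n", version);
    fputs("static const struct { unsigned long first, last; } "
          "wide_table[] = {\n", out);
    while (fgets(line, sizeof line, in) != NULL) {
        if (!parse_line(line, &first, &last, cls))
            continue;
        if (strcmp(cls, "W") != 0 && strcmp(cls, "F") != 0)
            continue;
        fprintf(out, "    { 0x%04lX, 0x%04lX },\n", first, last);
        count++;
    }
    /* End marker: an empty range (first > last) that never matches. */
    fputs("    { 1, 0 }\n};\n", out);
    return count;
}
```

A one-line main that calls generate(stdin, stdout, "3.0.0.beta") would turn
this into a filter, and the version comment in the output also takes care of
my first point.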

A./

At 04:18 PM 8/16/99 -0700, Markus Kuhn wrote:
>Kenneth Whistler wrote on 1999-08-16 22:51 UTC:
>> > 2) ls must know that combining characters do not occupy their own
>> > character cell
>>
>> Well, more correctly, that *non-spacing* characters do not. Those
>> are a subset of all combining characters in Unicode--many of which
>> are actually spacing characters.
>>
>> > 3) ls must know that characters with the East Asian Wide or FullWidth
>> > property (see TR #7) occupy two character cells.
>>
>> That's TR #11, not TR #7.
>
>Thanks for the corrections.
>
>By the way, below are two C functions that test whether a Unicode
>character is in one of these two classes (non-spacing or EastAsian Wide/
>FullWidth). With these functions, host applications should again be able
>to predict nicely how many cells a character consumes on a Unicode
>enhanced VT100 terminal such as some future xterm/kermit/Linux_console
>version.
>
>It would be nice to have something like these in glibc and similar
>libraries. They could also be the basis for implementing the column
>width functionality mentioned in section H.14 of ISO C (1990)
>Amendment 1 (1995), that is the "%#N" formatting code in printf
>that causes "%n" to report character-cell counts and not character
>counts.
>
>Markus
>
>P.S.: The attached code is in the public domain. Share, use, and enjoy.
>
>--
>Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
>Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
>
>
>Attachment Converted: "g:\apps\eudora\attach\iswide.c"
>
>Attachment Converted: "g:\apps\eudora\attach\iscombining.c"
>


