Re: Terminal Graphics Draft 2

From: Frank da Cruz (fdc@watsun.cc.columbia.edu)
Date: Thu Oct 08 1998 - 16:49:57 EDT


Rick McGowan wrote:

> Frank -- I reviewed the latest draft and have more comments...
>
I appreciate it, thanks.

> I do have a problem generally with adding "picture" characters to
> correspond to existing things that are Unicode-specific.
> ...
> So, it's my opinion that we should note the possible use, and move on.
> I.e., don't propose adding them now on the off chance that *someone* might
> have a use for them. Wait for a use.
>
Fine with me!

> Table 5.2 is particularly valuable information of the "here's what exists"
> variety... and given the widespread use of ISO-6429 controls, it is probably
> worth adding these. You also say in the notes "ISO-6428". Is that
> different from 6429? Or just a typo?
>
It's a typo; thanks for spotting it.

> > 5.3. EBCDIC Control Pictures
>
> Likewise, this is valuable information. It would be good to somehow call
> out the proposed additions, perhaps by putting an asterisk before or after
> the names. Because they're in EBCDIC order I found it a bit hard to discern
> precisely which are proposed additions.
>
I suppose the proposal is rather dense -- the inevitable tug-of-war between
saying everything everywhere, thus making it so long nobody will read it, or
presuming it is read from top to bottom so everything is explained in advance
but must be remembered (the topological sort). The marking of new additions
is in the left ("Code") column. If the code is Exxx it is to be added;
otherwise it is already in Unicode (usually in the U+2xxx's).

But OK, I'll try to highlight them better.
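
For anyone who wants to pick out the additions mechanically, the Code-column
convention above can be checked in a few lines. This is only a sketch: it
assumes each Code entry is a bare four-digit hex string, and the function
name and the sample rows are made up for illustration (E0B3 and E0B4 appear
in the proposal; 2400 stands in for a character already in Unicode).

```python
def is_proposed_addition(code: str) -> bool:
    """Return True if a Code-column entry has the form Exxx.

    Assumption: the entry is a bare hex string such as "E0B3".
    """
    value = int(code, 16)
    return 0xE000 <= value <= 0xEFFF


# Hypothetical sample of Code-column entries.
rows = ["E0B3", "2400", "E0B4"]
proposed = [code for code in rows if is_proposed_addition(code)]
print(proposed)
```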

> Someone from IBM should look at the 3270 stuff... I suppose someone will
> do so.
>
I was hoping for some feedback from the IBM mainframe camp too; not just
3270 users, but also those who analyze and debug 3270 data streams. If any
readers happen to know people outside this group who might be interested,
please feel free to forward the proposal to them.

> Another thing that should be discussed is when adding "symbol for foo" one
> should also add "foo" itself. For instance, there is no "Start of Field"
> control character; but a picture of it is being proposed. Probably UTC
> needs to hash through *that* issue...
>
Oh what a tangled web we weave... I think in this case we have an exception
to the rule. I think we can say that Unicode is ISO/ASCII based rather than
EBCDIC based. The structure of U+0000 through U+00FF is identical with
ASCII (= ISO 646 International Reference Version) + ISO 8859-1, with the
layout of ISO 4873 (C0, GL, C1, GR). The C0 control set is, indeed, the
ASCII C0 set (and that of ISO 646; ISO Registry number #001). Granted, the
C1 area is left unspecified, but what else could it be but that of ISO 6429?
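
To make the layout concrete, here is a small sketch that sorts a code point
in U+0000 through U+00FF into the four areas named above. The boundaries are
the usual 8-bit ones (C0 = 00-1F, GL = 20-7F, C1 = 80-9F, GR = A0-FF); the
function name is just for illustration.

```python
def iso4873_area(code_point: int) -> str:
    """Classify a code point in U+0000..U+00FF as C0, GL, C1, or GR."""
    if not 0x00 <= code_point <= 0xFF:
        raise ValueError("only U+0000 through U+00FF are covered here")
    if code_point <= 0x1F:
        return "C0"   # ASCII / ISO 646 control characters
    if code_point <= 0x7F:
        return "GL"   # ASCII graphics (plus SP and DEL at the edges)
    if code_point <= 0x9F:
        return "C1"   # the area Unicode leaves unspecified
    return "GR"       # ISO 8859-1 right-half graphics


for cp in (0x1B, 0x41, 0x9B, 0xE9):
    print(f"U+{cp:04X} -> {iso4873_area(cp)}")
```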

I think it would be pretty weird (note: this is how we spell "weird" this
week...) to add EBCDIC controls to an ISO/ASCII based character set.

Personally, I'd rather leave them out and use the positions they would
occupy for something more useful. But the *symbols* for them do need
encoding, since we will be using Unicode-based software to analyze EBCDIC
and/or 3270 data streams (wire bearing EBCDIC comes into PC, which uses
Unicode internally). However, I would heartily welcome review by IBM or
other EBCDIC/3270-centric party of the specific repertoire of glyphs in the
proposal.
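
As a rough illustration of the analyzer case, the sketch below decodes an
EBCDIC byte string and shows each control as a visible symbol. It rests on
several assumptions that are mine, not the proposal's: it uses Python's
"cp037" codec as a stand-in for whatever EBCDIC code page the host really
uses, it borrows the existing Control Pictures at U+2400 and up for controls
that decode into C0, and it falls back to a hex placeholder for controls
that decode into the C1 range, which have no picture today. That fallback is
exactly the gap the proposed symbols would fill. The function name is
hypothetical.

```python
C0_PICTURE_BASE = 0x2400   # U+2400..U+241F: existing pictures for C0 controls
DEL_PICTURE = "\u2421"     # SYMBOL FOR DELETE


def show_ebcdic(data: bytes) -> str:
    """Render EBCDIC bytes with controls made visible (approximation only)."""
    out = []
    # cp037 is a single-byte code page, so bytes and decoded chars pair up 1:1.
    for byte, ch in zip(data, data.decode("cp037")):
        cp = ord(ch)
        if cp <= 0x1F:
            out.append(chr(C0_PICTURE_BASE + cp))
        elif cp == 0x7F:
            out.append(DEL_PICTURE)
        elif 0x80 <= cp <= 0x9F:
            # No control picture exists here; show the raw EBCDIC byte.
            out.append(f"<{byte:02X}>")
        else:
            out.append(ch)
    return "".join(out)


# "HI" followed by EBCDIC NUL and CR.
print(show_ebcdic(b"\xc8\xc9\x00\x0d"))
```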

> You should look at the glyph pieces in the Adobe Symbol font, which is a
> widely used font. Many of these are contained in the Symbol font (0xE6 to
> 0xFE inclusive).
>
All the more reason to add them to Unicode. Another, as Kent Karlsson
pointed out earlier today, is that they are used in TeX (see the original
TeX and METAFONT book, p.175: TeX Standard Extension Fonts).

> I believe the following two characters are just masculine and feminine
> ordinal indicators, and are already encoded between 0x80 and 0xFF, as part
> of ISO Latin 1. They are probably just variant glyphs... unless the
> documentation distinguishes them and they occur in pairs with lower-case.
> Do you mean "small" or "capital"? Or are they really different?
>
> > E0B3 Latin small letter a with underbar SNI Math 04/04 (2)
> > E0B4 Latin capital letter O with underbar SNI Math 04/09 (2)
>
Well, "small" means lowercase; "capital" actually means "big" -- who can
tell with an "O"! Hopefully I'll be able to post GIFs of scanned pages
soon; that'll be a day's work! The reason these need to be encoded
separately from feminine/masculine ordinals is their size -- they fill the
whole cell, like a regular letter. Since terminal emulators and data
analyzers use fixed-pitch fonts, we can't just switch to another point size
to display these characters, since that will wreck the matrix arrangement
of the screen.

> By the way, I'm opposed quite strongly to adding the 256 "hex bytes" under
> any circumstances. Good thing they're an independent proposal.
>
I certainly would not want to see them hold up the rest.

> The total proposed, including Hex Bytes is 448. Without Hex Bytes, it's a
> modest 192, and I think it could be reduced with a little more unification.
> Of course reduction will offset the expected increase due to other
> terminals clamoring to be included...
>
Yes, I see the mobs starting to form on the street below, waving placards
emblazoned with vertical lightnings with solidi; diagonal lightnings with
horizontal bars; European no-parking signs; Canadian moose-crossing signs...

Seriously, the hex bytes are entirely separable from the rest. I'll be
glad to cut them loose unless somebody speaks up strongly in their favor.

Thanks again!

- Frank


