Terminal Graphics Draft 2

From: Frank da Cruz (fdc@watsun.cc.columbia.edu)
Date: Wed Oct 07 1998 - 20:29:02 EDT


TERMINAL GRAPHICS FOR UNICODE

  Frank da Cruz
  The Kermit Project
  Columbia University
  New York City USA
  fdc@columbia.edu
  http://www.columbia.edu/kermit/

  D R A F T # 2

  Wed Oct 7 18:32:05 1998

THIS IS A PREFORMATTED PLAIN-TEXT ASCII DOCUMENT. IT IS DESIGNED TO BE
VIEWED AS-IS IN A FIXED-PITCH FONT. ITS WIDEST LINE IS 79 COLUMNS. IT
CONTAINS NO TABS. IF IT LOOKS MESSY TO YOU, PLEASE FEEL FREE TO PICK UP
A CLEAN COPY AT:

  ftp://kermit.columbia.edu/kermit/charsets/ucsterminal.txt

Previous drafts are at:

  ftp://kermit.columbia.edu/kermit/charsets/ucsterminal_nn.txt

where nn is the draft number, e.g. "01".

ABSTRACT

A selection of terminal graphics characters is proposed for Unicode [24]
and ISO 10646 [19] to allow Unicode-based terminal emulation software to
(a) display glyphs that are found on popular types of terminals but
currently are not available in Unicode, (b) debug terminal and other data
streams, and (c) interoperate with other Unicode applications.

CONTENTS

    1. Introduction
    2. Scope
    3. Organization
    4. Hex Bytes
    5. Graphic Representation of Control Characters
    6. Math Symbols
    7. Line and Box Drawing Characters
    8. Unfinished Business
    9. Summary of Proposed Additional Characters
   10. References

Tables:

  4.1. Hex Byte Characters
  5.0. Unicode Control Characters
  5.1. C0 Control Characters
  5.2. C1 Control Characters
  5.3. EBCDIC Control Characters
  5.3A. Obsolete EBCDIC Control Characters
  5.4. 3270 Control Characters
  5.5. 3270 Terminal Operator Status Indicators
  5.6. Additional Control-Like Pictures
  6.1. Math Symbols for Terminals
  7.1. Additional Line, Box, and Block Characters
  9.1. Census of New Characters

Figures:

  4.1. Hex Byte Pictures
  5.1. Control Picture Display
  5.5. Connected Rectangles
  7.1. "Framus" Glyphs

Grateful acknowledgements to those whose comments on the first draft are
reflected in the second: Kevin Bracey, Asmus Freytag, Tony Harminc, Elliotte
Rusty Harold, Paul Keinanen, Karlsson Kent, Rick McGowan, Kenneth Whistler.

1. INTRODUCTION

Terminal-host communication was the dominant form of interaction between
human and computer from about 1974 (when CRTs became affordable)(1) to about
1994 (when the Web and Windows took over the mass market). Terminal-host
communication is still widespread, especially in large organizations, and
is expected to remain so for decades to come, playing an important part in
organizations like universities, hospitals, government agencies, and
corporations with central computing facilities, for use in applications
ranging from software development and system/network administration, to email
and text-based Web access, to data entry and inquiry, to transaction
processing.

A terminal, for purposes of this document, is a device for entry and display
of text in a fixed-pitch font on a screen (or on paper) in which graphic
characters are displayed as glyph images in rows and columns of fixed size
"cells", one glyph image per cell. Terminals generally display (or
otherwise handle) the characters of ASCII [1] or EBCDIC [13], and often also
accented or non-Roman letters (or ideograms), and often also "graphic" (2)
(non-alphabetic, non-digit, non-punctuation) characters for purposes of
line- and box-drawing, mathematics, or other special effects.

In recent years, physical terminals have largely disappeared from the scene,
their functions subsumed into PCs running terminal-emulation software
alongside other applications. Unicode has effectively met the need for
encoding the earth's writing systems, but it is not well suited to terminal
emulation since it lacks some of the required graphics characters.

Without a standard encoding for the missing glyphs, each maker of terminal
emulation software must create or contract for custom fonts with private
encodings. Such fonts are not compatible with other (otherwise compatible)
fonts on the same platform (e.g. when copying from a terminal window and
pasting to a word processor), nor with each other. Furthermore, should
Unicode printers become standard equipment on PCs, terminal graphics
characters will not print correctly on them.

Meanwhile, in the interest of "show[ing] the presence of ... control codes
and the SPACE unequivocally when data is displayed" [24,p.6-84], Unicode
includes a selection of control pictures. Makers (and supporters, and
users) of terminal emulators and most other types of software could use this
feature of Unicode to better advantage if it were extended to cover a
greater portion of the "control space", or even to allow pictorial
representation of any code at all.

This document proposes a repertoire of terminal graphics and debugging
characters to be added to Unicode and ISO 10646 to which all makers of
fonts, code pages, and printers can refer when designing their products, and
upon which all makers of terminal emulation and/or debugging software can
base their screen displays.

Notes:
 (1) Strictly speaking, terminals predate electronic computers by some
     decades; the Teletype (used as the control terminal on many mainframes
     and most minicomputers in the 1950s through 1970s) dates back to 1929.
 (2) Note the distinction between "graphic" meaning "printing" (as in
     "ISO 8859-1 is a graphic character set") versus "graphics" meaning
     having something to do with pictures.

2. SCOPE

This document represents a survey of the following terminals:

  Digital Equipment Corporation VT100 through VT520 [3-9]
  Heath / Zenith 19 [10]
  Hewlett Packard HP-2621 and HP-2648 [11,12]
  IBM 3164 and 3270 [15,16,27]
  Siemens Nixdorf 97801 [21]
  Televideo 922 and 965 [22,23]
  Wyse 60 and 370 [25,26]

as well as:

  IBM PC code page 437 [14]

which is the basis for numerous PC-oriented so-called ANSI emulations.

2.1. Problems

Even within this fairly narrow scope, arriving at a sufficient set of
character-cell terminal graphics for Unicode is complicated by the
well-known problems that affect other preexisting character sets to varying
degrees:

 1. Lack of official names for the characters of some of the sets.
 2. Lack of definitive, high-quality pictures of the glyphs in some cases.
 3. Lack of descriptions of the purpose and intended use of the glyphs.
 4. Lack of a current registration authority or owner in some cases.
 5. Questions of unification of glyphs from different terminal makers.
 6. End-user demand for specific characters or sets.

The issue of unification is complicated by the fact that some of the
terminal graphics characters are designed to join at cell boundaries to form
"pictures" (such as boxes or forms to be filled out) or large characters
(such as big math symbols) spanning multiple rows and/or columns. The
relationship of similar-looking glyphs for different terminals is difficult
to determine -- e.g. exactly where does a line touch an edge, and at what
angle, and does it make a difference?

2.2. What This Proposal Does Not Contain

This proposal does not require any action for well-known terminal
presentation forms such as double-high and/or double-wide characters, bold,
blinking, inverse, underlining, color, etc, since these are not encoding
issues. In particular, no special code points are needed for double-high or
double-wide characters, such as those seen on the DEC VT100 family of
terminals, nor for compressed characters as seen on Data General and DEC
terminals.

This proposal also does not cover true graphics terminals, such as Tektronix
vector graphics units, DEC ReGIS or Sixel graphics, etc, since these
graphics regimes are not character-cell based.

No attempt was made to account for the many Viewdata, Videotex, Teletex,
Minitel, NAPLPS, or other mosaic graphics character sets. These should be
tackled, if at all, by someone who knows something about them.

Note that the graphic characters listed in this proposal rarely, if ever,
appear on keyboard key labels. In general, these characters are never
typed, not even on real terminals, but are displayed when the terminal is
commanded into a special mode; for example, with ISO 2022 [17] character-set
designation and invocation escape sequences.

3. ORGANIZATION

This proposal groups terminal graphic characters into three major categories.
Some categories are complete by definition (e.g. the 2-nibble hex codes, of
which there can be only 256), but others should include space for expansion
as new glyphs are discovered or needed. The categories are:

Debugging Tools
  Graphical single-cell representation of Unicode, C0, C1, EBCDIC, and other
  control characters; hexadecimal dumps of terminal traffic: Sections 4
  and 5.

Math Symbols
  Although most math symbols found on terminals are already in Unicode,
  certain terminal-based applications rely on the ability to construct large
  symbols (integral and summation signs, braces, brackets) from smaller
  character-cell-sized pieces. Section 6.

Line, Box, and Block Drawing
  Used for data entry, transaction processing, forms filling, etc, in
  markets ranging from car rental and airline reservations, to 911
  operators, to medical information systems, to online library catalogs.
  Although Unicode does include a basic set (mainly those at U+2500), some
  others are missing. Section 7.

Each category is important for terminal emulation, but the categories can
be considered separately. The debugging tools category is not specific to
terminal emulation, but can be used with a wide variety of applications:
file analyzers, data or protocol analyzers, or for debugging of Web pages,
word processor documents, etc.

3.1. Temporary Reference Code Assignments

The characters proposed in this document are assigned temporary Unicode
values from the Private Use area, strictly for reference within (or to)
this document only. Final values should be assigned outside the Private
Use range. The temporary allocations are:

  E000-E09F Control Pictures
  E0A0-E0B8 Math Symbols
  E0D0-E0EF Line and Box Drawing
  E100-E1FF Hex Bytes

For a total of 512 positions, not fully populated. Obviously the final
counts, code values, and block allocations, including reserved positions,
are likely to change as this proposal evolves.

3.2. Character Properties

All new characters proposed in this document should be precomposed, since no
terminals (with the exception of certain APL and ALA terminals) are capable
of composing characters on the fly from nonspacing diacritics or by
overstriking. All proposed characters have Combining Class 0 (although
some of the characters are designed to "combine" (connect) with other
characters in adjacent cells).

No "Letter" characters are proposed, therefore none of the proposed
additions has the Case property. All proposed characters are
strong left-to-right as to directionality, the same as existing characters
in the same categories (box drawing, control pictures, etc). None of the
proposed characters has the Numeric Value Property, although it might be
tempting to assign it to Hex Bytes (see Section 4, Note 4).

Many of the proposed box-drawing and math-technical characters have the
Mirrored Property; this should be fairly obvious when a character's name or
description contains the word "left", "right", "top", or "bottom".

I would venture that the proposed math symbols would have the Mathematical
Property, including the extensible ones, since the current Integral Top
and Bottom at U+2320, U+2321 have this property [24,Section 1.9].

4. HEX BYTES

Hexadecimal byte values, 2 hex digits each, allow any 8-bit byte to be
displayed in hexadecimal in a single character cell (and therefore allow any
Unicode character value to be displayed in two cells), for hex debugging in
terminal emulators, line monitors, protocol analyzers, word processors,
"dump" programs, Web browsers, etc. To prevent cell-boundary ambiguity,
the font designer should employ some visual device to bind the two hex
digits together in an unmistakable way, for example by arranging them
diagonally within the character cell as shown in Figure 4.1:

Figure 4.1: Hex Byte Pictures

 +--+ +--+ +--+     +--+ +--+ +--+ +--+     +--+ +--+     +--+--+
 |0 | |0 | |0 | ... |0 | |1 | |1 | |1 | ... |E | |F | ... |F |F |
 | 1| | 2| | 3|     | F| | 0| | 1| | 2|     | F| | 0|     | E| F|
 +--+ +--+ +--+     +--+ +--+ +--+ +--+     +--+ +--+     +--+--+

One glyph is required for each hex byte code 00 through FF, or 256 glyphs in
all, as shown in Table 4.1, in which the "Code" column shows the temporary
reference value for this document. Ideally, however, the low-order 8 bits of
the actual code would correspond to the 8-bit value represented by the
corresponding glyph.

Table 4.1: Hex Byte Characters

  Code  Byte  Description
  E100  00    Symbol for Hex Byte 00
  E101  01    Symbol for Hex Byte 01
   :    :       :
  E1FF  FF    Symbol for Hex Byte FF
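
As an illustration only, the following Python sketch maps a byte to its hex
byte symbol using the temporary reference value E100 from Table 4.1. The
function names are hypothetical, and the base value is an assumption that
will change once final code points are assigned.

```python
# Sketch only: HEX_BYTE_BASE is the temporary reference value used in this
# draft (Table 4.1), not a final code assignment.
HEX_BYTE_BASE = 0xE100


def hex_byte_symbol(byte: int) -> str:
    """Return the single-cell symbol for one 8-bit byte."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 00-FF")
    return chr(HEX_BYTE_BASE + byte)


def dump(data: bytes) -> str:
    """Render a byte string as one hex byte symbol per byte."""
    return "".join(hex_byte_symbol(b) for b in data)


def ucs2_cells(value: int) -> str:
    """Render a 16-bit Unicode value in two cells, high byte first."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value must fit in 16 bits")
    return hex_byte_symbol(value >> 8) + hex_byte_symbol(value & 0xFF)
```

Because the assumed base ends in 00, the low-order 8 bits of each temporary
code equal the byte it depicts, which is the correspondence suggested above
for the final assignments.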

Notes:
 (1) The proposal for hex byte symbols is independent of the other proposals
     within this document; however, several hex byte symbols are required
     for C1 control pictures (Section 5.2) in any case.
 (2) The SNI "IBM" character set [21] contains glyphs for 01 through 1F,
     which are shown sideways (rather than upright diagonal). I see no
     reason to encode these separately; others might disagree.
 (3) Hex byte values can collide with control-character names: FF, D1,
     D2, D3, D4, etc (Section 5). If both hex bytes and control pictures are
     implemented, the font designer should ensure they are distinct enough
     visually that they will not be confused.
 (4) Should these symbols have the Numeric Value Property? I think not,
     since, unlike digits, Roman numerals, etc, they are not normally used
     as numbers, nor to write numbers.

Summary:
  256 new characters, U+E100 through U+E1FF.

5. GRAPHIC REPRESENTATION OF CONTROL CHARACTERS

Digital VT220 and higher terminals, as well as Televideo, Wyse, HP, Perkin
Elmer, and other models, allow the user to select whether control characters
are acted upon or displayed graphically. Unicode itself includes its own
"control characters" such as line and page separators, directionality
controls, etc.

Normally control characters are used to affect the format and presentation
of glyphs on the screen. In "display controls", "transparent", or "debug"
mode (the terminology varies with the terminal vendor), control characters
are shown graphically rather than performing their normal functions; this
allows analysis and debugging of the host-terminal data stream using a
terminal, emulator, protocol analyzer, or line monitor. It also allows a
more readable form of file dumping and analysis.

A block of control pictures is already found in Unicode at U+2400, but:

 a. The illustrations in the Unicode book do not look like the control
    pictures that are actually used on terminals;

 b. They are for C0 only; there is no corresponding set of C1 control
    pictures;

 c. There are no pictures for the control characters unique to EBCDIC;

 d. Certain other terminal-specific control pictures are missing.

A control picture allows the user to unequivocally determine the identity
and position of control characters in the data stream by displaying each
control character as a unique (and mnemonic) glyph in a single terminal
screen cell.

Terminals do this by arranging the letters (or letter-digit combinations) of
the official abbreviation for the control character diagonally from upper
left to lower right, as shown in Figure 5.1.

Figure 5.1: Control Picture Display

 +---+ +---+
 |L  | |D  |  (except the two-character abbreviation appears on the
 |   | | C |   screen with the characters closer together)
 |  F| |  1|
 +---+ +---+
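
The diagonal arrangement can be mimicked in plain text. The Python sketch
below is illustrative only: the function names are hypothetical, and the
cell is sized to the abbreviation for simplicity, whereas a real terminal
font draws every picture in a cell of fixed size.

```python
def diagonal_cell(abbrev: str) -> list[str]:
    """Lay out an abbreviation on the main diagonal of a square text cell."""
    size = len(abbrev)
    return [" " * i + ch + " " * (size - i - 1) for i, ch in enumerate(abbrev)]


def boxed(abbrev: str) -> str:
    """Draw the diagonal cell inside an ASCII border, one row per line."""
    rows = diagonal_cell(abbrev)
    border = "+" + "-" * len(abbrev) + "+"
    return "\n".join([border] + ["|" + row + "|" for row in rows] + [border])
```

For example, boxed("DC1") yields a 3-by-3 cell with D, C, and 1 running from
upper left to lower right.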

The Unicode illustration for control pictures at U+2400, however, depicts
the abbreviations horizontally. While the description of this block
[24,p.6-84] states that "only the semantic is encoded... a particular
application [can] use the graphic representation it prefers," a horizontal
arrangement is chosen in the illustration (on p.7-188) for all characters
except NL. If they were implemented this way in a real font, however, it
would be very difficult for the user to discern the boundary between one
control picture and the next.

It is suggested, therefore, that the next edition of the Unicode Standard
illustrate these characters with the diagonal representation shown in Figure
5.1 (and in ISO 10646 [19]), since it is more likely that Unicode font
designers will follow the illustrations in the Unicode Standard than attempt
to procure the actual terminals or manuals to see how they do it.

5.0. Unicode Control Pictures

Table 5.0 lists the nonprinting Unicode characters used for spacing,
directionality control, and general formatting. These characters are in
the U+2000 block, and are indicated by mnemonics inside broken-line squares.

The Code column contains the temporary code value for the proposed symbol.
The Val column contains the Unicode value of the character for which the
symbolic representation is proposed. The Name column contains the
designator shown in the broken-line square in the Unicode code table, with
a space standing for a line break (but see Note 2).

The suggested glyphs are those shown in the Unicode Standard.

Table 5.0: Unicode Control Characters

  Code  Val   Name     Description
  E000  2000  NQ SP    Symbol for En Quad
  E001  2001  MQ SP    Symbol for Em Quad
  E002  2002  EN SP    Symbol for En Space
  E003  2003  EM SP    Symbol for Em Space
  E004  2004  3/M SP   Symbol for Three-Per-Em-Space
  E005  2005  4/M SP   Symbol for Four-Per-Em-Space
  E006  2006  6/M SP   Symbol for Six-Per-Em-Space
  E007  2007  F SP     Symbol for Figure Space
  E008  2008  P SP     Symbol for Punctuation Space
  E009  2009  TH SP    Symbol for Thin Space
  E00A  200A  H SP     Symbol for Hair Space
  E00B  200B  ZW SP    Symbol for Zero-Width Space
  E00C  200C  ZW NJ    Symbol for Zero-Width Non-Joiner
  E00D  200D  ZW J     Symbol for Zero-Width Joiner
  E00E  200E  LRM      Symbol for Left-to-Right Mark
  E00F  200F  RLM      Symbol for Right-to-Left Mark
  E010  2028  L SEP    Symbol for Line Separator
  E011  2029  P SEP    Symbol for Paragraph Separator
  E012  202A  LRE      Symbol for Left-to-Right Embedding
  E013  202B  RLE      Symbol for Right-to-Left Embedding
  E014  202C  PDF      Symbol for Pop Directional Formatting
  E015  202D  LRO      Symbol for Left-to-Right Override
  E016  202E  RLO      Symbol for Right-to-Left Override
  E017  206A  I SS     Symbol for Inhibit Symmetric Swapping
  E018  206B  A SS     Symbol for Activate Symmetric Swapping
  E019  206C  I AFS    Symbol for Inhibit Arabic Form Shaping
  E01A  206D  A AFS    Symbol for Activate Arabic Form Shaping
  E01B  206E  NA DS    Symbol for National Digit Shapes
  E01C  206F  NO DS    Symbol for Nominal Digit Shapes
  E01D  FEFF  ZWN BSP  Symbol for Zero Width No Break Space
  E01E  FFFE  FF FE    Symbol for Not A Character (Byte Order) (2)
  E01F  FFFF  FF FF    Symbol for Not A Character (2)

Notes:
 (1) There is no known need for these symbols when emulating current
     terminals. In the future, if/when terminals are based on Unicode, they
     might be useful in that context. In the meantime, makers of word
     processors, Web browsers, etc, might have a use for these glyphs.
 (2) No mnemonic or abbreviation is given for this "not-a-character" in
     the Unicode Standard.

Summary:
  32 characters, E000-E01F.
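
The temporary codes in Table 5.0 are assigned consecutively over five runs
of Unicode values. As a rough illustration, the Python sketch below rebuilds
the Val-to-Code mapping from those runs. The names SOURCE_RANGES,
build_picture_map, and PICTURE_FOR are hypothetical, and the base E000 is
the temporary reference value of this draft, not a final assignment.

```python
# Runs of Unicode values (inclusive) in the order they appear in Table 5.0.
SOURCE_RANGES = [
    (0x2000, 0x200F),  # spaces, joiners, directional marks
    (0x2028, 0x202E),  # separators, embeddings, overrides
    (0x206A, 0x206F),  # swapping, shaping, digit-shape controls
    (0xFEFF, 0xFEFF),  # zero width no-break space
    (0xFFFE, 0xFFFF),  # the two "not a character" values
]


def build_picture_map(base: int = 0xE000) -> dict[int, int]:
    """Map each listed Unicode value to its temporary picture code."""
    mapping = {}
    code = base
    for first, last in SOURCE_RANGES:
        for value in range(first, last + 1):
            mapping[value] = code
            code += 1
    return mapping


PICTURE_FOR = build_picture_map()
```

The resulting dictionary has 32 entries, matching the summary above.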

5.1. C0 Control Pictures

Table 5.1 lists the C0 Control Characters from the ASCII Standard [1] (and
also in ISO 646 and ISO 6429). Each C0 control character has an official
designator (from the appropriate ANSI [1] or ISO [18] standard): a 2- or
3-character sequence of (ASCII) alphanumeric characters.

In some terminals, such as the DEC VT220 family [5], the control picture
shows the designation in full. In others, such as Televideo [22,23], HP
[11], and Perkin Elmer [20], each 3-character designator is replaced by
a 2-character short form.

The columns are as follows:

  Code: The Unicode value in hexadecimal.
  Val: The value of the control character's code in hexadecimal.
  Name: The full ASCII abbreviation for the control character's name.
  2X: The 2-character abbreviation used on Televideo, HP, etc.
  Description: "Symbol for" followed by the character's standard name.

Table 5.1: C0 Control Characters

  Code  Val  Name  2X  Description
  2400  00   NUL   NU  Symbol for Null
  2401  01   SOH   SH  Symbol for Start of Heading
  2402  02   STX   SX  Symbol for Start of Text
  2403  03   ETX   EX  Symbol for End of Text
  2404  04   EOT   ET  Symbol for End of Transmission
  2405  05   ENQ   EQ  Symbol for Enquiry
  2406  06   ACK   AK  Symbol for Acknowledge
  2407  07   BEL   BL  Symbol for Bell
  2408  08   BS    BS  Symbol for Backspace
  2409  09   HT    HT  Symbol for Horizontal Tab
  240A  0A   LF    LF  Symbol for Line Feed
  240B  0B   VT    VT  Symbol for Vertical Tab
  240C  0C   FF    FF  Symbol for Form Feed (1)
  240D  0D   CR    CR  Symbol for Carriage Return
  240E  0E   SO    SO  Symbol for Shift Out
  240F  0F   SI    SI  Symbol for Shift In
  2410  10   DLE   DL  Symbol for Data Link Escape
  2411  11   DC1   D1  Symbol for Device Control 1 (1)
  2412  12   DC2   D2  Symbol for Device Control 2 (1)
  2413  13   DC3   D3  Symbol for Device Control 3 (1)
  2414  14   DC4   D4  Symbol for Device Control 4 (1)
  2415  15   NAK   NK  Symbol for Negative Acknowledge
  2416  16   SYN   SY  Symbol for Synchronous Idle
  2417  17   ETB   EB  Symbol for End of Transmission Block
  2418  18   CAN   CN  Symbol for Cancel
  2419  19   EM    EM  Symbol for End of Medium
  241A  1A   SUB   SU  Symbol for Substitute
  241B  1B   ESC   EC  Symbol for Escape
  241C  1C   FS    FS  Symbol for File Separator (2)
  241D  1D   GS    GS  Symbol for Group Separator (2)
  241E  1E   RS    RS  Symbol for Record Separator (2)
  241F  1F   US    US  Symbol for Unit Separator (2)
  2420  20   SP    SP  Symbol for Space (3)
  2421  7F   DEL   DT  Symbol for Delete (3)

Notes:
  (1) Note the conflict/coincidence of these 2-character forms with hex
      bytes; see Note (3) in Section 4.
  (2) These C0 controls have alternative names, listed in Section 5.6.
  (3) Not, strictly speaking, a control character, but not a visible
      one either.

Summary:
  No new code points, but it is recommended that C0 control pictures
  be illustrated diagonally, and that the 2-letter forms be listed as
  alternatives for font designers, especially for low resolutions or
  small point sizes.
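
Since the existing Unicode control pictures for 00 through 1F sit at U+2400
plus the control's value, a "display controls" mode is simple to sketch.
The Python below is an illustration under that assumption; the function
names and the show_space option are hypothetical, not taken from any
terminal or emulator.

```python
C0_PICTURE_BASE = 0x2400   # existing Unicode Control Pictures block
SPACE_PICTURE = 0x2420     # Symbol for Space
DELETE_PICTURE = 0x2421    # Symbol for Delete


def control_picture(ch: str, show_space: bool = False) -> str:
    """Return the control picture for ch, or ch itself if it is printable."""
    code = ord(ch)
    if code <= 0x1F:
        return chr(C0_PICTURE_BASE + code)
    if code == 0x7F:
        return chr(DELETE_PICTURE)
    if code == 0x20 and show_space:
        return chr(SPACE_PICTURE)
    return ch


def display_controls(text: str, show_space: bool = False) -> str:
    """Show C0 controls and DEL graphically instead of acting on them."""
    return "".join(control_picture(ch, show_space) for ch in text)
```

With this sketch, an escape sequence such as ESC [ 0 m is shown as the
Escape picture followed by the literal characters "[0m".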

5.2. C1 Control Pictures

C1 Control characters are specified in ISO 6429 [18] (ISO Registration
Number 77 [28]) and used, among other places, in the VT220 family of
terminals [5] and the Wyse 370 [26], where they are represented in the right
half of the "display controls" font as shown in Table 5.2 (DEC terminals use
the full name, Wyse terminals use the 2X name). As with C0 controls, the
"name" is displayed diagonally within the character cell. Unicode presently
includes no C1 control pictures.

The "Code" column shows the temporary Unicode value for reference within
this document only; actual code assignments should be outside the Private
Use area. The other columns are labeled as in Table 5.1.

Table 5.2: C1 Control Characters

  Code  Val  Name  2X  Description
        80   80        (1)
        81   81        (1)
  E022  82   BPH       Symbol for Break Permitted Here (2)
  E023  83   NBH       Symbol for No Break Here (2)
  E024  84   IND   IN  Symbol for Index (3)
  E025  85   NEL   NL  Symbol for Next Line
  E026  86   SSA   SS  Symbol for Start Selected Area
  E027  87   ESA   ES  Symbol for End Selected Area
  E028  88   HTS   HS  Symbol for Character Tabulation Set
  E029  89   HTJ   HJ  Symbol for Character Tabulation with Justification
  E02A  8A   VTS   VS  Symbol for Line Tabulation Set
  E02B  8B   PLD   PD  Symbol for Partial Line Forward
  E02C  8C   PLU   PU  Symbol for Partial Line Backward
  E02D  8D   RI    RI  Symbol for Reverse Line Feed
  E02E  8E   SS2   S2  Symbol for Single Shift 2
  E02F  8F   SS3   S3  Symbol for Single Shift 3
  E030  90   DCS   DC  Symbol for Device Control String
  E031  91   PU1   P1  Symbol for Private Use 1
  E032  92   PU2   P2  Symbol for Private Use 2
  E033  93   STS   SE  Symbol for Set Transmit State
  E034  94   CCH   CC  Symbol for Cancel Character
  E035  95   MW    MW  Symbol for Message Waiting
  E036  96   SPA   SP  Symbol for Start Protected (Guarded) Area
  E037  97   EPA   EP  Symbol for End Protected (Guarded) Area
  E038  98   SOS       Symbol for Start of String (2)
        99             (1)
  E03A  9A   SCI       Symbol for Single Character Introducer (2)
  E03B  9B   CSI   CS  Symbol for Control Sequence Introducer
  E03C  9C   ST    ST  Symbol for String Terminator
  E03D  9D   OSC   OS  Symbol for Operating System Command
  E03E  9E   PM    PM  Symbol for Privacy Message
  E03F  9F   APC   AP  Symbol for Application Program Command

Notes:
 (1) Undefined in ISO 6429, shown on VT220/WY370 terminal by hex byte
     symbols (see text just below these notes).
 (2) Defined in ISO 6429, but shown on VT220/WY370 terminal by hex value.
 (3) Removed from ISO 6429 in the third edition, but still shown on
     VT220/WY370 terminal.

Note that three of the C1 control pictures are unassigned (the ones marked
by "(1)", that would be at U+E020, U+E021, and U+E039 if these were
assigned). These positions should be left vacant in case names are assigned
to these characters in a future revision of ISO 6429, or terminals are
discovered with control pictures for these codes. In the meantime, hex
bytes are used; if a hex-byte block (Section 4) is defined, they can be
taken from that block; otherwise, the particular values shown here (80, 81,
and 99, and possibly also 98 and 9A) must be defined for this block.
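
The fallback just described can be sketched in Python as follows. This is
an illustration only: it assumes the temporary codes of Table 5.2 (which
follow the pattern E020 plus the control's offset from 80) and the
temporary hex-byte block at E100 from Section 4; the names are hypothetical.

```python
C1_PICTURE_BASE = 0xE020          # temporary: Code = E020 + (Val - 80)
HEX_BYTE_BASE = 0xE100            # temporary hex-byte block (Section 4)
C1_UNASSIGNED = {0x80, 0x81, 0x99}  # rows marked (1) in Table 5.2


def c1_picture(val: int) -> str:
    """Return the picture for a C1 control, falling back to a hex byte."""
    if not 0x80 <= val <= 0x9F:
        raise ValueError("not a C1 control value")
    if val in C1_UNASSIGNED:
        # No name is defined, so show the byte's hexadecimal value instead.
        return chr(HEX_BYTE_BASE + val)
    return chr(C1_PICTURE_BASE + (val - 0x80))
```

An emulator that prefers the VT220/WY370 presentation could add 82, 83, 98,
and 9A to the fallback set, since those terminals show them in hex as well.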

As with C0 controls, it is a matter for the font designer to choose the
full designator from the Name column, or the 2-character alternatives from
the 2X column.

Summary:
  29 new characters (if hex bytes are also approved) or 32 (if they are not).

5.3. EBCDIC Control Pictures

The EBCDIC family of character sets [13,14,29] includes its own repertoire
of control characters. Many of them, like NUL, SOH, FF, SO, SI, and so on,
are coincident with ASCII C0 controls in name and semantics, and sometimes
also in encoding. Others are unique to EBCDIC.

Table 5.3 shows the EBCDIC control characters [29], in EBCDIC order. The
Code column shows the Unicode value; those starting with 24 are already in
Unicode block U+2400; those starting with E need to be added. The Val
column shows the EBCDIC value (hex). The Name column shows the EBCDIC
abbreviation for the code, and the description lists "Symbol for" plus the
EBCDIC name. There are no known "2X" forms in use.

Table 5.3: EBCDIC Control Characters

  Code  Val  Name  Description
  2400  00   NUL   Symbol for Null
  2401  01   SOH   Symbol for Start of Heading
  2402  02   STX   Symbol for Start of Text
  2403  03   ETX   Symbol for End of Text
  E040  04   SEL   Symbol for Select
  2409  05   HT    Symbol for Horizontal Tab
  E041  06   RNL   Symbol for Required New Line
  2421  07   DEL   Symbol for Delete
  E042  08   GE    Symbol for Graphic Escape
  E043  09   SPS   Symbol for Superscript
  E044  0A   RPT   Symbol for Repeat
  240B  0B   VT    Symbol for Vertical Tab
  240C  0C   FF    Symbol for Form Feed (1)
  240D  0D   CR    Symbol for Carriage Return
  240E  0E   SO    Symbol for Shift Out
  240F  0F   SI    Symbol for Shift In
  2410  10   DLE   Symbol for Data Link Escape
  2411  11   DC1   Symbol for Device Control 1
  2412  12   DC2   Symbol for Device Control 2
  2413  13   DC3   Symbol for Device Control 3
  E045  14   RES   Symbol for Restore
  2424  15   NL    Symbol for New Line
  2408  16   BS    Symbol for Backspace
  E046  17   POC   Symbol for Program Operator Communication
  2418  18   CAN   Symbol for Cancel
  2419  19   EM    Symbol for End of Medium
  E047  1A   UBS   Symbol for Unit Back Space
  E048  1B   CU1   Symbol for Customer Use 1
  E049  1C   IFS   Symbol for Interchange File Separator
  E04A  1D   IGS   Symbol for Interchange Group Separator
  E04B  1E   IRS   Symbol for Interchange Record Separator
  E04C  1F   IUS   Symbol for Interchange Unit Separator (2)
  E04D  20   DS    Symbol for Digit Select
  E04E  21   SOS   Symbol for Start of Significance
  241C  22   FS    Symbol for Field Separator
  E04F  23   WUS   Symbol for Word Underscore
  E050  24   BYP   Symbol for Bypass
  240A  25   LF    Symbol for Line Feed
  2417  26   ETB   Symbol for End of Transmission Block
  241B  27   ESC   Symbol for Escape
  E051  28   SA    Symbol for Set Attribute
  E052  29   SFE   Symbol for Start Field Extended
  E053  2A   SM    Symbol for Set Mode (3)
  E054  2B   CSP   Symbol for Control Sequence Prefix
  E055  2C   MFA   Symbol for Modify Field Attribute
  2405  2D   ENQ   Symbol for Enquiry
  2406  2E   ACK   Symbol for Acknowledge
  2407  2F   BEL   Symbol for Bell
  E056  30         (Reserved by IBM for future use)
  E057  31         (Reserved by IBM for future use)
  2416  32   SYN   Symbol for Synchronous Idle
  E058  33   IR    Symbol for Index Return
  E059  34   PP    Symbol for Presentation Position
  E05A  35   TRN   Symbol for Transparent
  E05B  36   NBS   Symbol for Numeric Backspace
  2404  37   EOT   Symbol for End of Transmission
  E05C  38   SBS   Symbol for Subscript
  E05D  39   IT    Symbol for Indent Tabulation
  E05E  3A   RFF   Symbol for Reverse Form Feed
  E05F  3B   CU3   Symbol for Customer Use 3 (4)
  2414  3C   DC4   Symbol for Device Control 4
  2415  3D   NAK   Symbol for Negative Acknowledge
  E060  3E         (Reserved by IBM for future use)
  241A  3F   SUB   Symbol for Substitute

Notes:
 (1) Conflict/coincidence with a hex byte; see Note (3) in Section 4.
 (2) The IUS control is sometimes also labeled ITB.
 (3) The SM control is sometimes also labeled SW (= Switch).
 (4) Note: There is no longer a Customer Use 2 (see Table 5.3A).

Summary:
  33 new characters, E040-E060, including 3 reserved.
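
Because a control picture is tied to the control's name rather than to its
code value, the same picture can be reached from either ASCII or EBCDIC.
The Python sketch below illustrates this with a partial lookup table copied
from a few rows of Table 5.3; it is not a complete mapping, the names are
hypothetical, and the E-prefixed value is a temporary reference code.

```python
# Partial table only: EBCDIC value -> picture code, from Table 5.3.
EBCDIC_PICTURES = {
    0x00: 0x2400,  # NUL
    0x04: 0xE040,  # SEL (temporary reference code)
    0x05: 0x2409,  # HT
    0x15: 0x2424,  # NL
    0x25: 0x240A,  # LF
    0x27: 0x241B,  # ESC
    0x37: 0x2404,  # EOT
}

ASCII_PICTURE_BASE = 0x2400  # existing Unicode Control Pictures block


def ebcdic_picture(val: int) -> str:
    """Look up the picture for an EBCDIC control (partial table)."""
    return chr(EBCDIC_PICTURES[val])  # KeyError for values not listed above


def ascii_picture(val: int) -> str:
    """Return the picture for an ASCII C0 control, 00 through 1F."""
    if not 0 <= val <= 0x1F:
        raise ValueError("not a C0 control value")
    return chr(ASCII_PICTURE_BASE + val)
```

Here EBCDIC 05 and ASCII 09 both produce the Horizontal Tab picture, while
EBCDIC 04 (SEL) needs one of the proposed new characters.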

For reference, Table 5.3A shows the original names for EBCDIC control
characters [13] that are now superseded by the names shown in Table 5.3.
It is not proposed here that these be added to Unicode.

Table 5.3A: Obsolete EBCDIC Control Characters

  Val  Name  Description              Replaced By
  04   PF    Punch Off                SEL
  06   LC    Lower Case               RNL
  0A   SMM   Start of Manual Message  RPT
  13   TM    Tape Mark                DC3
  17   IL    Idle                     POC
  1A   CC    Cursor Control           UBS
  2B   CU2   Customer Use 2           CSP
  34   PN    Punch On                 PP
  35   RS    Record Separator         TRN
  36   UC    Upper Case               NBS

5.4. IBM 3270 Terminal Orders and Controls

Names for IBM 3270 terminal orders and controls [27] that are not already
listed in Tables 5.1-5.3 are shown in Table 5.4, to be used in debugging
3270 data streams. Columns are as in the previous tables, except the Type
column, in which:

  O = 3270 Terminal Order [27,Table 4-1]
  D = 3270 Terminal Order in normal display [27,p.E-3]
  L = LU 1 SCS Control Codes [27,Table 8-2]
  F = 3270 Format Control Order [27,Table 4-3]

Table 5.4: 3270 Control Characters

  Code  Val  Name  Type  Description
  E070  1D   SF    O     Symbol for Start Field
  E071  11   SBA   O     Symbol for Set Buffer Address
  E072  2C   MF    O     Symbol for Modify Field
  E073  13   IC    O     Symbol for Insert Cursor
  E074  05   PT    O     Symbol for Program Tab
  E075  3C   RA    O     Symbol for Repeat to Address
  E076  12   EUA   O     Symbol for Erase to Unprotected Address
  E077  04   VCS   L     Symbol for Vertical Channel Select
  E078  14   ENP   L     Symbol for Enable Presentation
  E079  24   INP   L     Symbol for Inhibit Presentation
  E07A  2B   FMT   L     Symbol for Format
  E07B  1C   DUP   F     Symbol for Duplicate
  E07C  1C   DUP   D     Overscore asterisk (1)
  E07D  1E   FM    F     Symbol for Field Mark
  E07E  1E   FM    D     Overscore semicolon (1)
  E07F  FF   EO    F     Symbol for Eight Ones

Notes:
 (1) When displayed "as itself".

Summary:
  16 new characters, E070-E07F.

5.5. 3270 Terminal Operator Status Indicators

The IBM 3270 terminal displays a variety of unique glyphs in its Operator
Information Area [15, Figure A-4]. Although they are not encoded in any IBM
character set (known to me), they nevertheless appear on the screen, and are
therefore required for accurate terminal emulation. These glyphs are listed
in Table 5.5.

Table 5.5: 3270 Terminal Operator Status Indicators

  Code  Description
  E080  Human stick figure
  E081  Human stick figure in box
  E082  Clock at 6:10 (or 1:30)
  E083  White rectangle with stroke (1)
  E084  Black rectangle with stroke (2)
  E085  Lightning with stroke (3)
  E086  Security key (4)
  E087  Black and White Right-Pointing Triangles (5)

Notes:
 (1) A rectangle like the one at U+25AD with an oblique stroke through it.
     Note that "white" and "black" are used in the sense of the Unicode
     standard, and do not imply any particular colors or measure of goodness.
 (2) A rectangle like the one at U+25AC with an oblique stroke through it.
 (3) A horizontal lightning symbol with an oblique stroke through it.
 (4) A picture of a key (indicating the keyboard is locked).
 (5) Like U+25B8 and U+25B9 in the same cell, arranged horizontally, left
     to right, like a double right-pointing arrowhead, used as a
     supplementary indicator.

In many cases, black and/or white rectangles (U+25AD, U+25AC, U+E083,
U+E084) are connected with a centered horizontal line such as the one at
U+2500; two rectangles connected this way generally symbolize a 3270
terminal with a printer attached. Figure 5.5 shows an example. The font
designer must ensure that a sequence: rectangle, line, rectangle, results in
a pair of connected rectangles.

Figure 5.5: Connected Rectangles

  +--------+      +--------+
  |        |------|        |
  +--------+      +--------+
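
In terms of character codes, the sequence described above is three cells
long. A minimal sketch, using the existing Unicode characters named in the
text (the function name connected_rectangles is hypothetical, and whether
the glyphs actually join depends entirely on the font):

```python
# The three-cell "rectangle, line, rectangle" sequence.
WHITE_RECT = "\u25AD"  # WHITE RECTANGLE
H_LINE = "\u2500"      # BOX DRAWINGS LIGHT HORIZONTAL


def connected_rectangles(left=WHITE_RECT, right=WHITE_RECT):
    """Return rectangle, line, rectangle as one three-cell string."""
    return left + H_LINE + right
```

Either rectangle can be swapped for U+25AC or for the proposed stroked
forms at E083 and E084.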

Summary:
  8 new characters, E080-E087

5.6. Additional Control-Like Pictures

Table 5.6 shows additional characters that are (or are likely to be)
included in "display controls" mode on various terminals.

Table 5.6: Additional Control-Like Pictures

  Code  Name  Description
  E090  LS1   Symbol for Locking Shift 1 (1)
  E091  LS0   Symbol for Locking Shift 0 (2)
  E092  CEX   Symbol for Control Extension (3)
  E093  IS4   Symbol for Information Separator 4 (4)
  E094  IS3   Symbol for Information Separator 3 (5)
  E095  IS2   Symbol for Information Separator 2 (6)
  E096  IS1   Symbol for Information Separator 1 (7)
  E097  CL    Symbol for Cancel Line (8)
  E098        Picture of Bell (9)
  E099  BP    Word Processing Symbol BP (10)
  E09A  BE    Word Processing Symbol BE (10,11)
  E09B  FN    Word Processing Symbol FN (10)
  E09C  FE    Word Processing Symbol FE (10,11)
  E09D  HF    Word Processing Symbol HF (10)
  2426        Symbol for Substitute Form Two (Reverse Question Mark) (12)

Notes:
 (1) ISO name for SO [18].
 (2) ISO name for SI [18].
 (3) From JIS C 6225-1979 / ISO # 74 [28].
 (4) ISO name for FS [18].
 (5) ISO name for GS [18].
 (6) ISO name for RS [18].
 (7) ISO name for US [18].
 (8) Used on HP terminals [11,12].
 (9) Used on HP terminals in place of Symbol for BEL (U+2407) [11].
(10) From the Data General Word Processing Set [2].
(11) Conflict/Coincidence with Hex Byte; see Note (3) in Section 4.
(12) The upright reverse question mark is used by DEC VT terminals to
     indicate that an invalid code was received. It also stands for SUB
     and/or RS in Wyse display controls mode [25,26], and is the glyph for
     0xFF in the Televideo Multinational Character Set [23]. And it is also
     a glyph in the DG Special Graphics Character Set [2]. This one is not
     in Unicode at present, but is encoded in Amendment 18 to ISO 10646 at
     the code point shown, with the requisite shape of reverse question mark.

Summary:
  14 characters, E090-E09D.

Section 5 Summary:
  Unicode Controls: 32 new characters, E000-E01F
  C0 Controls: 0 new characters
  C1 Controls: 32 new characters, E020-E03F
  EBCDIC Controls: 33 new characters, E040-E060
  3270 Controls: 16 new characters, E070-E07F
  3270 Indicators: 8 new characters, E080-E087
  Misc Controls: 14 new characters, E090-E09D

Total Control Pics: 135
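
The total can be checked from the ranges themselves. The sketch below
assumes the Misc Controls range ends at E09D, as in the Table 5.6 summary,
and counts every position in each range, including positions the census
marks as reserved. The names SECTION_5_RANGES and range_size are
hypothetical.

```python
# Inclusive code-point ranges taken from the Section 5 summaries.
SECTION_5_RANGES = {
    "Unicode Controls": (0xE000, 0xE01F),
    "C1 Controls":      (0xE020, 0xE03F),
    "EBCDIC Controls":  (0xE040, 0xE060),
    "3270 Controls":    (0xE070, 0xE07F),
    "3270 Indicators":  (0xE080, 0xE087),
    "Misc Controls":    (0xE090, 0xE09D),
}


def range_size(first, last):
    """Number of code points in an inclusive range."""
    return last - first + 1


sizes = {name: range_size(*r) for name, r in SECTION_5_RANGES.items()}
total = sum(sizes.values())
```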

6. MATH SYMBOLS

Unicode has a generous supply of math symbols, and no doubt more are in the
works. And of course it also includes the Latin, Greek, Fraktur, Hebrew,
and other letters used in mathematical notation.

However, terminal emulators also need special glyphs designed to be joined
together in adjacent character cells, vertically or horizontally, to form
large math symbols such as integrals, summation signs, braces, or brackets.
The integral top and bottom at U+2320 and U+2321 are existing examples.
Several other single-cell characters are also missing, including the small
radical sign from the DEC Technical character set. Table 6.1 lists the
needed characters, along with suggested temporary codes for them. At least
one real terminal reference is shown for each character, in column/row
notation, or an IBM Graphic Character Global Identifier (GCGID) [14].

Legend:
  SB = Square Bracket
  UL = Upper Left
  LL = Lower Left
  UR = Upper Right
  LR = Lower Right

Table 6.1: Math Symbols for Terminals

  Code Description                             Reference
  E0A0 Extensible left brace middle            DEC Tech 02/15
  E0A1 Extensible left parenthesis bottom      DEC Tech 02/12, IBM SS210000
  E0A2 Extensible left parenthesis top         DEC Tech 02/11, IBM SS200000
  E0A3 Extensible left SB bottom               DEC Tech 02/08
  E0A4 Extensible left SB top                  DEC Tech 02/07
  E0A5 Extensible right brace middle           DEC Tech 03/00
  E0A6 Extensible UR or LL brace section       IBM SS240000
  E0A7 Extensible LR or UL brace section       IBM SS250000
  E0A8 Extensible right parenthesis bottom     DEC Tech 02/14, IBM SS230000
  E0A9 Extensible right parenthesis top        DEC Tech 02/13, IBM SS220000
  E0AA Extensible right SB bottom              DEC Tech 02/10
  E0AB Extensible right SB top                 DEC Tech 02/09
  E0AC Summation symbol bottom                 DEC Tech 03/02, DG Math 01/09(1)
  E0AD Summation symbol top                    DEC Tech 03/01, DG Math 01/08(1)
  E0AE Right ceiling corner                    DEC Tech 03/05
  E0AF Right floor corner                      DEC Tech 03/06
  E0B0 Radical symbol, small                   DEC Tech 00/01
  E0B1 Radical symbol with stroke              DG Math 01/13
  E0B2 Superscript Latin small letter i        SNI Math 03/00
  E0B3 Latin small letter a with underbar      SNI Math 04/04 (2)
  E0B4 Latin capital letter O with underbar    SNI Math 04/09 (2)
  E0B5 Superscript almost-equal-to sign        SNI IBM 06/12
  E0B6 Superscript capital Greek letter Sigma  SNI IBM 06/13
  E0B7 Superscript infinity sign               SNI IBM 07/12
  E0B8 Superscript proportional-to sign        SNI IBM 07/13

References:
  DEC Tech = Digital Equipment Corporation Technical Character Set [5]
  SNI Math = Siemens Nixdorf Mathematisch [21]
  SNI IBM = Siemens Nixdorf IBM [21]
  DG Math = Data General Word-Processing, Greek, and Math Character Set [2]
  IBM = IBM Graphic Character Global Identifier (GCGID) [14]

Notes:
 (1) Also GCGID SS280000 and SS290000.
 (2) These are like feminine and masculine ordinal, respectively, but full
     size, not superscripts.

Summary: 25 new characters, E0A0-E0B8.
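
To illustrate how the top and bottom pieces are meant to be used, the
sketch below stacks them in the first column of two adjacent rows. The
names TWO_ROW_SYMBOLS and stack are hypothetical, the code points are the
temporary ones from Table 6.1, and symbols taller than two rows (which
would need a middle piece) are deliberately left out.

```python
# (top piece, bottom piece) for symbols spanning two character rows.
TWO_ROW_SYMBOLS = {
    "summation":         ("\uE0AD", "\uE0AC"),
    "left parenthesis":  ("\uE0A2", "\uE0A1"),
    "right parenthesis": ("\uE0A9", "\uE0A8"),
    "left SB":           ("\uE0A4", "\uE0A3"),
    "right SB":          ("\uE0AB", "\uE0AA"),
    "integral":          ("\u2320", "\u2321"),  # already in Unicode
}


def stack(name, body_rows):
    """Prefix two rows of text with the top and bottom pieces of a symbol."""
    top, bottom = TWO_ROW_SYMBOLS[name]
    if len(body_rows) != 2:
        raise ValueError("this sketch handles exactly two rows")
    return [top + body_rows[0], bottom + body_rows[1]]
```

For example, stack("summation", [" n", " i=1"]) yields two screen rows
whose first cells join to form one large summation sign, provided the font
draws the pieces to meet at the cell boundary.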

7. LINE, BOX, AND BLOCK CHARACTERS

A particular need addressed by this proposal is the continued ability to
support (sometimes mission-critical) terminal-based forms-filling
applications that also require entry and display of international
characters, as terminals are replaced by PCs. So far, Unicode has provided
the international characters, but not necessarily all the needed
character-cell based forms-drawing capabilities.

Some terminals have vertical and horizontal lines that are not centered
within the character cell, and currently not found in Unicode. Others have
black rectangles or other shapes not found in the U+2580 block.

Table 7.1 lists the additional line, box, and block characters needed to
emulate the target terminals.

Abbreviations:
  V = Vertical
  H = Horizontal
  L = Left
  R = Right
  LL = Lower Left
  LR = Lower Right
  UL = Upper Left
  UR = Upper Right

Terminology:

Quadrant
  A black rectangle filling one quarter of a cell, with one corner in the
  center and the opposite corner at a corner of the cell. So "Quadrant UL"
  is the upper left quadrant; "Quadrant UL and UR" is the top half of the
  cell (which happens to be coincident with U+2580 and so is not included
  here).

Line
  Refers to a line that extends all the way to opposite edge(s) of a cell,
  designed to be joined to (a) line(s) in the adjacent cell(s).

Bar
  Refers to a horizontal line that does not touch any cell edges.

Wedge
  Refers to a character cell with a diagonal line connecting opposite
  corners, dividing it into two triangles; one black, the other white; the
  wedge is the black part. Thus an UL Wedge is similar to U+25E9, except it
  fills the entire character cell.

Framus
  (Pick a better word!) is a shape composed of two triangles with their
  points meeting at the center of the cell to form an X with bars across the
  top and bottom, closing the open ends. A black framus has the two
  triangles filled in; a white one is in outline form. A framus with center
  bar has a horizontal line through the center of the cell.

Figure 7.1: "Framus" Glyphs

    White        Black      With Bar
   *******      *******      *******
    *   *        *****        *   *
     * *          ***          * *
      *            *        *********
     * *          ***          * *
    *   *        *****        *   *
   *******      *******      *******

Table 7.1: Additional Line, Box, and Block Characters

  Code Description                  References
  E0D0 L V box line, extensible     H19 07/12 (1)
  E0D1 R V box line, extensible     H19 07/13 (1)
  E0D2 UL Wedge                     H19 07/02, IBM SF870000
  E0D3 UR Wedge                     H19 05/14, IBM SF860000
  E0D4 LL Wedge                     IBM SF850000
  E0D5 LR Wedge                     IBM SF840000
  E0D6 H line - Scan 1              DSG 06/15, H19 07/10, WG3 05/00, TVI 09/00
  E0D7 H line - Scan 3              DSG 07/00, Wyse ANSI 01/01, WG3 05/00
  E0D8 H line - Scan 5              DSG 07/01, Wyse ANSI 02/02 (2)
  E0D9 H line - Scan 7              DSG 07/02, Wyse ANSI 01/03, WG3 05/01
  E0DA H line - Scan 9              DSG 07/03, H19 07/11, WG3 05/01, TVI 09/01
  E0DB Quadrant LL                  H19 06/13, WG3 05/05, TVI 09/05
  E0DC Quadrant LR                  H19 06/12, WG3 05/04, TVI 09/04
  E0DD Quadrant UL                  H19 06/14, WG3 05/06, TVI 09/06
  E0DE Quadrant UL and LL and LR    WG3 05/11, TVI 09/11
  E0DF Quadrant UL and LR           H19 06/10 (3)
  E0E0 Quadrant UL and UR and LL    WG3 05/12, TVI 09/12
  E0E1 Quadrant UL and UR and LR    WG3 05/13, TVI 09/13
  E0E2 Quadrant UR                  H19 111, WG3 83, TVI 09/03
  E0E3 Quadrant UR and LL           (for completeness)
  E0E4 Quadrant UR and LL and LR    WG3 05/14, TVI 09/14
  E0E5 Full black diamond           TVI 09/02 (4)
  E0E6 Black framus                 DGM 06/08
  E0E7 Black framus + H center bar  DGM 06/09
  E0E8 White framus                 DGM 06/10
  E0E9 White framus + H center bar  DGM 06/11
  E0EA R & L arrow to V center bar  DGM 03/13
  E0EB Up arrow to H center line    DGL 02/12
  E0EC R arrow to V center line     DGL 02/13
  E0ED L arrow to V center line     DGL 02/14
  E0EE Down arrow to H center line  DGL 02/12
  E0EF Box drawing double dash H    DGL 03/12 (5)

References:
  DGM = Data General Word-Processing, Greek, and Math Character Set [2]
  DGL = Data General Line Drawing Character Set [2]
  DSG = The DEC Special Graphics Character Set [5]
  H19 = The Heath/Zenith 19 Graphics Character Set [10]
  WG3 = The Wyse Graphics 3 Character Set [25]
  TVI = The Televideo 965 Multinational Character Set [23]
  IBM = Graphic Character Global Identifier (GCGID) [14]
  Wyse ANSI = Wyse 60 "Standard ANSI", "UK ANSI", and "ANSI Graphics" [25]

Notes:
  (1) The vertical box lines are near, but not touching, the left and right
      edges of the cell, respectively, and are two pixels thick on the H19
      screen. Similar to IBM GCGID SF640000 and SF650000, respectively.
  (2) A centered horizontal line is already in Unicode at U+2500, but this
      one might need to be encoded separately if the existing one does not
      mesh well with other line and box characters.
  (3) Only on Zenith models, not original Heathkits.
  (4) Full black diamond, with points touching center of each cell wall.
  (5) Similar to U+2504 but double rather than triple.

Also note that Quadrants UL+UR, UR+LR, LL+LR, UL+LL (half blocks) are
already encoded at block U+2580.

Summary:
 32 New glyphs, Range E0D0 to E0EF.
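
With the proposed quadrants added to the half and full blocks already in
the U+2580 block, every combination of the four quadrants has a character.
The sketch below shows one possible lookup; the bit assignments and the
names QUADRANT_CHARS and quadrant_char are hypothetical, and the E0xx
values are the temporary codes from Table 7.1.

```python
# One bit per quadrant (an arbitrary, hypothetical assignment).
UL, UR, LL, LR = 1, 2, 4, 8

QUADRANT_CHARS = {
    0:                 "\u0020",  # empty cell (space)
    UL:                "\uE0DD",
    UR:                "\uE0E2",
    LL:                "\uE0DB",
    LR:                "\uE0DC",
    UL | UR:           "\u2580",  # UPPER HALF BLOCK
    LL | LR:           "\u2584",  # LOWER HALF BLOCK
    UL | LL:           "\u258C",  # LEFT HALF BLOCK
    UR | LR:           "\u2590",  # RIGHT HALF BLOCK
    UL | LR:           "\uE0DF",
    UR | LL:           "\uE0E3",
    UL | LL | LR:      "\uE0DE",
    UL | UR | LL:      "\uE0E0",
    UL | UR | LR:      "\uE0E1",
    UR | LL | LR:      "\uE0E4",
    UL | UR | LL | LR: "\u2588",  # FULL BLOCK
}


def quadrant_char(mask):
    """Return the character whose black quadrants match the 4-bit mask."""
    return QUADRANT_CHARS[mask & 0xF]
```

This also shows why "Quadrant UR and LL" is listed for completeness:
without it, one of the sixteen combinations would have no character.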

8. UNFINISHED BUSINESS

The selection of characters presented in this draft is far from
comprehensive. Hundreds of other terminals from the past 30+ years are
likely to have glyphs or entire character sets covered neither here nor
in Unicode, and these might or might not be important in some application
somewhere. Readers are invited, therefore, to propose any needed
additions, bearing in mind that Unicode code space is not unlimited.

Several character sets found in the references consulted are ignored here,
fully or in part, due to lack of motivation (nobody has ever asked us, in
our role of terminal emulator maker, to support them). Obviously these, and
any other missing sets (such as the many Videotex/Teletex/etc mosaic sets),
can be considered if there is a demand.

Siemens Nixdorf Facet
  A set of 95 mosaic graphics, but not resembling any of the ISO Videotex
  mosaic sets; difficult to describe.

Siemens Nixdorf Klammern (Brackets)
  A set of 95 assorted blobs, bracket and brace pieces, clocks, arrows,
  hourglasses, and Greek letters, some of which are unique; others can be
  unified with existing Unicode characters or characters in this proposal.

Hewlett Packard Line Drawing
  Mostly coincident with Unicode box-drawing set at U+2500, but with a
  handful of unique characters, such as single-to-triple box intersections,
  single-to-double intersections with wide spacing, etc. These should be
  mappable to existing U+25xx glyphs without causing riots in the streets.

Hewlett Packard Big Character Pieces
  Thick line segments for drawing large characters, used on the HP-2648.

9. SUMMARY OF PROPOSED ADDITIONAL CHARACTERS

If all the proposed new characters are added to the UCS, this will enable
terminal emulators to fully handle at least the following terminal character
sets, which were not previously covered in full:

  ASCII/ISO Display Controls for DEC, Hewlett Packard, Televideo, and others.
  EBCDIC Display Controls for the IBM 3270
  Hexadecimal debugging
  DEC Technical
  DEC Special Graphics
  Data General Word-Processing, Greek, and Math (1)
  Data General Line Drawing
  Heath/Zenith 19 Graphics
  Hewlett Packard 2621 and HPTERM
  Siemens Nixdorf's "IBM" set (plus parts of its Klammern and Facet sets)
  Televideo Multinational
  Wyse Graphics 3 (Graphics 1 and 2 were already covered)
  Wyse "Standard ANSI", "UK ANSI", and "ANSI Graphics"

Notes:
 (1) Except the DG logo character, which is presumed off limits.

Terminals supporting these character sets are numerous indeed. An
incomplete list includes: DEC VT100, VT102, VT220/240, VT320/330/340, VT420,
VT520/525; Data General 210, 215, 217, 413, and 463; the Heath / Zenith 19;
the Perkin Elmer 550 and 1100; and numerous Televideo and Wyse models.

The new characters proposed in this document are listed in Table 9.1.

Priorities:

For terminal emulation the most important categories are, in descending order:
 1. Line, Box, and Block characters
 2. Extensible math symbols
 3. C1 and EBCDIC control pictures
 4. Hex bytes

For adding debugging capabilities to Unicode applications in general:
 1. Hex bytes
 2. Unicode control pictures
 3. C1 and EBCDIC control pictures

Table 9.1: Census of New Characters

  Code Description
  E000 Symbol for En Quad
  E001 Symbol for Em Quad
  E002 Symbol for En Space
  E003 Symbol for Em Space
  E004 Symbol for Three-Per-Em-Space
  E005 Symbol for Four-Per-Em-Space
  E006 Symbol for Six-Per-Em-Space
  E007 Symbol for Figure Space
  E008 Symbol for Punctuation Space
  E009 Symbol for Thin Space
  E00A Symbol for Hair Space
  E00B Symbol for Zero-Width Space
  E00C Symbol for Zero-Width Non-Joiner
  E00D Symbol for Zero-Width Joiner
  E00E Symbol for Left-to-Right Mark
  E00F Symbol for Right-to-Left Mark
  E010 Symbol for Line Separator
  E011 Symbol for Paragraph Separator
  E012 Symbol for Left-to-Right Embedding
  E013 Symbol for Right-to-Left Embedding
  E014 Symbol for Pop Directional Formatting
  E015 Symbol for Left-to-Right Override
  E016 Symbol for Right-to-Left Override
  E017 Symbol for Inhibit Symmetric Swapping
  E018 Symbol for Activate Symmetric Swapping
  E019 Symbol for Inhibit Arabic Form Shaping
  E01A Symbol for Activate Arabic Form Shaping
  E01B Symbol for National Digit Shapes
  E01C Symbol for Nominal Digit Shapes
  E01D Symbol for Zero Width No Break Space
  E01E Symbol for Not A Character (Byte Order)
  E01F Symbol for Not A Character
  E020 (Reserved)
  E021 (Reserved)
  E022 Symbol for Break Permitted Here
  E023 Symbol for No Break Here
  E024 Symbol for Index
  E025 Symbol for Next Line
  E026 Symbol for Start Selected Area
  E027 Symbol for End Selected Area
  E028 Symbol for Character Tabulation Set
  E029 Symbol for Character Tabulation with Justification
  E02A Symbol for Line Tabulation Set
  E02B Symbol for Partial Line Forward
  E02C Symbol for Partial Line Backward
  E02D Symbol for Reverse Line Feed
  E02E Symbol for Single Shift 2
  E02F Symbol for Single Shift 3
  E030 Symbol for Device Control String
  E031 Symbol for Private Use 1
  E032 Symbol for Private Use 2
  E033 Symbol for Set Transmit State
  E034 Symbol for Cancel Character
  E035 Symbol for Message Waiting
  E036 Symbol for Start Protected (Guarded) Area
  E037 Symbol for End Protected (Guarded) Area
  E038 Symbol for Start of String
  E039 (Reserved)
  E03A Symbol for Single Character Introducer
  E03B Symbol for Control Sequence Introducer
  E03C Symbol for String Terminator
  E03D Symbol for Operating System Command
  E03E Symbol for Privacy Message
  E03F Symbol for Application Program Command
  E040 Symbol for Select
  E041 Symbol for Required New Line
  E042 Symbol for Graphic Escape
  E043 Symbol for Superscript
  E044 Symbol for Repeat
  E045 Symbol for Restore
  E046 Symbol for Program Operator Communication
  E047 Symbol for Unit Back Space
  E048 Symbol for Customer Use 1
  E049 Symbol for Interchange File Separator
  E04A Symbol for Interchange Group Separator
  E04B Symbol for Interchange Record Separator
  E04C Symbol for Interchange Unit Separator
  E04D Symbol for Digit Select
  E04E Symbol for Start of Significance
  E04F Symbol for Word Underscore
  E050 Symbol for Bypass
  E051 Symbol for Set Attribute
  E052 Symbol for Start Field Extended
  E053 Symbol for Set Mode
  E054 Symbol for Control Sequence Prefix
  E055 Symbol for Modify Field Attribute
  E056 (Reserved)
  E057 (Reserved)
  E058 Symbol for Index Return
  E059 Symbol for Presentation Position
  E05A Symbol for Transparent
  E05B Symbol for Numeric Backspace
  E05C Symbol for Subscript
  E05D Symbol for Indent Tabulation
  E05E Symbol for Reverse Form Feed
  E05F Symbol for Customer Use 3
  E060 (Reserved)
  E070 Symbol for Start Field
  E071 Symbol for Set Buffer Address
  E072 Symbol for Modify Field
  E073 Symbol for Insert Cursor
  E074 Symbol for Program Tab
  E075 Symbol for Repeat to Address
  E076 Symbol for Erase to Unprotected Address
  E077 Symbol for Vertical Channel Select
  E078 Symbol for Enable Presentation
  E079 Symbol for Inhibit Presentation
  E07A Symbol for Format
  E07B Symbol for Duplicate
  E07C Overscore asterisk
  E07D Symbol for Field Mark
  E07E Overscore semicolon
  E07F Symbol for Eight Ones
  E080 Human stick figure
  E081 Human stick figure in box
  E082 Clock at 6:10 (or 1:30)
  E083 White rectangle with stroke
  E084 Black rectangle with stroke
  E085 Lightning with stroke
  E086 Security key
  E087 Black and White Right-Pointing Triangles
  E090 Symbol for Locking Shift 1
  E091 Symbol for Locking Shift 0
  E092 Symbol for Control Extension
  E093 Symbol for Information Separator 4
  E094 Symbol for Information Separator 3
  E095 Symbol for Information Separator 2
  E096 Symbol for Information Separator 1
  E097 Symbol for Cancel Line
  E098 Picture of Bell
  E099 Word Processing Symbol BP
  E09A Word Processing Symbol BE
  E09B Word Processing Symbol FN
  E09C Word Processing Symbol FE
  E09D Word Processing Symbol HF
  E0A0 Extensible left brace middle
  E0A1 Extensible left parenthesis bottom
  E0A2 Extensible left parenthesis top
  E0A3 Extensible left SB bottom
  E0A4 Extensible left SB top
  E0A5 Extensible right brace middle
  E0A6 Extensible UR or LL brace section
  E0A7 Extensible LR or UL brace section
  E0A8 Extensible right parenthesis bottom
  E0A9 Extensible right parenthesis top
  E0AA Extensible right SB bottom
  E0AB Extensible right SB top
  E0AC Summation symbol bottom
  E0AD Summation symbol top
  E0AE Right ceiling corner
  E0AF Right floor corner
  E0B0 Radical symbol, small
  E0B1 Radical symbol with stroke
  E0B2 Superscript Latin small letter i
  E0B3 Latin small letter a with underbar
  E0B4 Latin capital letter O with underbar
  E0B5 Superscript almost-equal-to sign
  E0B6 Superscript capital Greek letter Sigma
  E0B7 Superscript infinity sign
  E0B8 Superscript proportional-to sign
  E0D0 L V box line, extensible
  E0D1 R V box line, extensible
  E0D2 UL Wedge
  E0D3 UR Wedge
  E0D4 LL Wedge
  E0D5 LR Wedge
  E0D6 H line - Scan 1
  E0D7 H line - Scan 3
  E0D8 H line - Scan 5
  E0D9 H line - Scan 7
  E0DA H line - Scan 9
  E0DB Quadrant LL
  E0DC Quadrant LR
  E0DD Quadrant UL
  E0DE Quadrant UL and LL and LR
  E0DF Quadrant UL and LR
  E0E0 Quadrant UL and UR and LL
  E0E1 Quadrant UL and UR and LR
  E0E2 Quadrant UR
  E0E3 Quadrant UR and LL
  E0E4 Quadrant UR and LL and LR
  E0E5 Full black diamond
  E0E6 Black framus
  E0E7 Black framus + H center bar
  E0E8 White framus
  E0E9 White framus + H center bar
  E0EA R & L arrow to V center bar
  E0EB Up arrow to H center line
  E0EC R arrow to V center line
  E0ED L arrow to V center line
  E0EE Down arrow to H center line
  E0EF Box drawing double dash H
  E100 Symbol for Hex Byte 00
  E101 Symbol for Hex Byte 01
  : :
  E1FF Symbol for Hex Byte FF

Summary:
  Hex bytes: 256
  Control pictures: 135
    Unicode Controls: 32
    C0 Controls: 0
    C1 Controls: 32
    EBCDIC Controls: 33
    3270 Controls: 16
    3270 Indicators: 8
    Misc Controls: 14
  Math Symbols: 25
  Line/Box/Block: 32

Total: 448
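
Because the hex byte pictures occupy a contiguous run of 256 positions,
the picture for a byte can be computed rather than looked up. A minimal
sketch, assuming the temporary base code E100 from the census above (the
names HEX_BYTE_BASE, hex_byte_picture, and dump are hypothetical):

```python
HEX_BYTE_BASE = 0xE100  # temporary code for "Symbol for Hex Byte 00"


def hex_byte_picture(byte):
    """Return the hex byte picture character for a value 0x00..0xFF."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte value out of range")
    return chr(HEX_BYTE_BASE + byte)


def dump(data):
    """Render a bytes object as a string of hex byte pictures."""
    return "".join(hex_byte_picture(b) for b in data)


# The category counts in the summary add up to the stated total.
CENSUS_TOTAL = 256 + 135 + 25 + 32
```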

10. REFERENCES

 [1] American National Standards Institute, ANSI X3.4-1986, Code for
     Information Interchange (ASCII), 1986.

 [2] Data General, Programming the Display Terminal: Models D217, D413, and
     D463, Westboro, MA, 1991.

 [3] Digital Equipment Corporation, VT100 User Guide, EK-VT100-UG-002,
     Maynard, MA, 1979.

 [4] Digital Equipment Corporation, VT100 Video Terminal User Guide,
     EK-VT102-UG-003, Maynard, MA, 1982.

 [5] Digital Equipment Corporation, VT220 Owner's Manual, EK-VT220-UG-003,
     Maynard, MA, 1984.

 [6] Digital Equipment Corporation, VT220 Series Programmer Reference
     Manual, EK-VT240-RM-002, Maynard, MA, 1984.

 [7] Digital Equipment Corporation, VT330/VT340 Programmer Reference Manual,
     Volume 1: Text Programming, ED-VT3XX-TP-002, Maynard, MA, 1988.

 [8] Digital Equipment Corporation, Installing and Using the VT420 Video
     Terminal EK-VT420-UG.002, Maynard, MA, 1988.

 [9] Digital Equipment Corporation, VT520/VT525 Video Terminal Programmer
     Information, EK-VT520-RM.A01, Maynard, MA, 1994.

[10] Heathkit Manual for the Video Terminal Model H19, The Heath Company,
     Benton Harbor, MI, 1979.

[11] Hewlett Packard 2621A/P Interactive Terminal Owner's Manual, 1978.

[12] Hewlett Packard 2648A Graphics Terminal Reference Manual, 1977.

[13] IBM System/360 Principles of Operation, GA22-6821-8, Poughkeepsie,
     NY, 1970.

[14] IBM National Language Design Guide, Volume 2: National Language
     Support Reference Manual, 4th Edition, North York, ON, 1994.

[15] IBM 3270 Information Display System, Component Description,
     GA27-2749-10, 1980.

[16] IBM 3164 ASCII Color Display Station Description, GA18-2317-1, 1986.

[17] ISO International Standard 2022, Information processing -- ISO
     7-bit and 8-bit coded character sets -- Code extension techniques,
     Third Edition, Geneva, 1986.

[18] ISO/IEC International Standard 6429, Information technology --
     Control functions for coded character sets, Third Edition, Geneva, 1992.

[19] ISO/IEC 10646-1, International Standard 10646,
     Information Processing -- Multiple-Octet Coded Character Set,
     1993-now.

[20] Perkin Elmer Model 1100 User's Manual, Randolph, NJ, 1978.

[21] Siemens Nixdorf, Bildschirmeinheit 97801-5xx Schnittstellen,
     Benutzerhandbuch, München, 1991.

[22] Televideo 922 Video Terminal Display Operator's Manual, Sunnyvale, CA,
     1984.

[23] Televideo 922 Video Terminal Display Operator's Manual, Sunnyvale, CA,
     1988.

[24] The Unicode Standard, Version 2.0, Addison-Wesley Developers
     Press, 1996.

[25] Wyse WY-60 Programmer's Guide, Wyse Technology, San Jose, CA, 1987.

[26] Wyse WY-370 Programmer's Guide, Wyse Technology, San Jose, CA, 1990.

[27] IBM 3270 Information Display System, Data Stream Programmer's Reference,
     GA23-0059-06, 1991.

[28] ISO International Register of Coded Characters to Be Used with Escape
     Sequences, European Computer Manufacturers Association (ECMA), Geneva,
     1985-present.

[29] IBM Character Data Representation Architecture, Level 1 Registry,
     IBM Canada Ltd., National Language Technical Centre, Ontario,
     SC09-1391-00, 1990.

(End)


