Control Code usage

From: Mark Davis (mark_davis@taligent.com)
Date: Thu Nov 02 1995 - 14:53:53 EST


Subject: Control Code usage Time: 10:16 Date: 11/2/95

I am trying to collect some information on the use of control characters in
different text representations. I would appreciate hearing from anyone who has
additional information on these.

As far as Tab/Line/Paragraph/Page separators go, my understanding is that the
following are used on these platforms:

          Tab   LS     PS     PgS
Unix      HT    LF     none   FF
Windows   HT    CRLF   none   FF
Mac       HT    none   CR     FF
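
To make the table concrete, here is a minimal Python sketch that rewrites each
platform's separator as the Unicode Line Separator (2028) or Paragraph
Separator (2029). The names PLATFORM_SEPARATORS and to_unicode_separators are
hypothetical, and the mapping simply restates my understanding above rather
than any verified platform behavior.

```python
# Platform conventions as listed in the table (None = no dedicated code).
PLATFORM_SEPARATORS = {
    "unix":    {"tab": "\t", "line": "\n",   "paragraph": None, "page": "\f"},
    "windows": {"tab": "\t", "line": "\r\n", "paragraph": None, "page": "\f"},
    "mac":     {"tab": "\t", "line": None,   "paragraph": "\r", "page": "\f"},
}

LS = "\u2028"  # Unicode Line Separator
PS = "\u2029"  # Unicode Paragraph Separator


def to_unicode_separators(text, platform):
    """Replace the platform's line/paragraph code with Unicode LS/PS.

    Tab and page separators are left untouched, since the same codes
    (HT, FF) are used on all three platforms in the table.
    """
    conv = PLATFORM_SEPARATORS[platform]
    if conv["line"] is not None:
        text = text.replace(conv["line"], LS)
    if conv["paragraph"] is not None:
        text = text.replace(conv["paragraph"], PS)
    return text
```

For example, to_unicode_separators("one\rtwo", "mac") turns the CR into a
Paragraph Separator, while the same call with "unix" leaves it alone.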

However, particular word processors may use different conventions. For
example, MS Word on the Mac appears to use the following control characters:

Mn   Code   Meaning               Keys            Unicode

HT   09     Tab Separator         Tab             0009*
VT   0B     Line Separator        Shift Return    2028
CR   0D     Paragraph Separator   Return          2029
FF   0C     Page Separator        Shift Enter     000C*

RS   1E     Non-Breaking Hyphen   Command ~       2011
US   1F     Optional Hyphen       Command -       00AD
     CA     No-break Space        Option Space    00A0

* Although Unicode 1.1 does not specify the codes for tab and page separator,
these are in common enough usage that they can be relied upon.
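
As a rough illustration of the table above, the following Python sketch
converts a run of text bytes using these codes into Unicode. It assumes the
surrounding text is in the Mac Roman character set (Python's "mac_roman"
codec, which already maps CA to 00A0 and leaves 09 and 0C unchanged); the
function name word_mac_to_unicode is hypothetical, and this only handles
plain character data, not the Word file format itself.

```python
# Control codes that need remapping after a Mac Roman decode.
# HT (09), FF (0C) and no-break space (CA -> 00A0) need no extra handling.
WORD_MAC_TO_UNICODE = {
    0x0B: "\u2028",  # VT -> Line Separator
    0x0D: "\u2029",  # CR -> Paragraph Separator
    0x1E: "\u2011",  # RS -> Non-Breaking Hyphen
    0x1F: "\u00AD",  # US -> Optional (soft) Hyphen
}


def word_mac_to_unicode(data: bytes) -> str:
    """Decode Mac Roman bytes, then remap the control codes listed above."""
    text = data.decode("mac_roman")
    return text.translate(WORD_MAC_TO_UNICODE)
```

For example, word_mac_to_unicode(b"co\x1fop\r") yields "co", a soft hyphen,
"op", and a Paragraph Separator.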

Section Separator appears to be just a Page Separator with out-of-band info,
and there does not appear to be a Column Separator. The other word processors
that I checked out do not appear to use control codes for these functions.

On Windows, I would presume that these assignments are similar, but that Word
uses 0xAD & 0xA0 for optional-hyphen & no-break space (since those are in
Latin-1), and CRLF for Paragraph separator (can some of the MS people verify
these?).

I would appreciate it if anyone can supply additional information on platform
and application control-code usage.

Mark


