Please fill Sections A, B and C below. Section D will be filled by SC 2/WG 2.
For instructions and guidance for filling in the form please see the document
"Principles and Procedures for Allocation of New Characters and Scripts"
(http://www.dkuug.dk/JTC1/SC2/WG2/prot)
1. Title: Disunify braces/brackets for math, computing science, and Z notation
from similar-looking CJK braces/brackets
2. Requester's name: Kent Karlsson (and Asmus Freytag?)
3. Requester type (Member body/Liaison/Individual contribution):
4. Submission date: 2001-01-16
5. Requester's reference (if applicable):
6. (Choose one of the following:)
This is a complete proposal
1.b. The proposal is for addition of character(s) to an existing block: X
Name of the existing block: Miscellaneous mathematical symbols
2. Number of characters in proposal: 6
3. Proposed category (see section II, Character Categories):
4. Proposed Level of Implementation (see clause 15, ISO/IEC 10646-1): 1
Is a rationale provided for the choice? Yes
If Yes, reference: (simple graphical characters, no combining or other
implementation difficulties)
5. Is a repertoire including character names provided?: Yes
a. If YES, are the names in accordance with the 'character naming guidelines'
in Annex K of ISO/IEC 10646-1? Yes
b. Are the character shapes attached in a reviewable form? […]
6. Who will provide the appropriate computerized font (ordered preference:
True Type, PostScript or 96x96 bit-mapped format) for publishing the standard?
Unicode Consortium?
If available now, identify source(s) for the font (include address, e-mail,
ftp-site, etc.) and indicate the tools used:
7. References:
a. Are references (to other character sets, dictionaries, descriptive texts
etc.) provided?
b. Are published examples (such as samples from newspapers, magazines, or
other sources) of use of proposed characters attached?
8. Special encoding issues:
Does the proposal address other aspects of character data processing (if
applicable) such as input, presentation, sorting, searching, indexing,
transliteration etc. (if yes please enclose information): No.
1. Has this proposal for addition of character(s) been submitted before? No.
If YES explain
2. Has contact been made to members of the user community (for example:
National Body, user groups of the script or characters, other experts, etc.)?
If YES, with whom?
If YES, available relevant documents?
3. Information on the user community for the proposed characters (for example:
size, demographics, information technology use, or publishing use) is
included? No.
Reference:
4. The context of use for the proposed characters (type of use, common or
rare) Common in math, computing science, and Z notation
Reference:
5. Are the proposed characters in current use by the user community? Yes
If YES, where? Reference:
6. After giving due considerations to the principles in N 1352 must the
proposed characters be entirely in the BMP? Yes
If YES, is a rationale provided? Yes
If YES, reference: (Co-location with similar (and less used) characters in
the misc. math. symbols block.)
7. Should the proposed characters be kept together in a contiguous range
(rather than being scattered)? Nearly (see detailed proposal below).
8. Can any of the proposed characters be considered a presentation form of an
existing character or character sequence? No.
If YES, is a rationale for its inclusion provided?
If YES, reference:
9. Can any of the proposed character(s) be considered to be similar (in
appearance or function) to an existing character? Yes.
If YES, is a rationale for its inclusion provided? Yes.
If YES, reference: (Though similar in appearance to some CJK punctuation,
the use context, typographic appearance, and typographic spacing properties are
different.)
10. Does the proposal include use of combining characters and/or use of
composite sequences (see clause 4.11 and 4.13 in ISO/IEC 10646-1)? No.
If YES, is a rationale for such use provided?
If YES, reference:
Is a list of composite sequences and their corresponding glyph images (graphic
symbols) provided?
If YES, reference: N/A
11. Does the proposal contain characters with any special properties such as
control function or similar semantics? No.
If YES, describe in detail (include attachment if necessary)
1. Relevant SC 2/WG 2 document numbers:
2. Status (list of meeting number and corresponding action or disposition):
3. Additional contact to user communities, liaison organizations etc:
4. Assigned category and assigned priority/time frame:
A. List of suggested characters
Suggested code positions and names for the six characters in this proposal:
2997, LEFT DOUBLE SQUARE BRACKET
2998, RIGHT DOUBLE SQUARE BRACKET
29D8, LEFT ANGLE BRACE
29D9, RIGHT ANGLE BRACE
29DA, LEFT DOUBLE ANGLE BRACE
29DB, RIGHT DOUBLE ANGLE BRACE
B. Use of the suggested characters
The double square brackets,
2997, LEFT DOUBLE SQUARE BRACKET
2998, RIGHT DOUBLE SQUARE BRACKET
are commonly used in computing science (and Z notation) as “abstract
syntax” brackets. In papers and books they are usually produced by kerning
[[ and ]], respectively, until the glyphs touch (when TeX is used, sometimes
via custom-made TeX commands whose names vary).
The single angle braces,
29D8, LEFT ANGLE BRACE
29D9, RIGHT ANGLE BRACE
are commonly used in math and computing science as tuple brackets (or
sequence brackets, as in Z notation). These are produced in LaTeX by \langle
and \rangle.
The double angle braces,
29DA, LEFT DOUBLE ANGLE BRACE
29DB, RIGHT DOUBLE ANGLE BRACE
are used in Z notation as data braces. These may be produced in LaTeX by
custom-made \ldata and \rdata commands.
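The TeX conventions mentioned in this section can be sketched as follows. This is only an illustration: the macro names \ldbrack and \rdbrack are hypothetical (as noted above, such names vary between authors), the definitions of \ldata and \rdata are an assumed approximation rather than those of any particular Z package, and the negative thin space \! brings the glyphs close together without guaranteeing that they touch in every font.

```latex
\documentclass{article}
% Hypothetical names for the kerned double square brackets.
\newcommand{\ldbrack}{[\![}
\newcommand{\rdbrack}{]\!]}
% Assumed approximations of the Z data braces; \providecommand leaves
% any definition already supplied by a loaded package untouched.
\providecommand{\ldata}{\langle\!\langle}
\providecommand{\rdata}{\rangle\!\rangle}
\begin{document}
$\ldbrack e \rdbrack$ \quad $\langle a, b \rangle$ \quad $\ldata x \rdata$
\end{document}
```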
C. Similar characters
Note that the “miscellaneous technical” symbols
2329, LEFT-POINTING ANGLE BRACKET
232A, RIGHT-POINTING ANGLE BRACKET
are in Unicode canonically equivalent to
3008, LEFT ANGLE BRACKET
3009, RIGHT ANGLE BRACKET
respectively.
Making these canonically equivalent may have been a mistake, but the
equivalence is firmly entrenched and cannot now be revoked.
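The canonical equivalence can be observed with any Unicode normalization implementation. The following is a minimal sketch using Python's standard unicodedata module; the helper name normalizes_to is introduced here for illustration only.

```python
import unicodedata

# (miscellaneous technical character, CJK punctuation character)
PAIRS = [("\u2329", "\u3008"), ("\u232A", "\u3009")]


def normalizes_to(source, target):
    """True if both NFC and NFD map `source` to `target`."""
    return all(
        unicodedata.normalize(form, source) == target
        for form in ("NFC", "NFD")
    )


for technical, cjk in PAIRS:
    normalized = unicodedata.normalize("NFC", technical)
    print(f"U+{ord(technical):04X} normalizes to U+{ord(normalized):04X}")
```

Because the mapping is applied by every normalization form, text that passes through normalization cannot keep 2329/232A distinct from 3008/3009.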
3008 and 3009 are CJK punctuation characters, similar in use to
2039, SINGLE LEFT-POINTING ANGLE QUOTATION MARK
203A, SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
and used in normal running text.
U+3008 (U+2329) and U+3009 (U+232A) are typeset with extra white-space
on the outer side to make each of them as wide as a CJK ideograph. This makes
2329/3008 and 232A/3009 unsuitable for common use in math expressions.
Further,
300A, LEFT DOUBLE ANGLE BRACKET
300B, RIGHT DOUBLE ANGLE BRACKET
are also CJK punctuation characters, similar in use to
00AB, LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
00BB, RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
and used in normal running text.
U+300A and U+300B are also typeset with extra white-space on the outer
side to make each of them as wide as a CJK ideograph. This makes 300A and 300B
unsuitable for use in math expressions (Z notation in particular).
Finally,
301A, LEFT WHITE SQUARE BRACKET
301B, RIGHT WHITE SQUARE BRACKET
are also CJK punctuation.
U+301A and U+301B are also typeset with extra white-space on the outer
side to make each of them as wide as a CJK ideograph. This makes 301A and 301B
unsuitable for use in computing science expressions.
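The ideograph-wide treatment of these CJK brackets is also recorded in the Unicode East_Asian_Width property, where they are classed as wide (W). The following sketch queries that property through Python's standard unicodedata module; the helper name width_classes is introduced here for illustration only.

```python
import unicodedata

# The CJK brackets discussed in this section.
CJK_BRACKETS = ["\u3008", "\u3009", "\u300A", "\u300B", "\u301A", "\u301B"]


def width_classes(chars):
    """Map each character (as U+XXXX) to its East_Asian_Width class."""
    return {f"U+{ord(c):04X}": unicodedata.east_asian_width(c) for c in chars}


for code, width in width_classes(CJK_BRACKETS).items():
    print(code, width)
```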
D. Discussion
This is a proposal to disunify the math brackets from the CJK quotation
brackets, adding the math versions to the new miscellaneous math symbols block,
where there are similar brackets/braces.
If possible, that block could be rearranged to put all the
brackets/braces (including these disunified ones) together.
Because the characters similar to the ones suggested here are in the CJK
punctuation block, they are not widely recognised as “mathematical” characters.
Font and other support that otherwise tries to cover “mathematical” characters
(including the standard “character collections” listed in 10646 itself) usually
omits all CJK punctuation, even though some of the CJK punctuation is
currently unified with the “mathematical” characters suggested here.
Further, take the single LEFT ANGLE characters as an example (written here
as < for illustration). If two of them are displayed (printed) together, they
look like “<<” in Latin typography and like “ < <” (note the white-space to
the left of each < glyph) in East Asian typography. An appearance like
“ < <” is not acceptable in East Asian typography either, so software
recognizes the character (not the glyph) and kerns the pair as follows:
“ <<” (notice that the space on the left of the first character remains).
The other CJK punctuation characters are handled similarly. This kerning is
often called 'ideal width' or 'algorithmic kerning', since it does not use
glyph metrics but assumes the presence of the white space from knowledge of
the character code. This is very different from what is done in Latin
typography, but normal in East Asian usage.
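The character-code-based kerning described above can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular layout engine: every character is assumed to be one em wide, the blank outer half of a bracket is assumed to be half an em, and the mirrored rule for closing brackets is an assumed extension of the opening-bracket case described in the text. The function name advances is hypothetical.

```python
# CJK brackets whose blank half is on the left (opening) or right (closing).
OPENING = {"\u3008", "\u300A", "\u301A"}
CLOSING = {"\u3009", "\u300B", "\u301B"}

FULL = 1.0  # assumed advance width of an ideograph, in em
HALF = 0.5  # assumed advance width once the blank outer half is removed


def advances(text):
    """Return one advance width (in em) per character of `text`.

    The decision uses only character codes, never glyph metrics.
    """
    result = []
    for i, ch in enumerate(text):
        width = FULL
        if ch in OPENING and i > 0 and text[i - 1] in OPENING:
            # Second and later opening brackets lose their left blank.
            width = HALF
        elif ch in CLOSING and i + 1 < len(text) and text[i + 1] in CLOSING:
            # A closing bracket followed by another loses its right blank.
            width = HALF
        result.append(width)
    return result


print(advances("\u3008\u3008"))
```

For two consecutive U+3008 characters the first keeps its full width, including the blank on its left, and only the second is narrowed, matching the “ <<” appearance described above.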
For these reasons, it is useful to disunify the brackets/braces by adding new
“mathematical” ones to the new Miscellaneous Mathematical Symbols block, as
suggested, where similar “math” characters reside.
ISO/IEC WD2.6 13568, Formal Specification – Z Notation – Syntax, type and semantics.
TeX, LaTeX books…
Math text books…
CS text books…