From: Tex Texin (textexin@xencraft.com)
Date: Mon Apr 13 2009 - 19:46:17 CDT
More importantly, there are international standards for formatting text,
called markup languages, that are much more powerful than a small set of
control character commands would be, and which coexist quite well with
Unicode encoding and are portable.
Given the existence of, and wide support for, HTML, etc., the case for such
commands in Unicode is extremely weak.
It would also create problems to now have commands in Unicode which would
potentially interact or conflict with higher level formatting commands.
tex
-----Original Message-----
From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
Behalf Of Kenneth Whistler
Sent: Monday, April 13, 2009 4:44 PM
To: dh@triple-media.com
Cc: unicode@unicode.org; kenw@sybase.com
Subject: Re: proposal for a "Standard-Exit" or "Namespace" character
Dennie Heuer suggested:
> ... this is why i think that unicode should support the inclusion (or
> embedding) of other character sets. it should not know about them and
> how to specify them. this is the matter of a different standard.
Ah, but that standard already exists. You are reinventing
ISO 2022:
http://en.wikipedia.org/wiki/ISO_2022
> (the
> easiest way is to name or number them officially.)
And that also exists. It is called the International Register of Coded
Character Sets to be Used with Escape Sequences:
http://www.itscj.ipsj.or.jp/ISO-IR/
> however, it should
> provide a character to mark the position at which unicode is 'closed'
> or 'left'.
And that is defined by ISO/IEC 10646 itself:
"When the escape sequences from ISO/IEC 2022 are used, the
identification of a return, or transfer, from UCS to the
coding system of ISO/IEC 2022 shall be by the escape sequence
ESC 02/05 04/00. ..."
So the escape sequence <U+001B, U+0025, U+0040> gets you
from Unicode to ISO 2022, if you want to embed other
character sets using the mechanisms of that standard.
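As a rough illustration of how the column/row notation in that quote maps to
the code points above, here is a minimal Python sketch. The helper name
col_row_to_byte and the variable names are made up for this example; only
the notation and the resulting values come from the text.

```python
def col_row_to_byte(notation: str) -> int:
    """Convert ISO 2022 column/row notation such as '02/05' to a byte value.

    The column is the high nibble and the row is the low nibble,
    so '02/05' is 2 * 16 + 5 = 0x25.
    """
    column, row = notation.split("/")
    return int(column) * 16 + int(row)


ESC = 0x1B  # the ESCAPE control character

# ESC 02/05 04/00, as quoted from ISO/IEC 10646
sequence = [ESC] + [col_row_to_byte(part) for part in ("02/05", "04/00")]

# The same values written as Unicode code point labels
labels = ["U+%04X" % value for value in sequence]

# The sequence as raw bytes: ESC, '%', '@'
return_sequence = bytes(sequence)

print(labels)
print(return_sequence)
```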
A warning though: ISO 2022 is basically an implementation flop, outside
of the limited setting of East Asian email, where it underlies
encodings such as ISO-2022-JP, ISO-2022-CN, etc.
And I rather doubt that turning an escape sequence (which at
least has the advantage of being a widely understood and
somewhat implemented mechanism) into a single character exit
code would change anything -- you still end up with a stateful
encoding of the very type that Unicode was invented to get
away from.
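To make "stateful" concrete, here is a small sketch using Python's standard
iso2022_jp codec (chosen only as a convenient example of an ISO 2022 based
encoding; the variable names are made up). The same two bytes decode to
different text depending on which escape sequence came before them.

```python
# The two bytes 0x24 0x22 are '$' and '"' when read as ASCII.
payload = b'$"'

# In the initial (ASCII) state, the payload decodes as two ASCII characters.
plain = payload.decode("iso2022_jp")

# After ESC $ B (switch to JIS X 0208), the same two bytes form one
# double-byte character. ESC ( B switches back to ASCII afterwards.
shifted = (b"\x1b$B" + payload + b"\x1b(B").decode("iso2022_jp")

print(repr(plain))    # two ASCII characters
print(repr(shifted))  # one kana character

# A decoder therefore has to track state from the start of the stream;
# it cannot interpret an arbitrary slice of bytes in isolation.
```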
--Ken
P.S. If you *really* want a single character exit code, that
*also* exists already: U+000E SHIFT OUT. But no Unicode
systems implement that as a character set exit control code,
for good reasons.