From: Kenneth Whistler (kenw@sybase.com)
Date: Mon Apr 13 2009 - 18:43:30 CDT
Dennie Heuer suggested:
> ... this is why i think that unicode should support the inclusion (or
> embedding) of other character sets. it should not know about them and
> how to specify them. this is the matter of a different standard.
Ah, but that standard already exists. You are reinventing
ISO 2022:
http://en.wikipedia.org/wiki/ISO_2022
> (the
> easiest way is to name or number them officially.)
And that also exists. It is called the International Register of Coded
Character Sets to be Used with Escape Sequences:
http://www.itscj.ipsj.or.jp/ISO-IR/
> however, it should
> provide a character to mark the position at which unicode is 'closed'
> or 'left'.
And that is defined by ISO/IEC 10646 itself:
"When the escape sequences from ISO/IEC 2022 are used, the
identification of a return, or transfer, from UCS to the
coding system of ISO/IEC 2022 shall be by the escape sequence
ESC 02/05 04/00. ..."
So the escape sequence <U+001B, U+0025, U+0040> gets you
from Unicode to ISO 2022, if you want to embed other
character sets using the mechanisms of that standard.
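For anyone unfamiliar with the column/row notation in that quote, here is a minimal sketch of how "ESC 02/05 04/00" maps to the code points given above. Each "cc/rr" token names a column and a row of the 7-bit code table, so the value is column * 16 + row. The helper names below are hypothetical, made up for this illustration only.

```python
def col_row_to_byte(token: str) -> int:
    """Convert ISO 2022 column/row notation such as '02/05' to a byte value."""
    column, row = token.split("/")
    return (int(column) << 4) | int(row)


def parse_escape_sequence(notation: str) -> bytes:
    """Turn a sequence like 'ESC 02/05 04/00' into the bytes it denotes."""
    out = bytearray()
    for token in notation.split():
        if token == "ESC":
            out.append(0x1B)  # ESC is position 01/11
        else:
            out.append(col_row_to_byte(token))
    return bytes(out)


# The return-from-UCS sequence quoted from ISO/IEC 10646.
RETURN_FROM_UCS = parse_escape_sequence("ESC 02/05 04/00")
CODE_POINTS = [f"U+{b:04X}" for b in RETURN_FROM_UCS]
print(RETURN_FROM_UCS, CODE_POINTS)
```

The sketch lists the sequence as bare code points; how those are serialized in a particular encoding form is not covered here.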
A warning, though: ISO 2022 is basically an implementation flop
outside of the limited context of East Asian email, where it is
used for character sets such as ISO-2022-JP, ISO-2022-CN, etc.
And I rather doubt that turning an escape sequence (which at
least has the advantage of being a widely understood and
somewhat implemented mechanism) into a single character exit
code would change anything -- you still end up with a stateful
encoding of the very type that Unicode was invented to get
away from.
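To make "stateful" concrete, here is a small sketch using Python's built-in iso2022_jp codec (chosen only for illustration; any ISO 2022 implementation would show the same thing). The same two payload bytes mean different things depending on which escape sequence preceded them. The variable names are my own.

```python
# Escape sequences that ISO-2022-JP uses to switch the active character set.
ESC_TO_JIS_X_0208 = b"\x1b$B"
ESC_TO_ASCII = b"\x1b(B"

# Encode one kanji; the codec wraps it in shift-in / shift-back escapes.
encoded = "日".encode("iso2022_jp")

# Strip the escapes to get the bare two-byte payload.
payload = encoded[len(ESC_TO_JIS_X_0208):-len(ESC_TO_ASCII)]

# Without the escape state, the payload is just two ASCII characters.
as_ascii = payload.decode("ascii")

# With the escape state restored, the same bytes are the kanji again.
as_kanji = (ESC_TO_JIS_X_0208 + payload + ESC_TO_ASCII).decode("iso2022_jp")

print(repr(as_ascii), repr(as_kanji))
```

A decoder that starts reading in the middle of such a stream cannot know which interpretation applies without scanning back for the last escape sequence.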
--Ken
P.S. If you *really* want a single character exit code, that
*also* exists already: U+000E SHIFT OUT. But no Unicode
systems implement that as a character set exit control code,
for good reasons.