Frank asked:
>
> Can anybody tell me where to find out what ISO means when it assigns an ISO
> 2022 escape sequence for a "coding system different from ISO 2022" (such as,
> for example, NAPLPS, or UCS-4, or UTF-8)? Is the intention to identify the
> coding system to the recipient, so it can switch to it, and also disable
> ISO-2022 character-set designation and invocation from that moment onwards,
> since we have now switched to a new coding system in which we will not
> necessarily be able to recognize escape sequences for further switching?
>
> In particular, I'm curious about an environment in which the host switches
> the terminal to the UTF-8 coding system. Since Unicode includes ASCII as
> well as C0 and C1 controls (and so UTF-8 can include both sets of controls
> too), should it be possible to switch out of UTF-8 coding once having
> switched into it? (I know, why would anybody ever want to switch out of
> UTF-8? :-)
This stuff is all laid out in excruciating detail in 10646 as regards
10646 and its encoding forms in particular.
Amendment 2 to 10646 (UTF-8) states (among other things):
"When the escape sequences from ISO/IEC 2022 are used, the
identification of a return, or transfer, from UTF-8 to the
coding system of ISO/IEC 2022 shall be as specified in
17.5 for a return or transfer from UCS."
And clause 17.5 of 10646 states:
"When the escape sequences from ISO/IEC 2022 are used, the
identification of a return, or transfer, from UCS to the coding
system of ISO/IEC 2022 shall be by the escape sequence ESC 02/05 04/00.
If such an escape sequence appears within a CC-data-element conforming
to ISO/IEC 10646, it shall be padded in accordance with clause 16.
If such an escape sequence appears within a CC-data-element conforming to
ISO/IEC 2022, it shall consist only of the sequences of bit combinations
as shown above."
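For anyone not used to the column/row notation in that clause: "02/05" means column 2, row 5 of the code table, i.e. the bit combination 0x25. The following is only a small sketch of that conversion; the helper name bit_combination is made up for illustration and is not taken from either standard.

```python
def bit_combination(notation: str) -> int:
    """Convert column/row notation such as "02/05" to a byte value.

    Hypothetical helper: the column is the high nibble, the row the low nibble.
    """
    column, row = (int(part) for part in notation.split("/"))
    if not (0 <= column <= 15 and 0 <= row <= 15):
        raise ValueError(f"column and row must be in 0..15: {notation!r}")
    return column * 16 + row


ESC = 0x1B  # ESC is bit combination 01/11

# ESC 02/05 04/00, written as plain 8-bit bit combinations
RETURN_TO_2022 = bytes([ESC, bit_combination("02/05"), bit_combination("04/00")])

print(RETURN_TO_2022.hex(" "))
```

The resulting three octets are the unpadded form that the last sentence of the clause calls for inside an ISO/IEC 2022 data element.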
In other words, to get from UCS-2 (or UTF-16) to 2022, you
use U+001B U+0025 U+0040. To get from UTF-8 to 2022, you
use 0x1B 0x25 0x40 (ESC "%@"). For UCS-4, it would be
U-0000001B U-00000025 U-00000040.
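As a rough illustration of the three forms above, here is a sketch that writes the same escape sequence out as octets for each encoding form. It assumes most-significant-octet-first (big-endian) serialization for UCS-2 and UCS-4, which is my assumption for the example, and the helper name padded is hypothetical.

```python
# ESC 02/05 04/00 as a character string: ESC, "%", "@"
RETURN_TO_2022 = "\x1b%@"


def padded(sequence: str, octets_per_unit: int) -> bytes:
    """Pad each character of the sequence to a fixed-width code unit.

    Assumes big-endian octet order; valid here because every character
    in the sequence has a code value below 0x80.
    """
    return b"".join(ord(ch).to_bytes(octets_per_unit, "big") for ch in sequence)


utf8 = RETURN_TO_2022.encode("utf-8")  # one octet per character, no padding
ucs2 = padded(RETURN_TO_2022, 2)       # two octets per character
ucs4 = padded(RETURN_TO_2022, 4)       # four octets per character

for name, data in (("UTF-8", utf8), ("UCS-2", ucs2), ("UCS-4", ucs4)):
    print(f"{name}: {data.hex(' ')}")
```

Because all three characters are in the ASCII range, the UTF-8 form is byte-for-byte identical to the plain ISO/IEC 2022 escape sequence, while the UCS-2 and UCS-4 forms differ only by the leading zero octets in each code unit.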
--Ken