On Mon, 22 Aug 2011 16:18:56 -0700
Ken Whistler <kenw_at_sybase.com> wrote:
> How about Clause 12.5 of ISO/IEC 10646:
>
> <001B, 0025, 0040>
>
> You "escape" out of UTF-16 to ISO 2022, and then you can do whatever
> the heck you want, including exchange and processing of complete
> 4-byte forms, with all the billions of characters folks seem to think
> they need.
> Of course you would have to convince implementers to honor the ISO
> 2022 escape sequence...
Which they only need to if the text is in an ISO 2022 or similar
context. Your idea does suggest that a pattern of
<high><high><SO><low> would be reasonable. The shift-out code U+000E
has no meaning as a Unicode character so it wouldn't be unreasonable to
require a special check that one finds a full character if looking for
a one-character string consisting only of U+000E. We could also have
<high><high><SI><low> to give the full *two* thousand million odd
characters that would be resupported by UTF-32.
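
A rough Python sketch of how such a four-unit pattern might be packed and unpacked. The bit layout is an assumption, not something specified above: each surrogate carries 10 payload bits, and SO (U+000E) versus SI (U+000F) selects the top bit, which gives 2^30 values with SO alone and 2^31 (2,147,483,648) with both. The names encode_extended, decode_extended and is_lone_shift are hypothetical, made up for illustration.

```python
SO, SI = 0x000E, 0x000F  # shift-out / shift-in control codes


def is_high(u):
    return 0xD800 <= u <= 0xDBFF


def is_low(u):
    return 0xDC00 <= u <= 0xDFFF


def encode_extended(value):
    """Pack a 31-bit value as <high><high><SO|SI><low> (assumed layout)."""
    if not 0 <= value < 1 << 31:
        raise ValueError("value must fit in 31 bits")
    shift = SI if value >> 30 else SO          # top bit picks SI or SO
    hi1 = 0xD800 | ((value >> 20) & 0x3FF)     # bits 29..20
    hi2 = 0xD800 | ((value >> 10) & 0x3FF)     # bits 19..10
    low = 0xDC00 | (value & 0x3FF)             # bits 9..0
    return [hi1, hi2, shift, low]


def decode_extended(units):
    """Inverse of encode_extended; rejects anything not matching the pattern."""
    if len(units) != 4:
        raise ValueError("expected four code units")
    hi1, hi2, shift, low = units
    if not (is_high(hi1) and is_high(hi2) and shift in (SO, SI) and is_low(low)):
        raise ValueError("not a <high><high><SO|SI><low> sequence")
    top = 1 if shift == SI else 0
    return (top << 30) | ((hi1 & 0x3FF) << 20) | ((hi2 & 0x3FF) << 10) | (low & 0x3FF)


def is_lone_shift(units, i):
    """True if units[i] is SO/SI on its own, not the third unit of the pattern.

    This is the 'special check' a search for a one-character U+000E string
    would need under this scheme.
    """
    if units[i] not in (SO, SI):
        return False
    inside = (i >= 2 and i + 1 < len(units)
              and is_high(units[i - 2]) and is_high(units[i - 1])
              and is_low(units[i + 1]))
    return not inside
```

For instance, decode_extended(encode_extended(n)) returns n for any n below 2^31, and is_lone_shift reports False for the SO inside an encoded sequence but True for an SO sitting between ordinary characters.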
Richard.
Received on Tue Aug 23 2011 - 14:03:20 CDT