From: Doug Ewell (doug@ewellic.org)
Date: Mon Nov 01 2010 - 15:50:51 CST
I'd like to try to gauge the community's interest, if any, in some
possible updates to UTS #6 and the SCSU mechanism, as follows:
(1) Updating the spec to add dynamic-window offsets 0xA8 through 0xBF,
to permit encoding the blocks from U+A000 through U+ABFF in single-byte
mode. This would allow the many small alphabets assigned to this range,
such as Bamum, Syloti Nagri, and Phags-pa, to be encoded efficiently
using SCSU. Other offsets could be added as well, such as for Hangul
Jamo Extended-B.
(2) Updating the spec to assign "reserved" tag bytes 0x0C (single-byte
mode) and 0xF2 (Unicode mode) as "reset all" commands, similar to 0xFF
in BOCU-1. This would allow more efficient encoding in some cases, as
well as providing a possible synchronization mechanism for decoders. As
an alternative, these unused tag bytes could be released for normal,
non-reserved use, so they would no longer require escaping.
(3) Providing an informational section in UTS #6 on "line-safe SCSU," a
special-purpose SCSU encoding profile in which all state is returned to
the default at the end of each line, and all lines are terminated with
CR/LF.
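
To make item (1) concrete, here is a minimal Python sketch of how a
dynamic-window offset byte maps to a window start. The two existing
ranges follow the current UTS #6 rules. The mapping for 0xA8 through
0xBF is only an assumption for illustration: the proposal names the byte
range and the target blocks, not a formula, so a simple linear layout in
half-blocks of 0x80 is used here. The function name and the `extended`
flag are hypothetical, and the fixed offsets 0xF9 through 0xFF are not
modeled.

```python
def window_offset(x: int, extended: bool = False) -> int:
    """Return the window start selected by SCSU offset byte x.

    With extended=True, also accept the proposed range 0xA8..0xBF.
    """
    if 0x01 <= x <= 0x67:
        # Existing rule: half-blocks from U+0080 up to U+3380.
        return x * 0x80
    if 0x68 <= x <= 0xA7:
        # Existing rule: half-blocks from U+E000 up to U+FF80.
        return x * 0x80 + 0xAC00
    if extended and 0xA8 <= x <= 0xBF:
        # ASSUMPTION: 24 offsets laid out linearly over U+A000..U+ABFF.
        return 0xA000 + (x - 0xA8) * 0x80
    # 0x00, the fixed offsets 0xF9..0xFF, and reserved values end up here.
    raise ValueError(f"offset byte 0x{x:02X} is reserved or not modeled")


def single_byte(code_point: int, window_start: int) -> int:
    """Return the single-byte-mode byte (0x80..0xFF) for a code point."""
    delta = code_point - window_start
    if not 0 <= delta < 0x80:
        raise ValueError("code point is outside the window")
    return 0x80 + delta
```

Under that assumed layout, offset byte 0xB5 would select the window
starting at U+A680, and the first Bamum letter, U+A6A0, would then be
encoded as the single byte 0xA0.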
I'm aware that many people have been discouraging the use of SCSU
altogether, on the basis of Web-page security concerns or the reputation
of SCSU as "difficult to implement." These people will not be affected
one way or another by any enhancements to SCSU, and I am not focusing on
them at present.
--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ is dot gd slash 2kf0s