Re: Corrigendum #9

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Tue, 3 Jun 2014 00:33:58 +0100

On Mon, 2 Jun 2014 15:09:21 -0700
David Starner <prosfilaes_at_gmail.com> wrote:

> So certain programs can't use noncharacters internally because some
> people want to interchange them? That doesn't seem like what
> noncharacters should be used for.

Much as I don't like their uninvited use, it is possible to pass them
and other undesirables through most applications by a slight bit of
recoding at the application's boundaries. Using 99 = (3 + 32 + 64) PUA
characters (35 lead units and 64 trail units), one can ape UTF-16
surrogates and encode:

32 × 64 pairs for lone surrogates
 1 × 64 pairs to replace some of the PUA characters
 1 × 35 pairs to replace the rest of the PUA characters
 1 × 4 pairs for incoming U+FFFC to U+FFFF
 1 × 32 pairs for the other BMP noncharacters (U+FDD0 to U+FDEF)
 1 × 32 pairs for the supplementary plane noncharacters
        (the last two code points of each of planes 1 to 16).

This then frees up noncharacters for the application's use.
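
A minimal sketch of such a boundary recoding, in Python. Everything
specific here is an assumption made for illustration and not part of
the scheme as stated: the choice of U+E000..U+E062 as the 99 reserved
PUA code points, the sequential packing of the miscellaneous code
points across the three extra leads, and the names escape() and
unescape().

```python
# Hypothetical layout: the 99 reserved PUA code points start at U+E000.
BASE = 0xE000
SURR_LEAD = BASE        # 32 lead units for lone surrogates
MISC_LEAD = BASE + 32   # 3 lead units for everything else
TRAIL = BASE + 35       # 64 trail units
END = BASE + 99         # one past the last reserved code point

# Code points escaped through the 3 miscellaneous leads (167 of 192 slots).
MISC = (
    list(range(BASE, END))            # the 99 reserved PUA characters
    + list(range(0xFFFC, 0x10000))    # U+FFFC..U+FFFF
    + list(range(0xFDD0, 0xFDF0))     # the other BMP noncharacters
    + [p << 16 | low                  # U+nFFFE, U+nFFFF in planes 1..16
       for p in range(1, 17) for low in (0xFFFE, 0xFFFF)]
)
MISC_INDEX = {cp: i for i, cp in enumerate(MISC)}


def _pair(lead0, idx):
    """Encode idx as a lead/trail pair of reserved PUA characters."""
    return chr(lead0 + idx // 64) + chr(TRAIL + idx % 64)


def escape(text):
    """Apply to incoming text: hide undesirables behind PUA pairs."""
    out = []
    for ch in text:
        cp = ord(ch)
        if 0xD800 <= cp <= 0xDFFF:    # a Python str only holds lone surrogates
            out.append(_pair(SURR_LEAD, cp - 0xD800))
        elif cp in MISC_INDEX:
            out.append(_pair(MISC_LEAD, MISC_INDEX[cp]))
        else:
            out.append(ch)
    return "".join(out)


def unescape(text):
    """Apply to outgoing text: restore the original code points."""
    out = []
    i = 0
    while i < len(text):
        cp = ord(text[i])
        if BASE <= cp < TRAIL:        # a lead unit: must be followed by a trail
            if i + 1 >= len(text) or not TRAIL <= ord(text[i + 1]) < END:
                raise ValueError("lead unit without trail unit")
            trail = ord(text[i + 1]) - TRAIL
            if cp < MISC_LEAD:
                out.append(chr(0xD800 + (cp - SURR_LEAD) * 64 + trail))
            else:
                idx = (cp - MISC_LEAD) * 64 + trail
                if idx >= len(MISC):
                    raise ValueError("unassigned escape pair")
                out.append(chr(MISC[idx]))
            i += 2
        elif TRAIL <= cp < END:
            raise ValueError("trail unit without lead unit")
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

The application would call escape() on everything coming in and
unescape() on everything going out. In between, no surrogate and no
noncharacter from the outside world is present in the text, so the
application can use noncharacters as its own markers, provided it
removes them again before calling unescape().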

Richard.

Received on Mon Jun 02 2014 - 18:35:47 CDT