From: Peter Kirk (peterkirk@qaya.org)
Date: Fri Jan 21 2005 - 09:52:11 CST
On 21/01/2005 14:56, Arcane Jill wrote:
> What with all the BOM difficulties, and the fact that U+FEFF doubles
> up as ZERO WIDTH NO-BREAK SPACE, a new possibility occurred to me.
>
> Imagine if the codepoint U+D7FD were reserved as NOP, having
> properties which essentially made it completely ignorable and
> invisible. It could simply be thrown away, wherever it were encountered.
>
Interesting idea, Jill. But would it not be easier simply to redefine
the properties of U+FEFF so that it is effectively a NOP? I know the
name cannot be changed, but I think the relevant properties can be. This
would of course affect a small number of existing texts which make use
of the non-breaking properties of U+FEFF and have not switched to the
preferred WORD JOINER. But there are precedents for such changes in
properties which break deprecated uses of characters.

The great advantage of this is that it requires no changes to current
software for recognising and converting between encoding schemes.

This does not remove the distinction between the BOM signature and the
encoded representation of the character U+FEFF; it just means that a
failure to make the distinction has no practical effect. Note also that
processes would not be allowed to delete or insert U+FEFF, or any other
NOP, if it is actually a character. Or perhaps they could, if a new NOP
character could be made canonically equivalent to the null string.
Would that be possible?
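
The distinction between the signature and the character can be seen in a decoder that already makes it. Below is a minimal Python sketch, assuming the standard `utf-8-sig` codec as the signature-aware decoder. `ignoring_feff` is a hypothetical helper that models the proposed NOP treatment; it is not part of any standard.

```python
import codecs

# A UTF-8 byte stream: signature (EF BB BF), then "a", U+FEFF, "b".
data = codecs.BOM_UTF8 + "a\ufeffb".encode("utf-8")

# Signature-aware decoding drops only the leading BOM; the interior
# U+FEFF is an ordinary character and survives.
with_distinction = data.decode("utf-8-sig")

# Plain UTF-8 decoding makes no distinction: the leading U+FEFF is
# kept as a character as well.
without_distinction = data.decode("utf-8")


def ignoring_feff(text: str) -> str:
    """Hypothetical: view a string as if U+FEFF were a NOP."""
    return text.replace("\ufeff", "")


# The two decodings differ as code point sequences...
assert with_distinction != without_distinction
# ...but not once U+FEFF is treated as ignorable.
assert ignoring_feff(with_distinction) == ignoring_feff(without_distinction)
```

The helper removes U+FEFF only for the purpose of comparison. It does not rewrite the stored text, in keeping with the point that a process may not delete a real character.
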
Of course it would be possible to redefine the encoding schemes such
that the encoded representation of U+D7FD is not interpreted as a
character at all but is discarded on decoding. But this is something
different, and more disruptive.
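
For comparison, the decoder-level alternative might look like the following Python sketch. `NOP` and `decode_discarding_nop` are hypothetical names, U+D7FD is only the code point proposed in this thread, and no standard encoding scheme behaves this way.

```python
NOP = "\ud7fd"  # proposed in this thread; not a standard assignment


def decode_discarding_nop(data: bytes, encoding: str = "utf-8") -> str:
    """Decode normally, then drop every U+D7FD so that it never
    reaches the caller as a character."""
    return data.decode(encoding).replace(NOP, "")


# UTF-8 encodes U+D7FD as the bytes ED 9F BD.
utf8 = b"\xed\x9f\xbdab\xed\x9f\xbdc"
utf16 = "a\ud7fdbc".encode("utf-16-le")

print(decode_discarding_nop(utf8))
print(decode_discarding_nop(utf16, "utf-16-le"))
```

Unlike a change to the properties of U+FEFF, this approach changes what every decoder has to do, which is one way to read "more disruptive" here.
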
--
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/