From: Dominikus Scherkl (Dominikus.Scherkl@glueckkanja.com)
Date: Wed Oct 30 2002 - 06:49:04 EST
Hello.
I would like to have a "source failure indicator symbol" (SFIS)
character in Unicode, which a charset-conversion unit may
insert into a text (suggested position: U+FFF8).
Reason:
Several charsets have undefined codepoints which were
defined in a former or later version (e.g. overlong
UTF-8 encodings or the $ symbol (0x24) in the INVARIANT
charset).
A converter can replace such symbols by U+FFFD (which is
correct but loses the information), or simply use the
character which is most likely intended (which hides the error).
Neither is very good.
The SFIS would allow the reader to see that an error occurred
and that the following character may therefore be incorrect, while
maintaining readability if the right conversion was made anyway
(or at least giving a hint as to which character may be intended -
e.g. the $ sign could have been any other currency symbol
if a national 7-bit charset was changed to INVARIANT by
previous conversions).
Of course a converter can still use U+FFFD if it has no
idea which character is intended or if Unicode doesn't contain
the character.
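
To make the three converter behaviours concrete, here is a minimal
sketch in Python for the overlong UTF-8 case only. The function name
decode_with_sfis is made up for illustration, and U+FFF8 is only the
position suggested in this mail, not an assigned character. Overlong
two-byte forms of ASCII characters are emitted as SFIS + the likely
intended character; anything else that cannot be decoded becomes
U+FFFD.

```python
SFIS = "\uFFF8"         # position suggested in this mail; NOT an assigned Unicode character
REPLACEMENT = "\uFFFD"  # the standard REPLACEMENT CHARACTER


def decode_with_sfis(data: bytes) -> str:
    """Hypothetical UTF-8 decoder that marks recoverable errors with SFIS."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        # Overlong two-byte form of an ASCII character: lead byte 0xC0 or 0xC1
        # followed by one continuation byte. The intent is still recoverable.
        if lead in (0xC0, 0xC1) and i + 1 < len(data) and 0x80 <= data[i + 1] <= 0xBF:
            guess = chr(((lead & 0x1F) << 6) | (data[i + 1] & 0x3F))
            out.append(SFIS + guess)
            i += 2
            continue
        # Otherwise try to decode one well-formed sequence of 1 to 4 bytes.
        for length in (1, 2, 3, 4):
            try:
                out.append(data[i:i + length].decode("utf-8"))
                i += length
                break
            except UnicodeDecodeError:
                pass
        else:
            # No idea what was intended: fall back to U+FFFD.
            out.append(REPLACEMENT)
            i += 1
    return "".join(out)
```

For example, the overlong sequence C0 A4 (an encoding of "$" that a
strict decoder rejects) would come out as SFIS followed by "$",
while a stray FF byte would still come out as U+FFFD.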
The whole "character identities" discussion gave me another
reason to introduce such an SFIS character:
A font renderer may show the SFIS before a character which
is substituted for another one because the correct one is not
contained in the font (e.g. it may render an "a with
superscript e above" as SFIS + "a umlaut" to indicate the
error and show a probably fitting replacement, which is
much better than showing an empty square).
In short:
The SFIS may indicate a kind of compatibility decomposition
of the following character.
(This is not necessarily the standard compatibility decomposition.)
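
The renderer idea could be sketched along the following lines, again
in Python. This is only an illustration under assumptions: the
function render_with_fallback and the font_chars set (standing in for
"characters the font has glyphs for") are invented here, U+FFFD
stands in for the empty square, and the standard NFKD decomposition
is used as the fallback source even though, as said above, the
replacement need not be the standard one.

```python
import unicodedata

SFIS = "\uFFF8"       # position suggested in this mail; NOT an assigned Unicode character
MISSING = "\uFFFD"    # stand-in for the "empty square" a renderer would draw


def render_with_fallback(text: str, font_chars: set) -> str:
    """Hypothetical renderer step: substitute missing characters, marked by SFIS."""
    out = []
    for ch in text:
        if ch in font_chars:
            out.append(ch)
            continue
        # Assumption: use the standard compatibility decomposition as the
        # "probably fitting replacement"; a real renderer might use its own table.
        fallback = unicodedata.normalize("NFKD", ch)
        if fallback != ch and all(c in font_chars for c in fallback):
            out.append(SFIS + fallback)
        else:
            out.append(MISSING)
    return "".join(out)
```

With a font that only covers "x" and "2", the text "x" + SUPERSCRIPT
TWO would be shown as "x" + SFIS + "2", whereas a character with no
usable decomposition would still get the empty square.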
I'd like to hear if my suggestion is completely weird or
if anybody else thinks it might be useful.
Best Regards.
-- Dominikus Scherkl dominikus.scherkl@glueckkanja.com