From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Mar 13 2011 - 16:06:33 CST
2011/3/13 Doug Ewell <doug@ewellic.org>:
> Philippe Verdy wrote:
>
>>> Modifying all existing electronic text to include such an invisible
>>> control character,
>>
>> Why « all » texts ? This was not in the proposal.
>
>>> and requiring all users and processes to enter it reliably,
>>
>> Why « all » users ? Here again not in the proposal. In fact all characters
>> are encoded for an undefined number of users, possibly small, but not for
>> all users. The existence of the character would be there for those users for
>> whom the difference does matter.
>
> If users or processes who want to take advantage of this special character
> cannot depend on it being there in all texts, it may as well not be there at
> all, as they will have to fall back on the same heuristics that they are
> trying to avoid.
>
> In any case, I'd best get out of the business of telling users like QSJN UKR
> that such-and-so character would be a bad idea or that Unicode will not
> encode it, even if that is what I personally believe.
This is a chicken-and-egg problem. If you follow this line of
reasoning, we may as well stop discussing any further progress or
additions in Unicode. Had it been applied from the start, we would
without any doubt still be using ASCII for almost everything in Latin,
and all texts would have remained ambiguous.
There's a well-known problem, but no willingness to propose a solution
for it. Telling people not to use any case mapping in their encoded
texts is just a way of telling them not to use a standard Unicode
algorithm, which amounts to breaking the standard itself by making it
unusable for practical problems.
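To make the problem concrete, here is a minimal Python sketch. It assumes that the issue at stake is the loss of information when the default, locale-independent case mappings are applied to plain text that carries no explicit case semantics. The sample strings and the round_trip helper are illustrative choices of mine, not part of the proposal.

```python
# Illustrative sketch only: shows that the default Unicode case mappings,
# as exposed by Python's str.upper()/str.lower(), are not reversible.

def round_trip(text: str) -> str:
    """Uppercase, then lowercase, using the built-in full case mappings."""
    return text.upper().lower()

# Hypothetical sample words: German sharp s, Turkish dotless i, plain ASCII.
samples = ["straße", "ı", "abc"]

for original in samples:
    mapped = round_trip(original)
    status = "preserved" if mapped == original else "lost"
    print(f"{original!r} -> {original.upper()!r} -> {mapped!r} ({status})")
```

Only the plain ASCII sample survives the round trip. For the other two, nothing in the encoded text tells the algorithm what the author intended, which is the kind of gap an explicit mark would be meant to close.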
I don't follow you there. A new character offers a clean long-term
solution, even if texts encoded without it will remain in circulation
for a long time (they can be corrected at any time, wherever the
absence of the explicit combining character would cause problems).
Even though Unicode is now widely deployed, past texts using only
ASCII have not disappeared; in the same way, we still see texts using
a single dash for unrelated things, and the same ASCII double quote
for distinct quotation marks.
I'm not advocating the addition of new letters, when just a couple of
combining characters that explicitly mark the expected case semantics
can solve all this for all pluricameral scripts, in all their cased
letters.