"Martin J. Dürst"
duerst at it.aoyama.ac.jp
Tue Jun 3 02:09:27 CDT 2014
On 2014/06/03 07:08, Asmus Freytag wrote:
> On 6/2/2014 2:53 PM, Markus Scherer wrote:
>> On Mon, Jun 2, 2014 at 1:32 PM, David Starner <prosfilaes at gmail.com
>> <mailto:prosfilaes at gmail.com>> wrote:
>>> I would especially discourage any web browser from handling
>>> these; they're noncharacters used for unknown purposes that are
>>> undisplayable and if used carelessly for their stated purpose, can
>>> probably trigger serious bugs in some lamebrained utility.
>>
>> I don't expect "handling these" in web browsers and lamebrained
>> utilities. I expect "treat like unassigned code points".
Expecting them to be treated like unassigned code points shows that
their use is a bad idea: Since when does the Unicode Consortium use
unassigned code points (and the like) in plain sight?
> I can't shake the suspicion that Corrigendum #9 is not actually solving
> a general problem, but is a special favor to CLDR as being run by
> insiders, and in the process muddying the waters for everyone else.
I have to fully agree with Asmus, Richard, Shawn and others that the use
of noncharacters in CLDR is a very bad and dangerous example.
However convenient the misuse of some of these code points in CLDR may
be, it sets a very bad example for everybody else. Unicode itself should
be not just twice as careful with the use of its own code points, but ten
times as careful.
I'd strongly suggest that, completely independently of when and how
Corrigendum #9 gets tweaked or fixed, a quick and firm plan be worked
out for how to get rid of these code points in CLDR data. The sooner, the
better.