Corrigendum #9

"Martin J. Dürst" duerst at
Tue Jun 3 02:09:27 CDT 2014

On 2014/06/03 07:08, Asmus Freytag wrote:
> On 6/2/2014 2:53 PM, Markus Scherer wrote:
>> On Mon, Jun 2, 2014 at 1:32 PM, David Starner <prosfilaes at> wrote:
>>     I would especially discourage any web browser from handling
>>     these; they're noncharacters used for unknown purposes that are
>>     undisplayable and if used carelessly for their stated purpose, can
>>     probably trigger serious bugs in some lamebrained utility.
>> I don't expect "handling these" in web browsers and lamebrained
>> utilities. I expect "treat like unassigned code points".

Expecting them to be treated like unassigned code points shows that 
their use is a bad idea: Since when does the Unicode Consortium use 
unassigned code points (and the like) in plain sight?

> I can't shake the suspicion that Corrigendum #9 is not actually solving
> a general problem, but is a special favor to CLDR as being run by
> insiders, and in the process muddying the waters for everyone else.

I have to fully agree with Asmus, Richard, Shawn and others that the use
of noncharacters in CLDR is a very bad and dangerous example.

However convenient the misuse of some of these codepoints in CLDR may 
be, it sets a very bad example for everybody else. Unicode itself should 
not just be twice as careful with the use of its own codepoints, but 10 
times as careful.

I'd strongly suggest that completely independent of when and how 
Corrigendum #9 gets tweaked or fixed, a quick and firm plan gets worked 
out for how to get rid of these codepoints in CLDR data. The sooner, the
better.

Regards,   Martin.
