Request for Information

Doug Ewell doug at
Sat Jul 26 10:26:21 CDT 2014

fantasai <fantasai dot lists at inkedblade dot net> wrote:

> I think when you have no further context, it is better to have
> a guess informed by the character properties than one completely
> ignorant of them.

Some of the responses on this list already demonstrate a real risk in 
having Unicode add a property like this. When Unicode publishes this 
sort of data, even if it is meant to be informative, people tend to 
treat it as normative and rigid, and as applying to all imaginable 
scenarios.

So even for a script like Latin, where the customary method of 
justification is usually straightforward, you can have reasonable 
counterexamples like Fraktur as described by Asmus. And then someone 
might bring up a case where the rules might be different for different 
languages (Philippe sort of alluded to this with Arabic). And then there 
will be a historic example from the dawn of printing, and one from a 
highly styled advertising sign, and so forth, and it will be hard to 
tell when the "normal usage" line has been crossed. If necessary, 
someone will trot out Latin letters on a neon sign, oriented normally 
but written vertically down the sign. Meanwhile Unicode will be 
criticized for not taking all the special cases into account.

It's a bit like the locale collections (CLDR is not alone here) that 
specify a single date format for an entire country, as if all Americans 
only ever write a short date as "m/dd/yy" and anyone who uses a 
different format is employing some sort of weird hybrid system. The 
presence of "m/dd/yy" in the locale collection appears normative and 
rigid, and is often implemented in software as though that were the 
intent, even if the data is meant to be descriptive and a first 

Doug Ewell | Thornton, CO, USA | @DougEwell
