From: Peter Constable (petercon@microsoft.com)
Date: Thu Dec 06 2007 - 10:51:09 CST
> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
> Behalf Of Karl Pentzlin
Reply in opposite order:
> b.) Why U+FD3E and U+FD3F have the Bidi_mirroring property not set?
IIRC, this is by design for back-compat reasons. I believe it has been discussed on this list before.
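For anyone who wants to check the property locally, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `bidi_mirrored` is made up for illustration, and the values reported come from whatever Unicode version the interpreter bundles (see `unicodedata.unidata_version`), so they may not match the data under discussion here.

```python
import unicodedata

# Code points discussed in this thread, plus ASCII parentheses for comparison.
CHARS = ["\uFD3E", "\uFD3F", "(", ")"]

def bidi_mirrored(ch: str) -> bool:
    """Hypothetical helper: True if the bundled data marks ch as Bidi_Mirrored."""
    return unicodedata.mirrored(ch) == 1

print("Unicode data version:", unicodedata.unidata_version)
for ch in CHARS:
    name = unicodedata.name(ch)
    print(f"U+{ord(ch):04X} {name}: Bidi_Mirrored={bidi_mirrored(ch)}")
```

The ASCII parentheses are included only as a contrast case, since they do have the property set.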
> This leads to my questions:
> a.) Why U+FD3E has GC property Ps and U+FD3F has Pe, and not vice
> versa?
Good question. Primary usage with Arabic seems to suggest the reverse. Mind you, since in principle they can be used in either direction, something neutral such as Po might make sense. A key question to consider is what derived properties and algorithms would be affected by a change. For instance, switching Ps/Pe values for these characters would have a follow-on effect for line breaking:
Current:
  FD3E  gc=Ps, lb=OP
  FD3F  gc=Pe, lb=CL
If changed:
  FD3E  gc=Pe, lb=CL
  FD3F  gc=Ps, lb=OP
That would result in a significant change in line-breaking behaviour, though it would probably be an improvement for use in Arabic text (and detrimental for use in LTR text). But changing to a neutral category such as Po would have a far more substantial impact on line breaking, since both would have lb=AL; in particular, neither would behave like closing punctuation.
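The three scenarios can be laid side by side with a small sketch. The gc-to-lb table below contains only the pairings mentioned in this message (Ps/OP, Pe/CL, Po/AL) and is an assumption for illustration: real line-break classes are assigned per character in the Unicode data files, not derived mechanically from the general category. The names `ASSUMED_LB_FOR_GC`, `SCENARIOS` and `line_break_classes` are hypothetical.

```python
from typing import Dict

# Illustrative gc -> lb pairings taken from the discussion above.
# ASSUMPTION: this is not how lb is really assigned; it is a toy lookup.
ASSUMED_LB_FOR_GC: Dict[str, str] = {"Ps": "OP", "Pe": "CL", "Po": "AL"}

# General_Category assignments for each scenario being compared.
SCENARIOS: Dict[str, Dict[int, str]] = {
    "current": {0xFD3E: "Ps", 0xFD3F: "Pe"},
    "swapped": {0xFD3E: "Pe", 0xFD3F: "Ps"},
    "neutral": {0xFD3E: "Po", 0xFD3F: "Po"},
}

def line_break_classes(scenario: str) -> Dict[int, str]:
    """Map each code point to the lb class implied by the toy table."""
    return {cp: ASSUMED_LB_FOR_GC[gc] for cp, gc in SCENARIOS[scenario].items()}

for name in SCENARIOS:
    classes = line_break_classes(name)
    print(name, {f"U+{cp:04X}": lb for cp, lb in classes.items()})
```

In the "neutral" scenario both characters land in the same class, which is the point made above: neither would then be treated as opening or closing punctuation by the line breaker.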
There are no contingent line-breaking properties, i.e. nothing that says "break this way for RTL but that way for LTR". So there's no way to assign properties to these characters that provides the desired behaviour in all scenarios. Since -- at least for line breaking -- a tailoring is needed to do the right thing in all cases, perhaps there's not a lot of value in changing the properties.
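As a rough idea of what such a tailoring might look like, the sketch below picks the line-break class for the two ornate parentheses from the paragraph direction. It is a hypothetical illustration, not part of any standard algorithm or library: the function name, the `rtl` flag and the `default_lb` parameter are all invented here, and the RTL assignments follow the suggestion above that Arabic usage is the reverse of the LTR one.

```python
ORNATE_LEFT = "\uFD3E"   # ORNATE LEFT PARENTHESIS
ORNATE_RIGHT = "\uFD3F"  # ORNATE RIGHT PARENTHESIS

def tailored_line_break_class(ch: str, rtl: bool, default_lb: str) -> str:
    """Hypothetical tailoring: choose OP/CL for the ornate parentheses
    based on paragraph direction; pass every other character through."""
    if ch == ORNATE_LEFT:
        # ASSUMPTION: closes in RTL (Arabic) text, opens in LTR text.
        return "CL" if rtl else "OP"
    if ch == ORNATE_RIGHT:
        # ASSUMPTION: opens in RTL (Arabic) text, closes in LTR text.
        return "OP" if rtl else "CL"
    return default_lb

# The same character gets a different class depending on direction.
print(tailored_line_break_class(ORNATE_LEFT, rtl=False, default_lb="AL"))
print(tailored_line_break_class(ORNATE_LEFT, rtl=True, default_lb="AL"))
```

A real implementation would hook something like this into the step where a line breaker resolves character classes, before the pair rules are applied.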
Peter