From: Kent Karlsson (kent.karlsson14@telia.com)
Date: Fri Aug 13 2010 - 04:28:51 CDT
On 2010-08-13 02:28, "Pravin Satpute" <psatpute@redhat.com> wrote:
> Yes, problem is happening only when these characters come at initial
> position.
> i.e., U+0951 and U+0952 in isolation should render with U+25CC
U+25CC should never be inserted automatically. That some systems do so is a
bug (no matter how deliberately it was introduced). (I know, there are some
Indic script characters that should have had a canonical decomposition but
don't have one; using what should have been the canonical decomposition
should then be marked somehow in rendering, but I don't think a dotted circle
is the way to do that.)
>> "Inherited" means that the character inherits its Script property from
>> the preceding character(s), so if either of the stress signs is preceded
>> by a Devanagari character, it should make no difference whether the
>> stress sign itself is categorized as Devanagari or Inherited.
>
> Looks good, but it's really hard to guess a character's script when it
> stands alone.
> I think one needs to add an extra check for when a character with the
> Inherited property is at the initial position.
When a combining character sequence is defective (the mark is "at the
initial position", with no base character before it), it should be rendered
*as if* applied to an NBSP (U+00A0), regardless of script.
http://www.unicode.org/versions/Unicode5.2.0/ch05.pdf, section 5.13:
"Defective combining character sequences should be rendered as if they had
a no-break space as a base character. (See Section 7.9, Combining Marks.)"
http://www.unicode.org/versions/Unicode5.2.0/ch07.pdf, section 7.9:
"Marks as Spacing Characters. By convention, combining marks may be
exhibited in (apparent) isolation by applying them to U+00A0 no-break
space."
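The rule quoted above can be sketched roughly as follows. This is only an
illustration in Python, not how any particular rendering engine does it: the
function names are made up for this example, it only looks at the start of
the text (not at marks following control characters or line breaks), and it
actually prepends U+00A0 to a display string, whereas a real renderer would
only lay the mark out *as if* the NBSP were there.

```python
import unicodedata

NBSP = "\u00A0"  # U+00A0 NO-BREAK SPACE


def is_combining_mark(ch: str) -> bool:
    """True if ch has General_Category Mn, Mc or Me."""
    return unicodedata.category(ch).startswith("M")


def display_form(text: str) -> str:
    """Hypothetical helper: give a text-initial combining mark an NBSP base.

    Only the "initial position" case from this thread is handled. Nothing is
    done per script, and no dotted circle (U+25CC) is ever inserted.
    """
    if text and is_combining_mark(text[0]):
        return NBSP + text
    return text


# U+0951 alone is a defective sequence, so it gets an NBSP base.
isolated = display_form("\u0951")
# U+0915 (DEVANAGARI LETTER KA) followed by U+0951 is left unchanged.
with_base = display_form("\u0915\u0951")
```

Note that the check uses only the General_Category of the first character,
so the Script property (Devanagari or Inherited) never enters into it.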
/Kent K