Re: Dotted Circle plus Combining Mark as Text

From: Philippe Verdy <verdy_p_at_wanadoo.fr>
Date: Sun, 20 Oct 2013 14:04:05 +0200

2013/10/20 Jukka K. Korpela <jkorpela_at_cs.tut.fi>

> 2013-10-20 2:38, Richard Wordingham wrote:
>
> Is a sequence of a U+25CC DOTTED CIRCLE plus a combining mark plain
>> text?
>>
>
> Well, is <h1>hello<h1> plain text? The answer is that any string of
> characters may be considered as plain text and any string of characters may
> be treated as rich text according to some conventions.
>
>
> If so, how many dotted circles should appear?
>>
>
> Possibly none. An implementation need not support any particular
> collection of characters. But an implementation that supports U+25CC must
> treat it as a spacing character, and an implementation that supports e.g.
> U+0300 must treat it as a combining mark. So if the implementation is capable
> of visually rendering them, it shall render U+25CC U+0300 as a dotted
> circle with a grave accent above it. In this case, exactly one dotted
> circle should appear, then.
>

I also agree that there can be only zero or one dotted circle.
You'll get zero if neither a font nor the renderer itself supports a mapping
for U+25CC.
But if the font or renderer can map U+25CC to a glyph, it should display it
only once (with or without the combining mark above it).

You only get two dotted circles because the renderer treats U+25CC
separately to render it as a symbol, then it finds the combining mark and
the renderer decides (incorrectly) that it is defective after this symbol,
and so it decides to render the combining mark with an additional glyph
(taken from any font that has a mapping for U+25CC or from an internal
glyph built by the renderer).

The sequence U+25CC + combining mark should never be considered defective
in any script, i.e. the renderer should never insert a dotted circle glyph
before a combining mark in this context. It is true that U+25CC is foreign to
the Indic scripts, where the renderer would otherwise insert it before an
actual combining mark of these scripts that lacks a valid base.
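
The rule above, and the bug that produces two circles, can be sketched in a
few lines of Python. This is only an illustration, not how any actual
OpenType shaper is written: the function names (add_placeholders, is_base,
is_base_too_strict) are hypothetical, and the "letters only" predicate is an
assumed stand-in for a script implementation whose rules are too restrictive.

```python
import unicodedata

DOTTED_CIRCLE = "\u25CC"

def is_mark(ch):
    # General categories Mn, Mc and Me are the combining marks.
    return unicodedata.category(ch).startswith("M")

def is_base(ch):
    # Any graphic character that is not a mark can serve as a base,
    # which includes symbols such as U+25CC (category So).
    cat = unicodedata.category(ch)
    return cat[0] in "LNPS" or cat == "Zs"

def is_base_too_strict(ch):
    # Hypothetical over-restrictive rule: only letters are accepted as bases.
    return unicodedata.category(ch).startswith("L")

def add_placeholders(text, base_test=is_base):
    """Insert U+25CC before each combining mark that has no base."""
    out = []
    has_base = False
    for ch in text:
        if is_mark(ch):
            if not has_base:
                out.append(DOTTED_CIRCLE)  # placeholder for a defective sequence
                has_base = True            # following marks stack on the placeholder
        else:
            has_base = base_test(ch)
        out.append(ch)
    return "".join(out)

sample = DOTTED_CIRCLE + "\u0300"
print(add_placeholders(sample).count(DOTTED_CIRCLE))
print(add_placeholders(sample, is_base_too_strict).count(DOTTED_CIRCLE))
```

With the default predicate the encoded U+25CC is accepted as the base and no
extra glyph is added; with the restrictive predicate the encoded circle is
rejected as a base and a second circle is inserted in front of the mark.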

Ideally all fonts that contain glyphs for combining marks should also have a
mapping for U+25CC, with a glyph whose metrics suit those of their combining
marks. But then the (OpenType) renderer should not exclude U+25CC from the
script and should treat it as if it were a valid base letter for all combining
marks. U+25CC is not specific to a script: it is part of the "Common" script
(not just "Inherited").

So I think that if you see two dotted circles, the (OpenType) renderer has
a bug in its implementation of the script associated with the combining
mark: its internal rules for that script are too restrictive.

You can however see two dotted circles ONLY in the case of the **special**
rendering with a "visible controls" edit mode (where you would expect to see
exactly one spacing glyph for each encoded character, and possibly other
tricks like disabling the Bidi reordering or disabling the mandatory
ligatures). But such a mode is not the normal rendering for presenting text.
Received on Sun Oct 20 2013 - 07:06:15 CDT
