Re: Character Sequences of Uncertain Rendering (was: Version linking?) from Richard Wordingham via Unicode on 2017-08-26 (Unicode Mail List Archive)

From: Richard Wordingham via Unicode <unicode_at_unicode.org>
Date: Sun, 27 Aug 2017 05:06:04 +0100

On Sat, 26 Aug 2017 21:52:19 +0200
Philippe Verdy via Unicode <unicode_at_unicode.org> wrote:

> 2017-08-26 21:28 GMT+02:00 Richard Wordingham via Unicode <
> unicode_at_unicode.org>:

> Of course SHY in this use is not suitable, but who knows if one will
> not need this to split in two parts what would be otherwise a single
> cluster (possibly reordered by canonical reordering if one needs to
> split between two Indic matras: this would suggest there's a need for
> a new "empty base consonant" for that Indic script, but SHY (U+00AD)
> should probably not have the correct effect if it also inserts an
> undesired line break opportunity, independently of which glyph
> would be rendered and of the position (first or second line) where
> it would be rendered if the line break is honored).

I am confused as to what conceivable case you have in mind. An example
would help. I wonder if I'm misunderstanding what you mean by
'canonical reordering'.  Do you mean the order of codepoints, or the
arrangement of glyphs?  CGJ is available to preserve a specific
ordering of codepoints, though it is completely redundant in most Indic
scripts.
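
A minimal sketch of the CGJ point, assuming Python's standard
unicodedata module and a Devanagari example that is not taken from the
thread. Nukta (U+093C, combining class 7) and virama (U+094D, combining
class 9) are among the few Indic marks with non-zero combining classes,
so normalization may swap them; CGJ (U+034F, combining class 0) placed
between them blocks the swap. The helper name `codepoints` is
illustrative only.

```python
import unicodedata

KA, NUKTA, VIRAMA, CGJ = "\u0915", "\u093C", "\u094D", "\u034F"

def codepoints(s):
    """Format a string as a space-separated list of U+XXXX values."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

# Virama (ccc 9) typed before nukta (ccc 7): not in canonical order.
plain = KA + VIRAMA + NUKTA
# Same sequence with CGJ (ccc 0) separating the two marks.
guarded = KA + VIRAMA + CGJ + NUKTA

# Canonical reordering sorts adjacent marks with non-zero classes.
reordered = unicodedata.normalize("NFD", plain)
# A class-0 character stops the reordering from crossing it.
preserved = unicodedata.normalize("NFD", guarded)

print(codepoints(reordered))
print(codepoints(preserved))
```

Most Indic vowel signs have combining class 0 and are never reordered
among themselves, which is consistent with CGJ being redundant in most
Indic scripts.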

It is a fact that aksharas do get split between lines in manuscripts,
undesirable though it may be. In a transcription intended to preserve
a division into lines, one would probably use NBSP at such a point,
and worry less about attempting to preserve the structure of the
line-broken akshara. It seems that Unicode only supports word
boundaries and their absence where they provide or prohibit line
breaks.
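
One possible reading of the NBSP suggestion, sketched in Python under
stated assumptions: when an akshara is split across manuscript lines
and the second line would otherwise begin with a combining mark, the
transcription places that mark on NBSP (U+00A0), which Unicode
recommends as a base for showing a combining mark in isolation. The
function name `transcribe_split` and the Devanagari example are
hypothetical and do not come from the thread.

```python
import unicodedata

NBSP = "\u00A0"
KA = "\u0915"             # DEVANAGARI LETTER KA
VOWEL_SIGN_AA = "\u093E"  # DEVANAGARI VOWEL SIGN AA (category Mc)

def transcribe_split(line_end, line_start):
    """Return the two transcribed lines of a split akshara.

    If the second line starts with a combining mark (general category
    Mn, Mc or Me), prefix NBSP so the mark has a visible base.
    """
    if line_start and unicodedata.category(line_start[0]).startswith("M"):
        line_start = NBSP + line_start
    return line_end, line_start

first, second = transcribe_split(KA, VOWEL_SIGN_AA)
```

This keeps the line division but, as noted above, gives up on encoding
the two halves as a single akshara.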

Richard.
Received on Sat Aug 26 2017 - 23:06:33 CDT
