I've been transcribing some Pali text written on palm leaf in the Tai Tham script, and I'm looking for a way to reflect the manuscript's line boundaries in the transcription. The problem is that lines sometimes start or end with an isolated spacing mark. I want my text to be searchable, and therefore encoded in Unicode. (I appreciate that there is a trade-off between searchability and showing line boundaries. The unorthodox spelling is also a problem.)

How unreasonable is it for a font to render <NBSP, ZWJ, U+25CC DOTTED CIRCLE, spacing_mark> as just the spacing mark? Some rendering systems give the font no way of distinguishing dotted circles in the backing store from dotted circles added by the renderer, so this technique is not Unicode compliant.

An alternative solution is to have a parallel font (or, more neatly, a font feature) that renders some base character (or sequence) as a zero-width non-inking character. This, however, would violate that character's identity.

I suspect there is no Unicode-compliant solution.

Richard.
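P.S. To make the sequence concrete, here is a minimal Python sketch of the <NBSP, ZWJ, U+25CC, spacing_mark> idea. The helper names (isolate_mark, join_for_search) are hypothetical, as is the choice of U+1A20 and U+1A63 as sample characters. It only covers a mark at the start of a line, and it assumes that simply concatenating the lines and deleting the carrier gives an acceptable search form. It says nothing about how any font or renderer would display the result.

```python
import unicodedata

NBSP = "\u00A0"           # NO-BREAK SPACE
ZWJ = "\u200D"            # ZERO WIDTH JOINER
DOTTED_CIRCLE = "\u25CC"  # DOTTED CIRCLE
CARRIER = NBSP + ZWJ + DOTTED_CIRCLE  # the prefix proposed in the message


def isolate_mark(mark: str) -> str:
    """Prefix a line-initial spacing mark with the carrier sequence."""
    return CARRIER + mark


def join_for_search(lines):
    """Concatenate transcribed lines and delete every carrier sequence.

    Assumption: a line-initial mark belongs to the base character that
    ends the previous line, so removing the carrier reattaches it.
    """
    return "".join(lines).replace(CARRIER, "")


# Sample data (an assumption): a consonant ends one manuscript line and
# its spacing vowel sign starts the next.
HIGH_KA = "\u1A20"  # TAI THAM LETTER HIGH KA
SIGN_AA = "\u1A63"  # TAI THAM VOWEL SIGN AA
lines = [HIGH_KA, isolate_mark(SIGN_AA)]

# List the code points stored for the second line.
for ch in lines[1]:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")

searchable = join_for_search(lines)
```

The transcription keeps one string per manuscript line, while the search form is derived from it on demand, so the line boundaries are not lost. A mark isolated at the end of a line would need separate handling, since deleting the carrier there does not move the mark after its base.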