Hello Richard,
On 2015/07/15 16:49, Richard Wordingham wrote:
> What mark-up schemes exist to show that a sequence of letters and
> combining marks constitutes a single word?
>
> Such mark-up would be useful when using spell checkers. At present, I
> use U+2060 WORD JOINER (WJ) to indicate the absence of a word boundary.
> (Systematic marking of boundaries using ZWSP is not popular with
> users, and is normally not used in Thai - it's not supported in
> their national or Windows 8-bit encodings.) However, it seems likely
> that when Unicode 8.00 is defined in August, WJ will suppress line
> breaks but not word breaks. There would still be the limitation that
> mark-up is not available in plain text.
>
> It appears that, for example, Open Document Format has no mark-up to
> indicate word boundaries, relying instead on the overrides of
> the word boundary detection algorithms being stored at character level.
I'd suggest looking at higher-end formats such as DITA or TEI (Text
Encoding Initiative).
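As a rough illustration only: TEI provides a <w> element for marking a word, so the boundaries can be carried in the mark-up rather than in the text itself. The sketch below assumes the input marks word boundaries with ZWSP and suppresses them with WJ, as you describe. The function name to_tei_words is hypothetical, and a real TEI document would need its usual surrounding structure.

```python
import re
from xml.sax.saxutils import escape

ZWSP = "\u200b"  # ZERO WIDTH SPACE: explicit word boundary
WJ = "\u2060"    # WORD JOINER: explicit absence of a boundary


def to_tei_words(text):
    """Wrap each word in a TEI <w> element (hypothetical helper).

    Words are delimited by ZWSP or by ordinary whitespace. Whitespace
    is kept between elements; ZWSP and WJ are dropped because the <w>
    elements now carry the boundary information.
    """
    out = []
    # The capturing group keeps whitespace runs in the result;
    # a split on ZWSP yields None for the group instead.
    for token in re.split(r"(\s+)|" + ZWSP, text):
        if not token:
            continue  # None (ZWSP split) or empty string
        if token.isspace():
            out.append(token)
            continue
        word = token.replace(WJ, "")
        if word:
            out.append("<w>" + escape(word) + "</w>")
    return "".join(out)
```

With such mark-up, a spell checker could treat the content of each <w> as one word without relying on its own boundary detection.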
Regards, Martin.
> Richard.
Received on Wed Jul 15 2015 - 06:19:19 CDT