On Mon, 21 Jan 2019 00:29:42 -0800
David Starner via Unicode <unicode_at_unicode.org> wrote:
> The superscripts show a problem with multiple encoding; even if you
> think they should be Unicode superscripts, and they look like Unicode
> superscripts, they might be HTML superscripts. Same thing would happen
> with italics if they were encoded in Unicode.
But if one strips the mark-up out, and searching is then based on
the collation elements of the text, then this is not a problem.
Mathematical and ASCII capitals differ only at the identity level.
Searching on the basis of codepoint sequences would come unstuck with
scriptio continua scripts - WJ (U+2060) and ZWSP (U+200B) can be
optionally inserted to improve line-breaking, and even to overcome
spell-checkers.
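
A rough illustration of the difference, as a minimal Python sketch. It is
not a real collation-element search: the standard library has no UCA
implementation, so NFKC folding plus dropping an ignorable set stands in
for it here. The names IGNORABLE, search_key and contains are
hypothetical, and the ignorable set is deliberately limited to the two
characters mentioned above.

```python
import unicodedata

# Assumption: only the two characters discussed are treated as ignorable.
# A real collation-based search would ignore many more.
IGNORABLE = {"\u200b", "\u2060"}  # ZWSP, WJ


def search_key(text: str) -> str:
    """Approximate a search key: fold compatibility variants (such as
    mathematical capitals) with NFKC, then drop the ignorable characters."""
    folded = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in folded if ch not in IGNORABLE)


def contains(haystack: str, needle: str) -> bool:
    """Substring test on the approximate keys rather than raw codepoints."""
    return search_key(needle) in search_key(haystack)


# Thai text with a ZWSP inserted at a word boundary to help line-breaking.
thai = "ภาษา\u200bไทย"
print("ภาษาไทย" in thai)          # raw codepoint search
print(contains(thai, "ภาษาไทย"))  # search on the approximate keys

# MATHEMATICAL BOLD CAPITAL A followed by ASCII "BC".
print(contains("\U0001D400BC", "ABC"))
```

The raw codepoint search misses the Thai phrase because of the inserted
ZWSP, while the key-based search finds it; the mathematical capital is
likewise matched by its ASCII counterpart.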
Richard.
Received on Tue Jan 22 2019 - 18:17:05 CST