Tag characters and in-line graphics (from Tag characters)
prosfilaes at gmail.com
Wed Jun 3 08:24:04 CDT 2015
> There is no way to compare 2 HTML elements and know they are talking
> about the same character
That's because character identity is a hard problem. Is the emoji TIGER the
same as TONY THE TIGER or as TONY THE TIGER GIVING THE VICTORY SIGN?
Note that even in Unicode, the set ẛ ᷥ ſ ṡ s S Ŝ may be considered the
same character or up to seven different characters, depending on
case-folding, normalization and accent dropping.
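
To make that concrete, here is a rough sketch in Python using only the standard
unicodedata module. The helper name fold() and its keyword flags are made up
for this illustration; they are not part of any standard API. It reduces each
character to a comparison key under different combinations of case folding,
compatibility normalization and mark stripping, so you can watch the number of
"different characters" change.

```python
import unicodedata

# The seven characters from the paragraph above, written as escapes so the
# combining one (U+1DE5) cannot get lost in transit.
CHARS = ["\u1e9b", "\u1de5", "\u017f", "\u1e61", "s", "S", "\u015c"]


def fold(text, *, casefold=False, compat=False, strip_marks=False):
    """Return a comparison key for text (hypothetical helper, not a standard API)."""
    form = "NFKD" if compat else "NFD"
    text = unicodedata.normalize(form, text)
    if casefold:
        # Case folding can produce composed characters, so decompose again.
        text = unicodedata.normalize(form, text.casefold())
    if strip_marks:
        # Drop every code point with a non-zero canonical combining class.
        text = "".join(c for c in text if not unicodedata.combining(c))
    return unicodedata.normalize("NFC", text)


def distinct_keys(**options):
    """Collect the set of distinct keys for CHARS under the given options."""
    return {fold(c, **options) for c in CHARS}


if __name__ == "__main__":
    for options in (
        {},
        {"casefold": True},
        {"casefold": True, "compat": True},
        {"casefold": True, "compat": True, "strip_marks": True},
    ):
        print(options, len(distinct_keys(**options)))
```

With no folding at all, every character is its own key. With everything turned
on, the six letters all collapse to a plain "s". Which of those answers is the
"right" one depends entirely on what you are comparing for.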
> Similarly, there is no way to search or index html elements. If a HTML
> document contained an image of a particular custom character, there would
> be no way to ask google or whatever to find all the documents with that
> character. Different documents would represent it differently.
You can index links to images. If two documents represent it differently,
then we are back to the point above: we can't know that they're the same thing.
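
As a minimal sketch of what indexing image links could look like (the document
names and example.org / example.net URLs below are invented for illustration,
and a real search engine certainly does far more than this), one can map each
img src to the set of documents that use it:

```python
from collections import defaultdict
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def build_image_index(documents):
    """Map image URL -> set of document names that reference it."""
    index = defaultdict(set)
    for name, html in documents.items():
        collector = ImgSrcCollector()
        collector.feed(html)
        collector.close()
        for src in collector.sources:
            index[src].add(name)
    return index


# Hypothetical documents: two share an image URL, one uses its own image.
DOCS = {
    "a.html": '<p>See <img src="https://example.org/glyphs/tiger.png" alt="tiger"></p>',
    "b.html": '<p><img src="https://example.org/glyphs/tiger.png"> again</p>',
    "c.html": '<p><img src="https://example.net/my-tiger.svg"></p>',
}
INDEX = build_image_index(DOCS)

if __name__ == "__main__":
    for url, names in sorted(INDEX.items()):
        print(url, sorted(names))
```

Documents that link to the same image end up under the same key. The third
document, which uses its own file for what may or may not be the same tiger,
does not, and nothing in the index can tell you whether it should.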
On Tue, Jun 2, 2015 at 7:11 PM Chris <idou747 at gmail.com> wrote:
> You can’t ask the entire computing universe to compress everything all the
Anytime we care about how much space text takes up, it should be
compressed. It compresses very well. On the other hand, it's rare that
anyone cares anymore; what's a few hundred kilobytes between friends?
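
For a quick sanity check of the compression claim, here is a small sketch using
Python's standard zlib module. The sample text is just this paragraph of the
thread; actual ratios depend on the text, and longer inputs typically do better
than a snippet this short.

```python
import zlib

# Sample text taken from this thread; any ordinary prose would do.
SAMPLE = (
    "Anytime we care about how much space text takes up, it should be "
    "compressed. It compresses very well. On the other hand, it's rare that "
    "anyone cares anymore; what's a few hundred kilobytes between friends?"
)

RAW = SAMPLE.encode("utf-8")
PACKED = zlib.compress(RAW, 9)  # 9 = highest compression level

if __name__ == "__main__":
    print(len(RAW), "bytes raw,", len(PACKED), "bytes compressed")
```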