From: Mark Davis ☕ (mark@macchiato.com)
Date: Mon Feb 14 2011 - 14:59:43 CST
What I have to hand is both (a) element content only, and (b) the raw data,
markup included (converted to Unicode, of course).
So the following counts as "的"=1 under (a), but under (b) it would yield
"p"=2, "<"=2, ...
<p class="foo">的</p>
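To make the difference between (a) and (b) concrete, here is a minimal sketch in Python. It is only an illustration, not the tooling actually used to gather the counts: the names TextCollector, count_element_content and count_raw are hypothetical, and it assumes the standard library html.parser is an acceptable stand-in for whatever extracts the element content.

```python
from collections import Counter
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects only the character data found between tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags; tag names and attributes never arrive here.
        self.chunks.append(data)


def count_element_content(markup):
    """(a) Count characters in element content only."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return Counter("".join(collector.chunks))


def count_raw(markup):
    """(b) Count every character of the raw data, markup included."""
    return Counter(markup)


sample = '<p class="foo">的</p>'
content_counts = count_element_content(sample)
raw_counts = count_raw(sample)

print(content_counts)
print(raw_counts["p"], raw_counts["<"], raw_counts[" "])
```

Under (b) the space inside the start tag is counted as well, which is the case Eric asks about below.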
Mark
On Mon, Feb 14, 2011 at 12:17, Eric Muller <emuller@adobe.com> wrote:
> Are you looking at the text nodes of the HTML (after space normalization)
> or at the HTML serialization? E.g., do you count the space in "<p
> class="foo">"?
>