From: Addison Phillips (addison@yahoo-inc.com)
Date: Tue Jan 23 2007 - 14:09:42 CST
Marion wrote:
> but the same kind of survey measurements taken in the earliest years of
> Web activity would probably yield closer to 100% ASCII, which would have
> been gravely wrong and very misleading in real terms (that is, in terms
> of real needs of real users), so it would be, IMHO, better to ignore
> such statistics and always return, as a rule of thumb, to user needs (as
> distinct to user practice, which, in my experience, can often be no more
> than a reflection of colonial imposition, as a culture strives to
> survive against all the odds).
If we were talking about the distribution of *language* text, I would
agree. But the measurement of markup in relation to actual text is
different. HTML tags and other markup are not language-bearing text and
they consistently form about half the overall "content" of the textual
part (as opposed to graphics or music files and such) of the Web. If we
omit the markup, the range and relative distribution of various scripts
has evolved over time, away from early domination by the Latin script
towards a more "natural" distribution.
So even if we all switched to cuneiform for writing our various
languages, the total volume of the Web that used supplementary characters
would only approach 50%, since half of the Web is angle brackets and
such :-).
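To make the arithmetic concrete, here is a rough sketch, not a real survey method. It assumes "volume" is counted in code points rather than bytes, and it treats anything between angle brackets as markup. The function name and the sample string are made up for illustration.

```python
def supplementary_share(document):
    """Return (markup_fraction, supplementary_fraction) by code point count.

    Assumption: everything from '<' through the matching '>' is markup.
    Supplementary characters are code points above U+FFFF.
    """
    total = markup = supplementary = 0
    in_tag = False
    for ch in document:
        total += 1
        if ch == "<":
            in_tag = True
        if in_tag:
            markup += 1
        if ch == ">":
            in_tag = False
        if ord(ch) > 0xFFFF:
            supplementary += 1
    if total == 0:
        return 0.0, 0.0
    return markup / total, supplementary / total


# Hypothetical sample: seven cuneiform signs (U+12000) inside a <p> element,
# so markup and text are each 7 code points.
sample = "<p>" + "\U00012000" * 7 + "</p>"
markup_fraction, supplementary_fraction = supplementary_share(sample)
print(markup_fraction, supplementary_fraction)
```

With markup fixed at half of the total, the supplementary share cannot exceed the other half, however the text itself is written. Counting UTF-8 bytes instead of code points would shift the ratio, since each supplementary character takes four bytes.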
Best Regards,
Addison
--
Addison Phillips
Globalization Architect -- Yahoo! Inc.

Internationalization is an architecture.
It is not a feature.