2012/6/19 Naena Guru <naenaguru_at_gmail.com>:
> Unicode Sinhala:
> http://ahangama.com/sing/DBS.htm (4 kB)
> Romanized Singhala:
> http://ahangama.com/sing/DSS.htm (1 kB)
>
> Compare the shape formation and the sizes of the files. How much bandwidth
> is taken for the Unicode Sinhala file to go as UFT-8? 6kB!
Your stats are grossly skewed. You don't even use UTF-8 to represent
Sinhalese letters, but decimal NCRs like &#3520; (for ව) in your demo page!
That is 7 bytes per character, where UTF-8 itself would need only 3.
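
The byte counts are easy to check. The following is a minimal Python sketch, not something taken from either demo page; the variable names are mine and the letter is just the one cited above.

```python
# Compare the size of one Sinhala letter written as a decimal NCR
# with the same letter encoded directly in UTF-8.
letter = "\u0dc0"  # SINHALA LETTER VAYANNA, code point 3520 in decimal

ncr = "&#{};".format(ord(letter))         # builds the decimal NCR string
ncr_bytes = len(ncr.encode("ascii"))      # an NCR is pure ASCII: 1 byte per character
utf8_bytes = len(letter.encode("utf-8"))  # code points U+0800..U+FFFF take 3 bytes

print(ncr, ncr_bytes, utf8_bytes)
```

Every letter in the Sinhala block (U+0D80..U+0DFF) falls in the 3-byte UTF-8 range, so the same ratio applies to the whole text of the page, markup aside.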
Plus you have added a lot of extra indentation spaces in the DBS.htm
version (using decimal NCRs) that are not in your hacked DSS.htm page
(which also uses a WOFF font via a CSS style, but even your server does
not conform to the web standards for delivering this WOFF font: it sends
incorrect MIME types).
You then claim that some browsers do things well and some others do
not. But the fault is yours: you don't follow the standards, and
browsers have different, non-interoperable ways of handling these
non-standard inconsistencies (they are not wrong if they don't
render your WOFF webfonts, notably if these are not correctly
identified in HTTP with the correct MIME types).
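
As a rough illustration of what "correctly identified" means here, this hedged Python sketch checks a Content-Type header value against the media types used for WOFF: application/font-woff (the type named in the W3C WOFF 1.0 specification) and font/woff (registered later by RFC 8081). The function and constant names are hypothetical, and the sketch says nothing about what the server in question actually sends.

```python
# Media types under which a WOFF 1.0 font may reasonably be served.
ACCEPTED_WOFF_TYPES = {"font/woff", "application/font-woff"}

def is_woff_content_type(header_value: str) -> bool:
    """Return True if a Content-Type header value names a WOFF media type."""
    # Drop any parameters after ';' and compare case-insensitively.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in ACCEPTED_WOFF_TYPES

print(is_woff_content_type("application/font-woff"))
print(is_woff_content_type("text/plain"))
```

A generic type such as text/plain or application/octet-stream fails this check, and that is the kind of response a browser is entitled to reject for a webfont.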
Start by auditing your demo pages and resolving all warnings
reported by browsers (including those that prevent further
optimizations). Your test pages are simple enough that they should be
easy to correct manually. You'll see that the conforming Unicode
version (DBS.htm) can be improved a lot.
In any case, my browser does NOT render any Sinhalese letter with your
hacked DSS.htm page. But it DOES render the Unicode version
(DBS.htm) correctly, even if that version can be improved: remove the
extra spaces and newlines like you did in DSS.htm, and REALLY encode it
using UTF-8 instead of NCRs; plus make sure that CSS stylesheets get
loaded before any JavaScript (in both versions).
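
Converting the NCRs to real UTF-8 can be done mechanically. Here is a minimal Python sketch, assuming the page only uses decimal NCRs; the function name and the sample markup are made up for illustration and are not the content of DBS.htm.

```python
import re

# Matches decimal NCRs only; hexadecimal references are left untouched.
DECIMAL_NCR = re.compile(r"&#([0-9]+);")

def ncrs_to_text(html_source: str) -> str:
    """Replace decimal NCRs for non-ASCII characters with the characters themselves."""
    def replace(match):
        code_point = int(match.group(1))
        # Keep ASCII references (such as the one for "<") escaped so the markup
        # stays valid, and skip values that are not encodable code points.
        if code_point < 128 or code_point > 0x10FFFF:
            return match.group(0)
        if 0xD800 <= code_point <= 0xDFFF:
            return match.group(0)
        return chr(code_point)
    return DECIMAL_NCR.sub(replace, html_source)

# Hypothetical sample: five Sinhala code points plus two escaped ASCII characters.
sample = "<p>&#3523;&#3538;&#3458;&#3524;&#3517; &#60;ok&#62;</p>"
converted = ncrs_to_text(sample)
utf8 = converted.encode("utf-8")

print(len(sample.encode("ascii")), len(utf8))
```

The converted file must then be saved as UTF-8 and declared as such (in the HTTP Content-Type header or a meta charset declaration), otherwise browsers may decode the bytes with a different encoding.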
Fix the MIME types on your server, and finally fix your webserver so
that it obeys HTTP/1.1 session management (so that proxies used by
mobile networks can correctly apply transparent data compression to
both versions): UTF-8 will then no longer even be a problem, as generic
compressors will use less than one byte per character on typical
Sinhalese texts that are consistently encoded in the source.
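
The effect of a generic compressor can be measured with a few lines of Python. This is only a sketch of how to take the measurement: zlib produces the DEFLATE stream that HTTP gzip/deflate compression is built on, the function name is mine, and the sample is an artificial repeated word, so it compresses far better than real prose would. Run it on the actual page text to get a meaningful figure.

```python
import zlib

def compressed_bytes_per_char(text: str) -> float:
    """Compressed size in bytes divided by the number of characters in text."""
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, 9)  # level 9 = best compression
    return len(packed) / len(text)

# Artificial sample: one Sinhala word and a space, repeated many times.
sample = "\u0dc3\u0dd2\u0d82\u0dc4\u0dbd " * 200

raw_ratio = len(sample.encode("utf-8")) / len(sample)
print(raw_ratio, compressed_bytes_per_char(sample))
```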
Received on Tue Jun 19 2012 - 08:25:28 CDT