Re: Devanagari

From: Geoffrey Waigh (gpw@uniserve.com)
Date: Mon Jan 21 2002 - 00:37:23 EST


On Sun, 20 Jan 2002, Aman Chawla wrote:

> Taking the extra links into account the sizes are:
> English: 10.4 Kb
> Devanagari: 15.0 Kb
> Thus the Dev. page is 1.44 times the Eng. page. For sites providing archives
> of documents/manuscripts (in plain text) in Devanagari, this factor could be
> as high as approx. 3 using UTF-8 and around 1 using ISCII.

Well, a trivial adjustment is to use UTF-16 to store your documents if you
know they are going to be predominantly Devanagari. Or if you have so much
text that the number of extra disks is going to be painful, use SCSU to
bring it very close to the ISCII ratio. Of course, I would note that you
can store millions of pages of plain text on a single hard disk these
days. If you are going to be storing so many hundreds of millions of pages
of plain text that the number of extra disks is a bother, I am amazed that
none of it might be outside the ISCII repertoire. And this huge document
archive has no graphics component to go with it...
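
To put rough numbers on the size comparison, here is a minimal Python
sketch that counts bytes for the same Devanagari text in UTF-8 and UTF-16.
The helper name encoded_sizes and the sample word are illustrative only.
The standard library has no ISCII or SCSU codec as far as I know, so the
code point count is used as a stand-in for the ISCII size, on the
assumption that ISCII spends about one byte per character.

```python
def encoded_sizes(text):
    """Return byte counts for text under a few encodings.

    'approx_iscii' is an assumption: ISCII is an 8-bit encoding, so the
    code point count is used as a rough proxy, not a real ISCII encode.
    """
    return {
        "approx_iscii": len(text),
        "utf8": len(text.encode("utf-8")),
        # utf-16-le avoids counting the 2-byte byte order mark
        "utf16": len(text.encode("utf-16-le")),
    }


# The word "Devanagari" written in Devanagari, spelled with escapes so the
# exact code points are unambiguous (8 code points, all in U+0900..U+097F).
word = "\u0926\u0947\u0935\u0928\u093e\u0917\u0930\u0940"

sizes = encoded_sizes(word)
print(sizes)
print("UTF-8 / approx ISCII:", sizes["utf8"] / sizes["approx_iscii"])
print("UTF-16 / approx ISCII:", sizes["utf16"] / sizes["approx_iscii"])
```

Pure Devanagari text takes 3 bytes per character in UTF-8 and 2 in UTF-16,
which is where the factors of about 3 and 2 come from. Real pages mix in
ASCII spaces, digits and markup at 1 byte each in UTF-8, so the measured
ratio for a whole page is lower, as the 1.44 figure above shows.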

But the real reason for publishing the data in Unicode on the web is so
people not using a machine specially configured for ISCII will still be
able to read and process the data.

[then later wrote:]

> With regards to South Asia, where the most widely used modems are
> approx. 14 kbps, maybe some 36 kbps and rarely 56 kbps, where
> broadband/DSL is mostly unheard of, efficiency in data transmission is
> of paramount importance... how can we convince the south asian user to
> create websites in an encoding that would make his client's 14 kbps
> modem as effective (rather, ineffective) as a 4.6 kbps modem?

Can you read 500 characters per second? So long as they are receiving
only plain text, even this dawdling speed is not going to impact them.
People wanting to transfer data efficiently will use a compression
program.
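
As a back-of-the-envelope check on those figures, the Python sketch below
estimates plain-text throughput over a modem and shows what a general
purpose compressor does to UTF-8 Devanagari. The function names, the
14400 bps line rate and the 10 bits per byte on the wire (8 data bits plus
start and stop bits) are assumptions for illustration, not measurements.

```python
import zlib


def chars_per_second(modem_bps, bytes_per_char, bits_per_byte_on_wire=10):
    """Estimate characters per second of plain text over a modem link.

    bits_per_byte_on_wire=10 assumes 8 data bits plus start and stop bits.
    """
    return modem_bps / bits_per_byte_on_wire / bytes_per_char


def compressed_ratio(text, encoding="utf-8"):
    """Return compressed size divided by raw size for the encoded text."""
    raw = text.encode(encoding)
    return len(zlib.compress(raw, 9)) / len(raw)


# Assumed 14400 bps modem, 3 bytes per Devanagari character in UTF-8.
print(chars_per_second(14400, 3))

# Hypothetical sample: one Devanagari word plus a space, repeated. Such a
# repetitive sample compresses far better than real prose would.
sample = "\u0926\u0947\u0935\u0928\u093e\u0917\u0930\u0940 " * 200
print(compressed_ratio(sample))
```

Under these assumptions the worst case for UTF-8 comes out at a few
hundred characters per second, the same ballpark as the 500 quoted above.
Compression also narrows the gap between encodings, since the repeated
lead bytes of UTF-8 Devanagari are highly redundant.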

Geoffrey


