From: William J Poser (wjposer@ldc.upenn.edu)
Date: Wed Feb 06 2008 - 20:22:37 CST
The mention of the issue of whether Georgian encodes in UTF-8 as two
bytes or three bytes is yet another instance of something that puzzles me.
Why do some people on this list seem to care so much about text size?
We now have such large storage devices, so much memory, and such high
network bandwidth, that it strikes me as very odd that anyone would care
very much about modest differences in the size of texts. Primary memory
is somewhat less plentiful, and using less of it can improve performance, so
the question of which representation to use for processing makes some
sense. But why people would care how large a text is in UTF-8,
which is primarily intended for storage and transfer, mystifies me.
So, is this an essentially outdated obsession that some people have
not been able to shake? Are there people here working on applications
with so much text that modest differences in size are important? Are
some of you working in very restrictive environments such as embedded
systems or satellites? For whom does it really make a difference
how many bytes the UTF-8 encoding of their script requires?
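
For anyone who wants to check the numbers for a particular script, a minimal Python sketch along the following lines reports the UTF-8 byte count per character. The helper name utf8_len and the sample characters are assumptions chosen purely for illustration.

```python
# Illustrative sample characters (assumed for this sketch).
SAMPLES = {
    "Latin a (U+0061)": "a",
    "Greek alpha (U+03B1)": "\u03b1",
    "Georgian an (U+10D0)": "\u10d0",
}


def utf8_len(ch: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of ch."""
    return len(ch.encode("utf-8"))


# Print the per-character byte count for each sample.
for name, ch in SAMPLES.items():
    print(f"{name}: {utf8_len(ch)} byte(s)")
```

UTF-8 uses one byte for U+0000 to U+007F, two for U+0080 to U+07FF, three for U+0800 to U+FFFF, and four above that. The modern Georgian (Mkhedruli) letters start at U+10D0, which falls in the three-byte range.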
Bill