From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Feb 06 2008 - 00:48:58 CST
> -----Original Message-----
> From: Asmus Freytag [mailto:asmusf@ix.netcom.com]
> Sent: Tuesday, February 5, 2008 19:57
> To: verdy_p@wanadoo.fr
> Cc: 'Hans Aberg'; 'Jeroen Ruigrok van der Werven'; unicode@unicode.org
> Subject: Re: Factor implements 24-bit string type for Unicode support
>
> Philippe gave some very interesting arguments (complete with specific
> figures)
Actually, no. I was deliberately imprecise. My conclusion was that there's
no evidence that using UTF-32 alone, UTF-8 alone, or any other variant alone
will give a universal performance advantage.
That said, after rereading the message as you received it, I should have
corrected some obvious typos (missing letters, or "é typed instead of 32: on
my French keyboard the unshifted 3 and 2 keys produce " and é). Sorry for
the inconvenience; the message went out too fast (blame Ctrl+Enter in
Outlook for sending the message immediately while I was still editing
it...).
> but without citing his evidence or stating the assumptions. A
> thorough comparison of the performance of large data volumes in the
> various encoding forms would be interesting.
My point is that all performance figures will depend on the volume of data
to handle and on where it is located (or comes from, or will go to). That's
because modern architectures have become much more complex, with data going
through many more data pipes with various performance bottlenecks, and
through various internal caches whose behaviour cannot be predicted by a
fixed rule baked into precompiled software.
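To make this concrete, here is a quick sketch of the kind of
micro-benchmark I have in mind (my own illustration, not measurements; the
buffer sizes and the now_ns helper are assumptions): it scans working sets
of increasing size and reports throughput, which typically steps down each
time the working set outgrows a cache level.

/* A minimal cache-threshold probe. Build: cc -O2 probe.c */
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

int main(void) {
    /* Working-set sizes straddling typical L1/L2/L3 capacities (assumed). */
    static const size_t sizes[] = {
        16u << 10, 64u << 10, 256u << 10, 1u << 20, 8u << 20, 64u << 20
    };
    volatile unsigned sink = 0;
    for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++) {
        size_t n = sizes[i];
        unsigned char *buf = malloc(n);
        if (!buf) return 1;
        memset(buf, 'a', n);
        size_t total = 256u << 20;      /* touch 256 MiB per measurement */
        size_t passes = total / n;
        unsigned acc = 0;
        uint64_t t0 = now_ns();
        for (size_t p = 0; p < passes; p++)
            for (size_t j = 0; j < n; j++)
                acc += buf[j];
        uint64_t t1 = now_ns();
        sink += acc;                    /* keep the sum alive */
        printf("%8zu KiB working set: %6.2f GB/s\n",
               n >> 10, (double)(passes * n) / (double)(t1 - t0));
        free(buf);
    }
    return 0;
}

Run on a real machine, the throughput usually drops near each cache
boundary, and those boundaries differ from one processor generation to the
next, which is exactly why precompiled figures cannot be trusted.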
I don't think that this will ever become simpler in the future; we'll have
to live with the fact that computing architectures will necessarily be
heterogeneous, and that software will have to use auto-adaptive features
(that's a good point for the newest architectures based on virtual
machines, which are not optimized too early but allow auto-adaptation on
the final target host on which the software will effectively run).
I gave just a few figures for the current situation with today's most
common processors (whose internal data caches range from about 64 KB to
1 MB). But this threshold is likely to change at any time: a newer
generation of processor running at an even higher frequency may add a new
stage of data cache that is much more costly, and therefore smaller, than
the legacy cache system, which will persist or be split into several
subsets.
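To give an order of magnitude (my own rough numbers, assuming a
mostly-ASCII text): 300,000 code points take about 330 KB as UTF-8 and fit
comfortably within a 1 MB cache, while the same text takes exactly
1,200,000 bytes as UTF-32 and spills out of it. A tiny sketch to redo the
arithmetic (the 95%/5% script mix is a made-up assumption):

#include <stdint.h>
#include <stdio.h>

/* Bytes needed to encode one code point in UTF-8. */
static size_t utf8_len(uint32_t cp) {
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

int main(void) {
    size_t n = 300000;                  /* hypothetical corpus size    */
    size_t ascii = n * 95 / 100;        /* assumed 95% ASCII...        */
    size_t other = n - ascii;           /* ...5% like U+20AC (3 bytes) */
    size_t utf8  = ascii * utf8_len('a') + other * utf8_len(0x20AC);
    size_t utf32 = n * 4;
    printf("UTF-8 : %zu bytes (~%zu KB)\n", utf8, utf8 >> 10);
    printf("UTF-32: %zu bytes (~%zu KB)\n", utf32, utf32 >> 10);
    /* UTF-8 (~330 KB) fits a 1 MB cache; UTF-32 (1,200,000 bytes) does not. */
    return 0;
}

With a different mix of scripts the balance shifts, which is exactly why no
single encoding wins everywhere.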
If processors continue their evolution, we'll soon have models with dozens
of cores, each with a small data cache, because it won't be possible to
give 1 MB to each of them. Instead they will collaborate by exchanging data
through several pipelines connected to larger shared caches (but those
larger caches will have to serve more concurrent accesses, so they will be
slower, which creates the need for the new per-core data cache stage).
The performance penalty of misaligned data is likely to disappear in
practice, and it will no longer make a huge difference whether you read
data byte by byte or as a whole 32-bit unit over a large bus. In fact
you'll immediately realize that even UTF-32 is misaligned relative to
today's 64-bit processors, and that internally, processors use even larger
buses when communicating with their fastest caches. (However, this trend
may as well be reversed by reducing the bus width, due to synchronization
issues at very high frequencies. Note that RAM technologies now favour
serial 1-bit access over parallel buses for this very reason: at very high
frequencies the exact length of each signal line becomes extremely
important, and if it's not geometrically possible to ensure that every line
delivers its data at the same time, the frequency must be limited to keep
the data correctly synchronized.)
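If you want to see what misalignment actually costs on your own host, here
is a quick sketch (my own illustration, not a rigorous benchmark): it times
a byte-by-byte scan against 32-bit loads at a deliberately odd offset,
using memcpy for the unaligned loads so the test stays portable C.

/* Does misalignment still cost anything here? Build: cc -O2 align.c */
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define N ((size_t)64 << 20)            /* 64 MiB test buffer */

static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

int main(void) {
    unsigned char *buf = malloc(N + 8);
    if (!buf) return 1;
    memset(buf, 1, N + 8);
    volatile uint32_t sink = 0;
    uint32_t acc = 0;

    uint64_t t0 = now_ns();             /* byte-by-byte scan */
    for (size_t i = 0; i < N; i++)
        acc += buf[i];
    uint64_t t1 = now_ns();
    sink += acc;

    acc = 0;                            /* 32-bit loads at odd offset 1;    */
    uint64_t t2 = now_ns();             /* memcpy keeps the access portable */
    for (size_t i = 1; i + 4 <= N; i += 4) {
        uint32_t w;
        memcpy(&w, buf + i, 4);
        acc += w;
    }
    uint64_t t3 = now_ns();
    sink += acc;

    printf("byte-wise:         %6.2f GB/s\n", (double)N / (double)(t1 - t0));
    printf("misaligned 32-bit: %6.2f GB/s\n", (double)N / (double)(t3 - t2));
    free(buf);
    return 0;
}

The results depend heavily on the compiler and the host, which is precisely
the point: you have to measure on the target rather than assume.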
If processors follow what has happened in RAM technologies, they could just
as well reduce their internal working bus width and operate differently,
using a network of many very small 1-bit cores with high redundancy,
removing most of the mutual synchronization mechanisms that also require
extra energy to maintain their current state within internal buffering
registers.
Maybe you'll still be able to program your software using an x86 or IA-64
instruction set, but this will just be a virtual program that gets
recompiled and re-optimized locally on the final host. If this happens, the
memory alignment constraints baked into software will be a thing of the
past. But what will survive is the fact that the "one-size-fits-all"
optimization strategy will no longer apply, as various
compression/decompression steps will be added everywhere, transparently,
providing the auto-adaptation and scalability needed for more heterogeneous
environments.
Now if you look at the level at which Unicode is specified, it is in terms
of abstract code points (U+0000..U+10FFFF, conveniently handled as 32-bit
integers). This will be the apparent level at which you'll program things,
but it will not dictate how the data is effectively stored in memory or on
disk, or exchanged on the network, where various data compression steps (or
even expansion to words larger than 32 bits!) can be applied when and where
needed.
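As a sketch of what I mean (my own illustration, assuming UTF-8 as the
hidden storage form): the program below iterates over 32-bit code points
while the buffer underneath stays UTF-8; the decoder assumes well-formed
input and does no error handling.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Decode one code point from a valid UTF-8 buffer; returns bytes consumed.
   (Minimal sketch: assumes well-formed input.) */
static size_t next_codepoint(const unsigned char *s, uint32_t *out) {
    if (s[0] < 0x80) { *out = s[0]; return 1; }
    if (s[0] < 0xE0) { *out = ((uint32_t)(s[0] & 0x1F) << 6)
                            |  (s[1] & 0x3F); return 2; }
    if (s[0] < 0xF0) { *out = ((uint32_t)(s[0] & 0x0F) << 12)
                            | ((uint32_t)(s[1] & 0x3F) << 6)
                            |  (s[2] & 0x3F); return 3; }
    *out = ((uint32_t)(s[0] & 0x07) << 18)
         | ((uint32_t)(s[1] & 0x3F) << 12)
         | ((uint32_t)(s[2] & 0x3F) << 6)
         |  (s[3] & 0x3F); return 4;
}

int main(void) {
    /* "été" followed by U+20AC EURO SIGN, stored as UTF-8 bytes. */
    const unsigned char text[] = "\xC3\xA9t\xC3\xA9 \xE2\x82\xAC";
    for (size_t i = 0; text[i]; ) {
        uint32_t cp;
        i += next_codepoint(text + i, &cp);
        printf("U+%04X\n", (unsigned)cp); /* caller only sees code points */
    }
    return 0;
}

Nothing in the caller's loop depends on the byte layout, so the same loop
would still work if next_codepoint were reimplemented over UTF-16, SCSU, or
a compressed block format.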