From: Bjoern Hoehrmann (derhoermi@gmx.net)
Date: Mon Feb 21 2011 - 15:49:23 CST
* Philippe Verdy wrote:
>And anyway it is also much simpler to understand and easier to
>implement correctly (not like the sample code given here) than SCSU,
>and it is still very highly compressible with standard compression
>algorithms while still allowing very fast processing in memory in its
>decompressed encoded form:
>- a bit faster than UTF-8, as seen in my early benchmarks, for a small
>number of large texts such as pages in a Wiki database,
>- but a bit slower for a large number of small strings such as tabular
>data, because of the higher number of conditional branches when using
>a CPU with a 1-way instruction pipeline (not a problem with today's
>processors that include a dozen parallel pipelines even in a single
>core, if the compiled assembly code is correctly optimized and
>scheduled to make use of them when branch-prediction cannot help
>much).
From a very brief look, it seems to me that you could eliminate much of
the conditional logic there, at least as far as decoding goes, in the
same way I removed it from the UTF-8 decoder at
http://bjoern.hoehrmann.de/utf-8/decoder/dfa/ (there you could eliminate
branches completely, but as I recall it would cost you a register, among
other things). The main performance problem I ran into while developing
the decoder was actually compilers being silly...
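
To make the idea concrete, here is a rough sketch of the table-driven
approach for UTF-8. It is not the code from the page above (that one is
in C and packs everything into a single small table); the class
numbering, state names, and the functions decode_step and decode are
made up for this illustration. Python is used only to show the
structure, it says nothing about speed. Each byte is mapped to a
character class, and a (state, class) lookup replaces the usual chain of
range checks:

```python
# Simplified table-driven UTF-8 decoder. The layout of the tables is an
# assumption made for this sketch, not the layout used on the linked page.

ACCEPT, REJECT, ONE, TWO, THREE, AFTER_E0, AFTER_ED, AFTER_F0, AFTER_F4 = range(9)

# Byte classes: 0 ASCII, 1 = 80..8F, 2 = 90..9F, 3 = A0..BF, 4 = C2..DF,
# 5 = E0, 6 = E1..EC and EE..EF, 7 = ED, 8 = F0, 9 = F1..F3, 10 = F4,
# 11 = never valid (C0, C1, F5..FF).
BYTE_CLASS = [11] * 256
for lo, hi, cls in [(0x00, 0x7F, 0), (0x80, 0x8F, 1), (0x90, 0x9F, 2),
                    (0xA0, 0xBF, 3), (0xC2, 0xDF, 4), (0xE0, 0xE0, 5),
                    (0xE1, 0xEF, 6), (0xED, 0xED, 7), (0xF0, 0xF0, 8),
                    (0xF1, 0xF3, 9), (0xF4, 0xF4, 10)]:
    for b in range(lo, hi + 1):
        BYTE_CLASS[b] = cls

# Payload bits carried by a byte that starts a sequence, per class.
LEAD_MASK = [0x7F, 0, 0, 0, 0x1F, 0x0F, 0x0F, 0x0F, 0x07, 0x07, 0x07, 0]

# TRANS[state][class] -> next state; anything not listed is REJECT.
TRANS = [[REJECT] * 12 for _ in range(9)]
TRANS[ACCEPT][0] = ACCEPT
TRANS[ACCEPT][4] = ONE
TRANS[ACCEPT][5] = AFTER_E0
TRANS[ACCEPT][6] = TWO
TRANS[ACCEPT][7] = AFTER_ED
TRANS[ACCEPT][8] = AFTER_F0
TRANS[ACCEPT][9] = THREE
TRANS[ACCEPT][10] = AFTER_F4
for cont in (1, 2, 3):           # any continuation byte 80..BF
    TRANS[ONE][cont] = ACCEPT
    TRANS[TWO][cont] = ONE
    TRANS[THREE][cont] = TWO
TRANS[AFTER_E0][3] = ONE         # A0..BF only: no overlong 3-byte forms
TRANS[AFTER_ED][1] = ONE         # 80..9F only: no surrogates
TRANS[AFTER_ED][2] = ONE
TRANS[AFTER_F0][2] = TWO         # 90..BF only: no overlong 4-byte forms
TRANS[AFTER_F0][3] = TWO
TRANS[AFTER_F4][1] = TWO         # 80..8F only: nothing above U+10FFFF


def decode_step(state, codepoint, byte):
    """Consume one byte; return the new (state, codepoint)."""
    cls = BYTE_CLASS[byte]
    if state == ACCEPT:
        codepoint = byte & LEAD_MASK[cls]
    else:
        codepoint = (codepoint << 6) | (byte & 0x3F)
    return TRANS[state][cls], codepoint


def decode(data):
    """Decode bytes into a list of code points; ValueError on bad input."""
    out = []
    state, codepoint = ACCEPT, 0
    for byte in data:
        state, codepoint = decode_step(state, codepoint, byte)
        if state == ACCEPT:
            out.append(codepoint)
        elif state == REJECT:
            raise ValueError("invalid UTF-8")
    if state != ACCEPT:
        raise ValueError("truncated UTF-8")
    return out
```

The only decision left in decode_step is a two-way select between a
fresh code point and an accumulated one. The validity checks (overlong
forms, surrogates, values above U+10FFFF) are all in the table.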
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/