From: Jeroen Ruigrok van der Werven (asmodai@in-nomine.org)
Date: Tue Jul 08 2008 - 00:51:27 CDT
On [20080707 21:52], William J Poser (wjposer@ldc.upenn.edu) wrote:
>There seem to be religious views on this question, but my own practice is
>to use UTF-32 internally in almost all cases. Yes, it takes more memory
>than UTF-8, but the modest additional memory usage doesn't really matter
>much. On the other hand, dealing with UTF-32 is much easier and less error
>prone than dealing with UTF-8. Every four bytes is a character. You can do
>simple array arithmetic, simple calculations of how much memory you need
>to allocate, etc.
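To make the "simple array arithmetic" point concrete, here is a rough sketch in Python. The helper names utf32_char_at and utf8_char_at are made up for illustration, and "character" here means a single code point (a combining sequence still spans several code points in either encoding). With UTF-32 the byte offset is just 4 * index; with UTF-8 you have to scan for lead bytes.

```python
def utf32_char_at(buf: bytes, index: int) -> str:
    """Return the code point at `index` in a UTF-32-LE buffer (no BOM)."""
    start = 4 * index  # fixed width: every code point occupies 4 bytes
    if index < 0 or start + 4 > len(buf):
        raise IndexError(index)
    return buf[start:start + 4].decode("utf-32-le")


def utf8_char_at(buf: bytes, index: int) -> str:
    """Return the code point at `index` in a UTF-8 buffer by scanning."""
    if index < 0:
        raise IndexError(index)
    count = -1
    for pos, byte in enumerate(buf):
        if byte & 0xC0 != 0x80:  # lead byte, not a 10xxxxxx continuation byte
            count += 1
            if count == index:
                end = pos + 1
                # Extend over the continuation bytes of this code point.
                while end < len(buf) and buf[end] & 0xC0 == 0x80:
                    end += 1
                return buf[pos:end].decode("utf-8")
    raise IndexError(index)


text = "naïve イェルーン"
wide = text.encode("utf-32-le")
narrow = text.encode("utf-8")

# Memory needed for UTF-32 is simply 4 bytes per code point.
print(len(wide) == 4 * len(text))
print(utf32_char_at(wide, 6), utf8_char_at(narrow, 6))
```

Both helpers return the same code point; the difference is that the UTF-32 lookup is constant time while the UTF-8 lookup walks the buffer from the start.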
We recently tested this with Trac, running it on Python builds with 2-byte
and 4-byte Unicode storage. The additional memory consumption with 4-byte
storage was less than 5% for this web application. And given that it's an
issue tracker with an integrated wiki, it uses a lot of strings.
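As a back-of-the-envelope illustration of why doubling string storage can move the total so little, here is a small Python sketch. The function relative_increase and all the numbers in it are hypothetical, not the Trac measurements; it only assumes that string payload is one part of a process's memory and that everything else stays the same size.

```python
def relative_increase(strings, other_bytes):
    """Estimate the growth in total memory when going from 2-byte to
    4-byte string storage.

    `other_bytes` is an assumed figure for everything that is not string
    payload (objects, code, caches); it is not measured here.
    """
    narrow = sum(len(s.encode("utf-16-le")) for s in strings)  # 2-byte units
    wide = sum(len(s.encode("utf-32-le")) for s in strings)    # 4-byte units
    before = other_bytes + narrow
    after = other_bytes + wide
    return (after - before) / before


# Made-up workload: 100 short strings plus 58800 bytes of everything else.
sample = ["ticket"] * 100
growth = relative_increase(sample, other_bytes=58800)
print(f"{growth:.1%}")
```

In this invented case the string payload doubles from 1200 to 2400 bytes, but the total only grows by 2%, because the strings were a small share of the total to begin with.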
--
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
Man is the Dream of the dolphin...