From: 'Stephane Bortzmeyer' (bortzmeyer@nic.fr)
Date: Mon Oct 06 2003 - 06:37:44 CST
On Mon, Oct 06, 2003 at 01:52:26PM +0200,
Marco Cimarosti <marco.cimarosti@essetre.it> wrote
a message of 51 lines which said:
> a word like "élite" is always counted as five characters, regardless
> that it might be encoded as six Unicode "characters".
I assume that everybody on this list knows that you count characters
only after a proper normalization... (like many operations on Unicode
texts).
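
A minimal sketch of what "count after normalization" could look like, assuming Python's standard unicodedata module and NFC as the chosen normalization form (the function name count_characters is only illustrative). Note that NFC cannot compose every base + combining sequence into a single code point, so this is an approximation of user-perceived characters, not a definition:

```python
import unicodedata

def count_characters(text: str) -> int:
    """Count code points after NFC normalization (illustrative helper)."""
    return len(unicodedata.normalize("NFC", text))

# "élite" written with a combining accent: 'e' + U+0301 + "lite"
decomposed = "e\u0301lite"
# "élite" written with the precomposed U+00E9
precomposed = "\u00e9lite"

print(len(decomposed))                # 6 code points before normalization
print(count_characters(decomposed))   # 5 after NFC
print(count_characters(precomposed))  # 5, already composed
```

Both spellings of the word end up with the same count, which is the point of normalizing first.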
> 3) That is a very silly count anyway. If you want to have an idea of the
> "size" of a document, lines or words are much more useful units.
Tell that to the editor (editors of paper publications still talk in
this unit: "3 000 characters, no more, for tomorrow morning").
> OK. But the length in "characters" of a string is not "character semantics":
> it's plain nonsense, IMHO.
I disagree.