From: "Tom Emerson" <tree@basistech.com>
> But if I have a text string, and that string is encoded in UTF-16, and
> I want to access Unicode character values, then I cannot index that
> string in constant time.
>
> To find character n I have to walk all of the 16-bit values in that
> string accounting for surrogates. If I use UTF-32 I don't need to do
> that. This very issue came up during the discussion of how to handle
> surrogates in Python.
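To make the quoted point concrete, here is a minimal Python sketch of what "walking the 16-bit values" looks like. The helper names `utf16_units` and `code_point_at` are made up for illustration and are not from any library or from the Python discussion mentioned above.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s as a list of ints (illustrative helper)."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def code_point_at(units, n):
    """Return the n-th code point (0-based) by scanning from the start: O(n)."""
    i = 0
    count = 0
    while i < len(units):
        u = units[i]
        # A high surrogate followed by a low surrogate encodes one
        # supplementary code point in two 16-bit units.
        if (0xD800 <= u <= 0xDBFF and i + 1 < len(units)
                and 0xDC00 <= units[i + 1] <= 0xDFFF):
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            step = 2
        else:
            cp = u
            step = 1
        if count == n:
            return cp
        count += 1
        i += step
    raise IndexError("code point index out of range")
```

With UTF-32 the same lookup is a plain array index, which is the constant-time access the quoted text is asking for.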
Would this not be the same issue for composite characters, even *in* UTF-32?
If you truly mean to work with characters here, then it seems this is a
problem you will always have.
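A rough Python sketch of that point: a Python 3 `str` is indexed by code point, much like UTF-32, yet a base letter plus a combining mark still occupies two indices. The function name `combining_sequences` is hypothetical, and grouping by nonzero canonical combining class is a deliberate simplification, not the full Unicode text segmentation rules.

```python
import unicodedata


def combining_sequences(text):
    """Group each base code point with the combining marks that follow it.

    Simplified assumption: a code point with a nonzero canonical combining
    class attaches to the preceding sequence. Real grapheme cluster
    boundaries involve more rules than this.
    """
    sequences = []
    for ch in text:
        if sequences and unicodedata.combining(ch) != 0:
            sequences[-1] += ch
        else:
            sequences.append(ch)
    return sequences


# "e" + COMBINING ACUTE ACCENT, then "a": three code points, two characters.
s = "e\u0301a"
groups = combining_sequences(s)
```

Here `s[1]` is the bare combining accent, so finding the n-th user-visible character still needs a scan from the start, whatever the encoding form.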
MichKa
Michael Kaplan
Trigeminal Software, Inc.
http://www.trigeminal.com/
This archive was generated by hypermail 2.1.2 : Mon Sep 24 2001 - 14:42:37 EDT