Re: 3rd-party cross-platform UTF-8 support

From: Michael (michka) Kaplan (michka@trigeminal.com)
Date: Mon Sep 24 2001 - 15:50:56 EDT


From: "Tom Emerson" <tree@basistech.com>

> But if I have a text string, and that string is encoded in UTF-16, and
> I want to access Unicode character values, then I cannot index that
> string in constant time.
>
> To find character n I have to walk all of the 16-bit values in that
> string accounting for surrogates. If I use UTF-32 I don't need to do
> that. This very issue came up during the discussion of how to handle
> surrogates in Python.

Would this not be the same issue for composite characters, even *in* UTF-32?
If you truly mean to work with characters here, then it seems this is a
problem you will always have.
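
To make the comparison concrete, here is a rough Python sketch of both
walks. It is an illustration only, not anything from the Python surrogate
discussion: the function names are made up for this example, a Python 3 str
(which indexes by code point) stands in for UTF-32, and the "composite
character" rule is a deliberate simplification (a new character starts at
every code point whose canonical combining class is 0), not full grapheme
cluster segmentation.

```python
import unicodedata


def utf16_units(s):
    """Return the UTF-16 code units of s as a list of ints."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def is_surrogate_pair(units, i):
    """True if units[i] is a high surrogate followed by a low surrogate."""
    return (
        0xD800 <= units[i] <= 0xDBFF
        and i + 1 < len(units)
        and 0xDC00 <= units[i + 1] <= 0xDFFF
    )


def code_point_at(units, n):
    """Find code point n in UTF-16 data by walking from the start: O(n)."""
    i = 0
    for _ in range(n):
        # A surrogate pair is one code point stored in two 16-bit units.
        i += 2 if is_surrogate_pair(units, i) else 1
    if is_surrogate_pair(units, i):
        return 0x10000 + ((units[i] - 0xD800) << 10) + (units[i + 1] - 0xDC00)
    return units[i]


def combining_sequence_at(s, n):
    """Find the n-th base-plus-combining-marks sequence in s: also a walk.

    Simplified rule (an assumption of this sketch): any code point with a
    nonzero canonical combining class attaches to the preceding sequence.
    """
    seqs = []
    for ch in s:
        if seqs and unicodedata.combining(ch) != 0:
            seqs[-1] += ch
        else:
            seqs.append(ch)
    return seqs[n]


# UTF-16: U+10400 occupies two code units, so unit index != code point index.
units = utf16_units("a\U00010400b")
second_code_point = code_point_at(units, 1)

# Code-point indexing (the UTF-32 case): "e" + U+0301 is one user-visible
# character but two code points, so t[1] is the accent, not "x".
t = "e\u0301x"
second_character = combining_sequence_at(t, 1)
```

In the first case index 1 of the code units is a lone high surrogate; in
the second, index 1 of the code points is a bare combining accent. Either
way, constant-time indexing gives you a storage unit rather than a
character.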

MichKa

Michael Kaplan
Trigeminal Software, Inc.
http://www.trigeminal.com/


