Re: 3rd-party cross-platform UTF-8 support

From: Tom Emerson (tree@basistech.com)
Date: Mon Sep 24 2001 - 17:45:41 EDT


Michael (michka) Kaplan writes:
> > To find character n I have to walk all of the 16-bit values in that
> > string accounting for surrogates. If I use UTF-32 I don't need to do
> > that. This very issue came up during the discussion of how to handle
> > surrogates in Python.
>
> Would this not be the same issue for composite characters, even *in* UTF-32?

Yes, absolutely. However, in the case of Python we were concerned with
being able to access a character encoded as a surrogate pair as a single,
valid assigned character.
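
To make the indexing point concrete, here is a minimal sketch in Python
(written for a current Python 3, not the 2001 interpreter under discussion).
The helper names utf16_units and code_point_at are hypothetical, made up
for illustration. It shows the walk over 16-bit values needed to reach
code point n in UTF-16, and that a combining sequence still occupies two
code points however the string is stored.

```python
import struct


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints (hypothetical helper)."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def code_point_at(units, n):
    """Return the n-th code point (0-based) by walking UTF-16 code units.

    This is O(n): every unit before the target must be inspected, because
    a surrogate pair uses two units for one code point.
    """
    i = 0
    count = 0
    while i < len(units):
        unit = units[i]
        is_pair = (
            0xD800 <= unit <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        if is_pair:
            # Combine high and low surrogate into one supplementary code point.
            cp = 0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            step = 2
        else:
            cp = unit
            step = 1
        if count == n:
            return cp
        count += 1
        i += step
    raise IndexError("code point index out of range")


# 'a', U+10400 (outside the BMP), 'b'
text = "a\U00010400b"
units = utf16_units(text)

# In UTF-16 the string is four 16-bit values, so index 2 needs the walk.
third_via_utf16 = code_point_at(units, 2)

# With one fixed-width value per code point, index 2 is a direct lookup.
third_direct = ord(text[2])

# A base letter plus a combining accent is still two code points,
# so "character n" in the user's sense needs a walk in any encoding.
composite = "e\u0301"
composite_length = len(composite)
```

The last three lines are the point of the question quoted above: fixed-width
code points remove the surrogate walk, but not the walk over combining
sequences.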

> If you truly mean to work with characters here then it seems this is a
> problem you can always have.

Of course.

    -tree

-- 
Tom Emerson                                          Basis Technology Corp.
Sr. Sinostringologist                              http://www.basistech.com
  "Beware the lollipop of mediocrity: lick it once and you suck forever"


