Yes, that is what I said. 
"- If the storage is UTF-16, then UTF-16 indices are direct. To compute UCS-4 indices you parse from the start of the text."
Your example is UTF-16 text, so the UCS-4 indices are *not* direct--accessing a random UCS-4 index requires scanning from the start of the text. Here are the direct UTF-16 indices, plus the UCS-4 indices computed by parsing from the start.
text:    s  o  m  e  <s1> <s2> t  e  x  t  <s1> <s2>
UTF-16:  0  1  2  3  4    5    6  7  8  9  10   11   12
UCS-4:   0  1  2  3  4         5  6  7  8  9         10
So the 8th UCS-4 code value is "x", while the 8th UTF-16 code value is "e".
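To make the scan concrete, here is a minimal sketch in Python. It is only an illustration: the text is modeled as a list of 16-bit code units, the function names (utf16_index_of, char_at) are made up for this message, and treating an unpaired surrogate as a single character is my assumption, not something the standard prescribes here.

```python
def is_high_surrogate(unit):
    return 0xD800 <= unit <= 0xDBFF


def is_low_surrogate(unit):
    return 0xDC00 <= unit <= 0xDFFF


def is_pair_at(units, i):
    """True if a well-formed surrogate pair starts at UTF-16 index i."""
    return (i + 1 < len(units)
            and is_high_surrogate(units[i])
            and is_low_surrogate(units[i + 1]))


def utf16_index_of(units, char_index):
    """Map a UCS-4 (character) index to a UTF-16 index.

    This walks from the start of the text, so it is O(n), not direct.
    """
    i = 0
    for _ in range(char_index):
        if i >= len(units):
            raise IndexError("character index out of range")
        # A surrogate pair occupies two code units but is one character.
        i += 2 if is_pair_at(units, i) else 1
    if i >= len(units):
        raise IndexError("character index out of range")
    return i


def char_at(units, i):
    """Return the UCS-4 value of the character starting at UTF-16 index i.

    This is a direct access: no scanning is needed.
    """
    if is_pair_at(units, i):
        return 0x10000 + ((units[i] - 0xD800) << 10) + (units[i + 1] - 0xDC00)
    return units[i]  # BMP character, or an unpaired surrogate (assumption)


# "some" <s1> <s2> "text" <s1> <s2>, using U+10000 as the supplementary character
PAIR = [0xD800, 0xDC00]
units = [ord(c) for c in "some"] + PAIR + [ord(c) for c in "text"] + PAIR

i = utf16_index_of(units, 7)   # 8th character: requires a scan
print(i, chr(char_at(units, i)))   # 8 x
print(chr(units[7]))               # e  (8th UTF-16 code value, direct)
```

With UTF-16 storage, char_at is the cheap operation and utf16_index_of is the expensive one; with UCS-4 storage the costs would be the other way around.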
Does that answer your question?
Mark
----- Original Message ----- 
From: Yves Arrouye <Yves@centraal.com>
To: Unicode List <unicode@unicode.org>
Sent: Wednesday, May 19, 1999 11:13 AM
Subject: RE: FAQ
> 
> > - If the storage is UTF-16, then UTF-16 indices are direct. 
> > To compute UCS-4
> > indices you parse from the start of the text.
> > - If the storage is UCS-4, then UCS-4 indices are direct. To 
> > compute UTF-16
> > indices you parse from the start of the text.
> > - Supporting surrogate pairs does not require using UCS-4 indices.
> > 
> > Here is a simple example of a routine that accesses surrogate 
> > pairs with UTF-16
> > indices, and returns them as UCS-4 characters (here called UTF-32):
> 
> Ok. It works in a loop, but you can't provide random-access to the string,
> right? Suppose I have, stored on 16 bits, accessible through an str
> variable:
> 
> s o m e <s1> <s2> t e x t <s1> <s2>
> 
> (<s1> <s2> is a surrogate pair). I do have 12 words of useful information,
> and only 10 characters. So when I say:
> 
> str.getAt(7)
> 
> and I mean the 8th character, not the 8th word of storage, I do need to walk
> the string in order to get 'x' and not 'e'. The indices don't seem direct
> then.
> 
> Yves.
> 