RE: FAQ

From: Yves Arrouye (Yves@centraal.com)
Date: Wed May 19 1999 - 14:18:10 EDT

Next message: Takashi Moriyama: "NT RAID 341920 JPN: TS: No need scripts for VJE-Delta any more"
Previous message: mark.davis@us.ibm.com: "RE: FAQ"
Maybe in reply to: mark.davis@us.ibm.com: "RE: FAQ"
Next in thread: Yves Arrouye: "RE: FAQ"

> - If the storage is UTF-16, then UTF-16 indices are direct. To compute
>   UCS-4 indices you parse from the start of the text.
> - If the storage is UCS-4, then UCS-4 indices are direct. To compute
>   UTF-16 indices you parse from the start of the text.
> - Supporting surrogate pairs does not require using UCS-4 indices.
>
> Here is a simple example of a routine that accesses surrogate pairs with
> UTF-16 indices, and returns them as UCS-4 characters (here called UTF-32):

OK. It works in a loop, but you can't provide random access to the string
by character, right? Suppose I have the following, stored as 16-bit words
and accessible through a variable str:

s o m e <s1> <s2> t e x t <s1> <s2>

(<s1> <s2> is a surrogate pair.) I have 12 words of storage, but only
10 characters. So when I say:

str.getAt(7)

and I mean the 8th character, not the 8th word of storage, I need to walk
the string from the start in order to get 'x' and not 'e'. So the indices
don't seem direct.
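
To make the two kinds of index concrete, here is a minimal sketch in Python.
It is not the routine from the quoted message; the function names
(code_point_at_unit, unit_index_of_char, to_units) and the two supplementary
characters used as stand-ins for <s1> <s2> are assumptions made for
illustration. The text is modelled as a list of 16-bit code units.

```python
def is_high_surrogate(unit):
    return 0xD800 <= unit <= 0xDBFF


def is_low_surrogate(unit):
    return 0xDC00 <= unit <= 0xDFFF


def starts_pair(units, i):
    """True if a well-formed surrogate pair begins at code-unit index i."""
    return (is_high_surrogate(units[i])
            and i + 1 < len(units)
            and is_low_surrogate(units[i + 1]))


def code_point_at_unit(units, i):
    """Direct (constant-time) access by UTF-16 code-unit index."""
    if starts_pair(units, i):
        return 0x10000 + ((units[i] - 0xD800) << 10) + (units[i + 1] - 0xDC00)
    return units[i]


def unit_index_of_char(units, n):
    """Code-unit index of the n-th character (0-based); walks from the start."""
    i = 0
    for _ in range(n):
        if i >= len(units):
            raise IndexError("character index out of range")
        i += 2 if starts_pair(units, i) else 1
    if i >= len(units):
        raise IndexError("character index out of range")
    return i


def to_units(text):
    """Hypothetical helper: turn a Python str into a list of UTF-16 code units."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[k:k + 2], "little") for k in range(0, len(data), 2)]


# "some<s1><s2>text<s1><s2>", with U+10000 and U+10400 standing in for the pairs.
units = to_units("some\U00010000text\U00010400")

print(len(units))                                               # 12 words of storage
print(chr(units[7]))                                            # 8th word: 'e'
print(chr(code_point_at_unit(units, unit_index_of_char(units, 7))))  # 8th character: 'x'
```

Access by code-unit index is a single lookup, while access by character
index costs a scan proportional to the index, which is the walk described
above.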

Yves.


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:46 EDT