From: Doug Ewell (doug@ewellic.org)
Date: Sat Jun 05 2010 - 11:38:48 CDT
Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:
>> Of course, he will not have other UTF-8-like features, such as
>> avoidance of ASCII values in the final trail byte, and "fast forward
>> parsing" by looking at the first byte.
>
> The fast forward feature is certainly not decisive, but random
> accessibility (from any position and in any direction) is much more
> decisive and is a real positive factor for UTF-8, unlike the format
> proposed above, which can only be read in the forward direction, even
> if it can be accessed randomly to find the *next* character. To find
> the *previous* one, you have to scan backward until you reach at least
> one byte used to encode the character before it (otherwise, you don't
> know if a 1xxxxxxx byte is the first one in a sequence, even if you
> can know if a byte is the last one).
Kannan is looking for a format for a protocol that he is developing.
Maybe scanning backwards through a string is not a scenario that will
ever be encountered in this protocol. It's not for us to say.
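
For reference, the backward scan that UTF-8 permits can be sketched as
follows. This is only an illustration of the property Philippe describes,
assuming well-formed UTF-8 input; the function name prev_char_start is
made up for this example and is not part of any proposal in this thread.

```python
def prev_char_start(buf: bytes, pos: int) -> int:
    """Return the index of the lead byte of the character ending just before pos.

    Assumes buf is well-formed UTF-8 and pos is a character boundary
    (hypothetical helper, for illustration only).
    """
    if pos <= 0:
        raise ValueError("no character before position 0")
    i = pos - 1
    # UTF-8 continuation bytes match 10xxxxxx; lead bytes and ASCII never do,
    # so stepping back over continuation bytes lands on the lead byte.
    while i > 0 and (buf[i] & 0xC0) == 0x80:
        i -= 1
    return i


def chars_backward(buf: bytes):
    """Yield the characters of buf from last to first."""
    pos = len(buf)
    while pos > 0:
        start = prev_char_start(buf, pos)
        yield buf[start:pos].decode("utf-8")
        pos = start
```

Because the lead byte is recognizable on its own, the scan never has to
look past the start of the character it is locating. In a format where a
1xxxxxxx byte may be either a first or a middle byte, the loop would have
to continue into the preceding character before it could stop.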
--
Doug Ewell | Thornton, Colorado, USA | http://www.ewellic.org
RFC 5645, 4645, UTN #14 | ietf-languages @ http://is.gd/2kf0s