Re: Least used parts of BMP.

From: Doug Ewell (doug@ewellic.org)
Date: Sat Jun 05 2010 - 11:38:48 CDT

Next message: William J Poser: "Re: Hexadecimal digits"

Previous message: Doug Ewell: "Re: Overloading Unicode"
In reply to: Philippe Verdy: "RE: Least used parts of BMP."

Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

>> Of course, he will not have other UTF-8-like features, such as
>> avoidance of ASCII values in the final trail byte, and "fast forward
>> parsing" by looking at the first byte.
>
> The fast forward feature is certainly not decisive, but the random
> accessibility (from any position and in any direction) is certainly
> much more decisive and is a real positive factor for UTF-8, rather
> than the format proposed above, which can only be read in the forward
> direction, even if it can be accessed randomly to find the *next*
> character. To find the *previous* one, you have to scan backward until
> you eat at least one byte used to encode the character before it
> (otherwise, you don't know if a 1xxxxxx byte is the first one in a
> sequence, even if you can know if a byte is the last one).

Kannan is looking for a format for a protocol that he is developing.
Maybe scanning backwards through a string is not a scenario that will
ever be encountered in this protocol. It's not for us to say.
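
For anyone following the thread who wants to see what the backward-scan
property of UTF-8 amounts to, here is a rough sketch in Python. It is only
an illustration, not anything from Kannan's protocol: the function name
prev_char_start is made up for this example, and the code assumes the
buffer holds well-formed UTF-8. It relies on the fact that UTF-8
continuation bytes always have the bit pattern 10xxxxxx, so a lead byte
can be recognized without looking at anything before it.

```python
def prev_char_start(buf: bytes, pos: int) -> int:
    """Return the index where the UTF-8 character ending just before pos begins.

    Hypothetical helper for illustration; assumes buf is well-formed UTF-8
    and pos is a character boundary (or the end of the buffer).
    """
    if pos <= 0:
        raise ValueError("no character before position 0")
    i = pos - 1
    # Continuation bytes look like 10xxxxxx; step back over them until
    # we reach a byte that is not a continuation byte (the lead byte).
    while i > 0 and (buf[i] & 0xC0) == 0x80:
        i -= 1
    return i


# One-, two-, and three-byte characters: 'a', U+00E9, U+20AC.
data = "a\u00e9\u20ac".encode("utf-8")
```

Calling prev_char_start(data, len(data)) steps back over two continuation
bytes and stops at the lead byte of the three-byte character. A format
that marks only the last byte of a sequence cannot stop there without
first reading into the preceding character, which is the point Philippe
is making.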

--
Doug Ewell  |  Thornton, Colorado, USA  |  http://www.ewellic.org
RFC 5645, 4645, UTN #14  |  ietf-languages @ http://is.gd/2kf0s

This archive was generated by hypermail 2.1.5 : Sat Jun 05 2010 - 11:40:31 CDT