Re: Least used parts of BMP.

From: Doug Ewell (doug@ewellic.org)
Date: Sat Jun 05 2010 - 11:38:48 CDT


    Philippe Verdy <verdy underscore p at wanadoo dot fr> wrote:

    >> Of course, he will not have other UTF-8-like features, such as
    >> avoidance of ASCII values in the final trail byte, and "fast forward
    >> parsing" by looking at the first byte.
    >
    > The fast-forward feature is certainly not decisive, but random
    > accessibility (from any position and in any direction) matters
    > much more and is a real positive factor for UTF-8. The format
    > proposed above can only be read in the forward direction, even
    > though it can be accessed randomly to find the *next* character.
    > To find the *previous* one, you have to scan backward until you
    > consume at least one byte used to encode the character before it
    > (otherwise, you don't know if a 1xxxxxx byte is the first one in a
    > sequence, even if you can know if a byte is the last one).

    Kannan is looking for a format for a protocol that he is developing.
    Maybe scanning backwards through a string is not a scenario that will
    ever be encountered in this protocol. It's not for us to say.
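
    For reference, the backward scan that Philippe credits to UTF-8 can
    be sketched roughly as below. This is only an illustration, not part
    of Kannan's proposal: the function names are made up for this
    example, and it assumes the buffer already holds well-formed UTF-8.
    It relies on the fact that UTF-8 trail bytes always have the form
    10xxxxxx, so a lead byte can be recognized from any position.

```python
def is_continuation(byte: int) -> bool:
    """True if the byte has the UTF-8 trail-byte form 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def char_start(buf: bytes, pos: int) -> int:
    """Index of the lead byte of the character that contains buf[pos].

    Assumes buf is well-formed UTF-8 (hypothetical helper, no validation).
    """
    i = pos
    while i > 0 and is_continuation(buf[i]):
        i -= 1
    return i


def prev_char_start(buf: bytes, pos: int) -> int:
    """Index where the character ending just before pos begins."""
    if pos <= 0:
        raise ValueError("no character before position 0")
    return char_start(buf, pos - 1)


def iter_backward(buf: bytes):
    """Yield the characters of buf from last to first."""
    end = len(buf)
    while end > 0:
        start = prev_char_start(buf, end)
        yield buf[start:end].decode("utf-8")
        end = start


if __name__ == "__main__":
    data = "a\u00e9\u20ac\U0001F600".encode("utf-8")  # 1-, 2-, 3-, 4-byte characters
    for ch in iter_backward(data):
        print(repr(ch))
```

    A format in which a byte only tells you whether it is the *last* one
    of a sequence cannot use this shortcut, which is the limitation
    described in the quoted text.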

    --
    Doug Ewell  |  Thornton, Colorado, USA  |  http://www.ewellic.org
    RFC 5645, 4645, UTN #14  |  ietf-languages @ http://is.gd/2kf0s
    

