RE: Reviewing IETF documents

From: Ayers, Mike (Mike_Ayers@bmc.com)
Date: Mon Apr 16 2001 - 12:18:50 EDT

Next message: Becker, Joseph: "FW: Learn more about Windows XP's international features"
Previous message: David Starner: "Re: Identifiers"
Maybe in reply to: Florian Weimer: "Reviewing IETF documents"
Next in thread: Florian Weimer: "Re: Reviewing IETF documents"

<DougEwell2@cs.com>
I hope that the claim of "multiple UTF-8 representations" does indeed refer
to glyphs, in the sense that Unicode contains both precomposed characters and
separable elements, halfwidth and fullwidth ASCII variants, etc. I hope it
does *not* refer to the nonconformant practice of representing Unicode
characters with "non-shortest" UTF-8 sequences. Instances of that are not
the fault of UTF-8.
</DougEwell2@cs.com>
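
To make the distinction in the quoted text concrete, here is a small Python sketch. The sample characters and the helper name is_valid_utf8 are my own illustrative choices, not anything from the thread. It shows two legal spellings of the same visible character, and then a "non-shortest" byte sequence that a strict decoder refuses.

```python
# Legal variation: the same visible character spelled two ways in Unicode.
precomposed = "\u00e9"    # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT

# Both are valid, shortest-form UTF-8, yet the byte sequences differ.
precomposed_bytes = precomposed.encode("utf-8")   # b"\xc3\xa9"
decomposed_bytes = decomposed.encode("utf-8")     # b"e\xcc\x81"


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if a strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Nonconformant variation: "/" (U+002F) padded out to two bytes.
# This is a "non-shortest" sequence and is not valid UTF-8 at all.
overlong_slash = b"\xc0\xaf"

print(is_valid_utf8(precomposed_bytes))  # accepted
print(is_valid_utf8(decomposed_bytes))   # accepted
print(is_valid_utf8(overlong_slash))     # rejected by Python's strict decoder
```

The first kind of variation is a property of Unicode itself; the second only appears when an encoder or decoder fails to follow the UTF-8 rules.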

Is there an existing set of recommendations for dealing with this
problem (multiple legal compositions) in search and search-like
applications? Specifically, if there are multiple legal ways to represent a
character, how should the character be stored, should search text be
preprocessed, etc.? Pointers, anyone?
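
To illustrate the kind of preprocessing being asked about, here is a minimal Python sketch. It assumes that normalizing both the stored text and the query to a single Unicode normalization form is acceptable for the application. The function names search_key and contains are hypothetical, and the choice of NFKC plus case folding is only one possible policy (NFC would keep the fullwidth and halfwidth variants distinct).

```python
import unicodedata


def search_key(text: str) -> str:
    """Hypothetical helper: map text to one canonical form for comparison.

    NFKC composes combining sequences and also folds compatibility
    variants such as fullwidth ASCII; casefold() removes case differences.
    """
    return unicodedata.normalize("NFKC", text).casefold()


def contains(haystack: str, needle: str) -> bool:
    """Substring search over the normalized forms of both arguments."""
    return search_key(needle) in search_key(haystack)


stored = "cafe\u0301 menu"   # "e" + COMBINING ACUTE ACCENT
query = "caf\u00e9"          # precomposed e-acute

print(query in stored)                          # raw comparison misses the match
print(contains(stored, query))                  # normalized comparison finds it
print(contains("\uff21\uff22\uff23", "abc"))    # fullwidth "ABC" matches "abc"
```

Whether to normalize once at storage time or on every comparison is a separate design choice; the sketch normalizes at comparison time only to stay short.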

TiA,

/|/|ike


This archive was generated by hypermail 2.1.2 : Fri Jul 06 2001 - 00:17:16 EDT