Re: Case insensitive comparisons

From: Kenneth Whistler
Date: Tue Mar 28 2000 - 15:29:42 EST

Keld responded to David,

> On Tue, Mar 28, 2000 at 08:24:57AM -0800, wrote:
> > I was stepping through some code that did case insensitive comparison for
> > unicode, and noticed that the heart of the function converted each
> > character to lowercase before comparing them. If they matched, they were
> > considered equal, otherwise not.
> If you do case insensitive comparison, the right thing is to
> do it at the case insensitive level of the ISO/IEC ordering
> standard 14651. Don't do uppercase to lowercase mapping first,
> just compare directly, case insensitive.
> Keld

This is, indeed, one way to do case insensitive comparison. If you
have access to an implementation of string ordering according to
the forthcoming standard ISO/IEC 14651, or according to the corresponding
Unicode Standard: UTR #10 Unicode Collation Algorithm, and if that
implementation provides a good API that allows efficient, case insensitive
comparison of two strings according to a particular collation definition,
then this can be a good choice.

However, case folding does not necessarily depend on a collation
algorithm. See also Unicode Technical Report #21, Case Mappings, for
a discussion of case folding and a suggested data file for doing
locale-independent case folding.
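
As a rough illustration of folding without any collation tables, here
is a minimal Python sketch. It assumes Python's built-in str.casefold()
as a stand-in for the folding data file suggested in the technical
report; the function names caseless_equal and lowercase_equal are
hypothetical, chosen only for this example.

```python
def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings after locale-independent case folding.

    Assumption: str.casefold() stands in for a folding table; no
    collation tables or locale settings are consulted.
    """
    return a.casefold() == b.casefold()


def lowercase_equal(a: str, b: str) -> bool:
    """The approach described in the original question: lowercase both
    sides, then compare."""
    return a.lower() == b.lower()


if __name__ == "__main__":
    # Simple ASCII pairs behave the same under both approaches.
    print(caseless_equal("Unicode", "UNICODE"))
    print(lowercase_equal("Unicode", "UNICODE"))

    # Sharp s folds to "ss", so the two approaches can disagree.
    print(caseless_equal("Straße", "STRASSE"))
    print(lowercase_equal("Straße", "STRASSE"))
```

The two functions agree on most input. They differ where folding is
more than a one-to-one lowercase mapping, as with the German sharp s in
the last pair.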

There are circumstances under which one definitely does *not* want to
have case folding depend on particular collation tables or on locale
differences in comparison. The explanation and examples are provided
in the Case Mappings technical report.

