Re: Plane 14 language tags

From: Antoine Leca (Antoine.Leca@renault.fr)
Date: Thu Jun 29 2000 - 04:23:38 EDT


Murray Sargent wrote:
>
> Note that in C, it's essentially just as fast to make character comparisons
> with (ch | 0x20) as with ch alone, i.e., if you know ch is in an ASCII range
> (0 - 0x7F or 0xE0000 - 0xE007F), you can do a case insensitive compare as
> quickly as a case sensitive one.
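
For reference, a minimal sketch of the comparison described above, assuming
the characters are held as 32-bit code points. The names in_ascii_like_range
and eq_fold are made up for illustration. The trick is only safe for
characters where bit 0x20 really is the case bit: letters fold as expected,
digits and '-' are unchanged, but for example '@' and '`' would also compare
equal.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: true if ch lies in one of the two ASCII-like ranges
   mentioned above (0 - 0x7F or 0xE0000 - 0xE007F). */
bool in_ascii_like_range(uint32_t ch)
{
    return ch <= 0x7F || (ch >= 0xE0000 && ch <= 0xE007F);
}

/* Hypothetical helper: case-insensitive equality by forcing bit 0x20 on
   both sides. Callers are assumed to have checked the range first, and to
   pass only letters, digits or '-'. */
bool eq_fold(uint32_t a, uint32_t b)
{
    return (a | 0x20) == (b | 0x20);
}
```

With these, eq_fold('A', 'a') and eq_fold(0xE0041, 0xE0061) both hold, while
an ASCII letter never compares equal to its Plane 14 counterpart.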

Also note that Plane 14 tags are stored in surrogate form when UTF-16
is used (which happens quite often on a certain well-known operating system).
So each tag character is stored as (WCHAR_T[2])({0xDB40, 0xDC00 - 0xDC7F}).
The | with 0x20 should therefore be done *only* on the second (low)
surrogate code unit: if it is also done on the first, 0xDB40 becomes 0xDB60
and the result is offset by 0x8000 (landing in 0xE8020 - 0xE803F and
0xE8060 - 0xE807F).
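
To make the arithmetic concrete, here is a rough sketch in C, assuming
16-bit code units held in uint16_t. The function names from_surrogates and
fold_tag_pair are invented for this example and do not come from any
particular library.

```c
#include <stdint.h>

/* Combine a high/low surrogate pair into the code point it encodes. */
uint32_t from_surrogates(uint16_t hi, uint16_t lo)
{
    return 0x10000u + ((uint32_t)(hi - 0xD800u) << 10)
                    + (uint32_t)(lo - 0xDC00u);
}

/* Hypothetical helper: fold the case of one Plane 14 tag character stored
   as a surrogate pair. Only the low surrogate is touched; the high
   surrogate 0xDB40 must stay as it is. Other pairs are left alone. */
void fold_tag_pair(uint16_t pair[2])
{
    if (pair[0] == 0xDB40 && pair[1] >= 0xDC00 && pair[1] <= 0xDC7F)
        pair[1] |= 0x20;
}
```

For example, {0xDB40, 0xDC41} (tag 'A', U+E0041) becomes {0xDB40, 0xDC61},
which decodes to U+E0061. ORing both units instead gives {0xDB60, 0xDC61},
and from_surrogates(0xDB60, 0xDC61) is 0xE8061, i.e. 0x8000 too high.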

Antoine


