Re: regular expressions

From: Alain LaBont/e'/ (alb@sct.gouv.qc.ca)
Date: Wed Feb 05 1997 - 10:15:54 EST


At 15:19 97-02-04 -0800, John Cowan wrote:
>Alain LaBonté wrote:
>
>> That's the idea of a traditional locale. However in ISO/IEC CD 14651 we
>> completely redefine the comparison operation (we introduce the concept of
>> equivalence at different levels of comparison) and that allows you to do all
>> that with a single operation without changing locales...
>>
>> Hence "La Bonté" is equivalent to "labonte" at level 1, it is not at level 2
>> or higher... it is equivalent to "labonté" at level 2, it is not at level 3
>> or higher (note that there is a space in the reference, none in the
>> comparand, that is intentional), it is equivalent to "LaBonté" at level 3
>> and it is not at higher levels and finally it is absolutely identical (or
>> equivalent at level 4) to "La Bonté" (note that the standard makes this
>> code-independent too!)
>
>How about locales that use two different alphabets? Historically
>in Yugoslavia the strings "znati" (= to know) and
>"\x0437\x043D\x0430\x0442\x0438" were taken as "the same thing"
>in a fairly strong sense: a manuscript received by a publisher
>in Latin script might wind up being typeset in Cyrillic.
>(I don't know if Latin orthography is deprecated in
>Yugoslavia these days.) This certainly would not be
>true of Russian or English text.

That's up to them. If they want to interleave the letters of the two scripts
in their national order, they can do so by tailoring. The default keeps the
two scripts separate, but that does not prevent both from being used in the
same field: even the default has fully predictable provisions for this. That
case is fully handled.
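
To make the levels quoted above concrete, here is a minimal sketch in Python.
It is not the algorithm or the tables of ISO/IEC CD 14651; it only assumes a
toy model in which level 1 compares base letters, level 2 diacritics, level 3
case, and level 4 spaces and other special characters. The names level_keys
and equivalence_level are hypothetical, chosen for illustration.

```python
import unicodedata


def level_keys(text):
    """Return one key per comparison level (toy model, not ISO/IEC 14651)."""
    base, accents, case, specials = [], [], [], []
    # NFD splits "é" into "e" + combining acute, so precomposed and
    # decomposed input produce the same keys.
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            # Attach the diacritic to the preceding base letter, if any.
            if accents:
                accents[-1] += ch
        elif ch.isalnum():
            base.append(ch.lower())     # level 1: base letter
            accents.append("")          # level 2: diacritics on that letter
            case.append(ch.isupper())   # level 3: case of that letter
        else:
            # Level 4: spaces and punctuation, with their position.
            specials.append((len(base), ch))
    return ("".join(base), tuple(accents), tuple(case), tuple(specials))


def equivalence_level(a, b):
    """Highest level (0 to 4) through which a and b compare as equivalent."""
    level = 0
    for key_a, key_b in zip(level_keys(a), level_keys(b)):
        if key_a != key_b:
            break
        level += 1
    return level


if __name__ == "__main__":
    for other in ("labonte", "labonté", "LaBonté", "La Bonté"):
        print(other, equivalence_level("La Bonté", other))
```

Under these assumptions the four comparands give levels 1, 2, 3 and 4
respectively, matching the example quoted above. Tailoring, as for a
two-script national order, would amount to changing how the level 1 key is
built; the sketch does not attempt that.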

Alain LaBonté (version : 8 bits --- (-: )
Alain LaBont/e'/ (version : 7 bits --- )<:= !@#$%?&*()_+-=^~',."!!!)


