Transform Rule Syntax
cameron at lumoslabs.com
Thu Dec 17 13:19:18 CST 2015
Ah wonderful, thanks Philippe. That's something about regular expressions I
didn't know, but I was able to verify it in several programming languages.
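For anyone who wants to repeat the check, here is a minimal sketch using
Python's re module. The pattern names and the matches() helper are made up
for illustration, and Python's regex engine is not the CLDR transform rule
parser, so this only demonstrates the regular expression behavior Philippe
describes:

```python
import re

# A hyphen right after "[" is a literal member of the class, not a range
# separator. The escaped space "\ " matches a plain space.
leading = re.compile(r"[-\ ]")

# The same holds right after "[^": the hyphen is literally excluded.
negated = re.compile(r"[^-\ ]")

# For contrast, a hyphen between two characters denotes a range.
ranged = re.compile(r"[a-z]")


def matches(pattern, text):
    """Return True if the whole of `text` is matched by `pattern`."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for label, pattern in [("[-\\ ]", leading), ("[^-\\ ]", negated), ("[a-z]", ranged)]:
        for sample in ["-", " ", "m"]:
            print(repr(label), repr(sample), matches(pattern, sample))
```

The first class accepts only the hyphen and the space, the negated class
rejects exactly those two, and [a-z] does not accept a hyphen at all.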
On Wed, Dec 16, 2015 at 4:54 PM, Philippe Verdy <verdy_p at wanadoo.fr> wrote:
> When a dash-hyphen "-" appears as the first character within an inclusive
> (or negative) character class, just after "[" (or after "[^" in a negative
> class), it does not act as a range separator but stands for itself literally,
> as part of the inclusive character class (or as excluded from the negative
> class).
> This is how most regexp engines treat it, and you don't need to escape it
> (with a "\").
> So "[-\ ]" is the character class containing only the dash-hyphen and the
> space (which needs to be escaped in CLDR rules because unescaped whitespace
> is ignored there, as you noted), and it has NO range.
> 2015-12-17 1:25 GMT+01:00 Cameron Dutro <cameron at lumoslabs.com>:
>> Hey cldr-users,
>> I'm working with the CLDR transform rules and finding myself flummoxed.
>> Specifically I'm looking at this rule
>> in the es-es_FONIPA transform rule set. In this rule, we see what appears
>> to be a Unicode set or character class from a regular expression: [-\ ]
>> Either way, this does not appear to be valid syntax. Hyphens are used in
>> character classes to denote ranges of characters, for example [a-z].
>> Literal hyphens must be escaped. The hyphen in question is neither part of
>> a range nor escaped. Why is this? Finally, it appears the character class
>> contains an escaped space character. Space characters are not required to
>> be escaped in character classes.
>> My suspicion is that this syntax is to be treated in a special way since
>> it is used in the context of transformation rules. Please let me know if
>> this is the case. I have been unable to find any documentation regarding
>> the special treatment of hyphens in UTS #35 or other documents.