From: Andrew West (andrewcwest@gmail.com)
Date: Fri Mar 19 2010 - 03:51:32 CST
On 18 March 2010 21:57, karl williamson <public@khwilliamson.com> wrote:
>
> Section 4.8 of TUS 5.2 says:
>
> R3
> U+002D hyphen-minus does not occur ... immediately preceding a space
> character.
>
> But these characters violate that:
> U+0F0A TIBETAN MARK BKA- SHOG YIG MGO
> U+0FD0 TIBETAN MARK BSKA- SHOG GI MGO RGYAN
> U+0FD0 TIBETAN MARK BKA- SHOG GI MGO RGYAN

They certainly do conform to the character naming rules, but the
"Character Name Syntax" rules -- which seem to have been newly added
to the text of Unicode 5.2 -- misstate the rules with regard to the
hyphen-minus character. The text of the forthcoming 2010 edition of
ISO/IEC 10646 explicitly allows hyphen-minus immediately preceding a
space:
24.2 Name formation
An entity name shall consist only of the following characters
• LATIN CAPITAL LETTER A through LATIN CAPITAL LETTER Z,
• DIGIT ZERO through DIGIT NINE,
• SPACE,
• HYPHEN-MINUS, and
• FULL STOP if the entity being named is a collection
The first character in an entity name shall be a Latin capital letter.
The last character in an entity name shall be either a Latin capital
letter or a Digit.
An entity name shall not contain two or more consecutive SPACE
characters or consecutive HYPHEN-MINUS characters. A collection name
shall not contain two or more consecutive FULL STOP characters.
A sequence of a SPACE followed by a HYPHEN-MINUS or a sequence of a
HYPHEN-MINUS followed by a SPACE may appear only in character names or
named UCS sequence identifiers.
EXAMPLE 1 Each of the following two character names contains a
consecutive SPACE and HYPHEN-MINUS:
TIBETAN LETTER -A
TIBETAN MARK BKA- SHOG YIG MGO
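
The name formation rules quoted above can be checked mechanically. What follows is only a minimal sketch in Python, not anything taken from ISO/IEC 10646 or the Unicode Standard: the function name is_valid_entity_name and its two flags (is_collection, allow_space_hyphen) are hypothetical names chosen for illustration. It assumes the caller knows whether the entity is a collection, and whether it is a character name or named UCS sequence identifier (the only cases in which SPACE may sit next to HYPHEN-MINUS).

```python
import string

# Characters permitted in any entity name (FULL STOP is added for collections).
BASE_ALLOWED = set(string.ascii_uppercase + string.digits + " -")


def is_valid_entity_name(name, is_collection=False, allow_space_hyphen=True):
    """Check a name against the quoted clause 24.2 rules.

    is_collection: hypothetical flag, True if the entity is a collection.
    allow_space_hyphen: hypothetical flag, True for character names and
    named UCS sequence identifiers.
    """
    if not name:
        return False

    allowed = BASE_ALLOWED | {"."} if is_collection else BASE_ALLOWED
    if any(ch not in allowed for ch in name):
        return False

    # First character: Latin capital letter. Last: capital letter or digit.
    if name[0] not in string.ascii_uppercase:
        return False
    if name[-1] not in string.ascii_uppercase + string.digits:
        return False

    # No consecutive SPACEs, HYPHEN-MINUSes, or (for collections) FULL STOPs.
    if "  " in name or "--" in name or ".." in name:
        return False

    # SPACE adjacent to HYPHEN-MINUS only where explicitly permitted.
    if not allow_space_hyphen and (" -" in name or "- " in name):
        return False

    return True


for candidate in ("TIBETAN LETTER -A", "TIBETAN MARK BKA- SHOG YIG MGO"):
    print(candidate, is_valid_entity_name(candidate))
```

Under these assumptions both example names are accepted as character names, and both are rejected if allow_space_hyphen is set to False, which is the stricter reading that R3 in TUS 5.2 appears to state.
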
Andrew