From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Fri Mar 19 2010 - 11:41:17 CST
On 3/19/2010 2:51 AM, Andrew West wrote:
> On 18 March 2010 21:57, karl williamson <public@khwilliamson.com> wrote:
>
>> Section 4.8 of TUS 5.2 says:
>>
>> R3
>> U+002D hyphen-minus does not occur ... immediately preceding a space
>> character.
>>
>> But these characters violate that:
>> U+0F0A TIBETAN MARK BKA- SHOG YIG MGO
>> U+0FD0 TIBETAN MARK BSKA- SHOG GI MGO RGYAN
>> U+0FD0 TIBETAN MARK BKA- SHOG GI MGO RGYAN
>>
>
> They certainly do conform to the character naming rules, but the
> "Character Name Syntax" rules -- which seem to have been newly added
> to the text of Unicode 5.2 -- misstate the rules with regard to the
> hyphen-minus character. The text of the forthcoming 2010 edition of
> ISO/IEC 10646 explicitly allows hyphen-minus immediately preceding a
> space:
>
> 24.2 Name formation
> An entity name shall consist only of the following characters
> • LATIN CAPITAL LETTER A through LATIN CAPITAL LETTER Z,
> • DIGIT ZERO through DIGIT NINE,
> • SPACE,
> • HYPHEN-MINUS, and
> • FULL STOP if the entity being named is a collection
> The first character in an entity name shall be a Latin capital letter.
>
The actual rules for *character* names have always been that the
first character in a *word* (i.e. following a space, or start of name)
must be a Latin capital letter, except that hyphen-minus may start any
word but the first.
I'm not aware that this was changed deliberately, so to me the
above statement of the rules seems to contain an editing mistake resulting
from their recent reformulation.
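
To make the word-level rule concrete, here is a minimal Python sketch of it as I
have stated it above. The function name `words_start_correctly` is made up for
illustration, and the check covers only the first character of each word, not
the full set of permitted characters.

```python
import string


def words_start_correctly(name: str) -> bool:
    """Check only the word-initial rule described above (illustrative sketch)."""
    words = name.split(" ")
    for index, word in enumerate(words):
        if not word:
            # An empty word means a leading, trailing, or doubled space.
            return False
        first = word[0]
        if first in string.ascii_uppercase:
            continue
        # Hyphen-minus may start any word but the first.
        if first == "-" and index > 0:
            continue
        return False
    return True
```

Under this reading, both TIBETAN LETTER -A and TIBETAN MARK BKA- SHOG YIG MGO
pass, since a trailing hyphen-minus in a word is not touched by the rule at all.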
> The last character in an entity name shall be either a Latin capital
> letter or a Digit.
>
This seems to needlessly rule out a hypothetical
TIBETAN LETTER A-
While this may not occur as a part of Tibetan characters
as far as they have been encompassed, it looks like an
unnecessary restriction in the face of future naming
requirements for this and other scripts.
However, there might be a strong technical reason for
this restriction, which hasn't occurred to me yet. In
that case, I'm sure someone here can enlighten me.
> An entity name shall not contain two or more consecutive SPACE
> characters or consecutive HYPHEN-MINUS characters. A collection name
> shall not contain two or more consecutive FULL STOP characters.
> A sequence of a SPACE followed by a HYPHEN-MINUS or a sequence of a
> HYPHEN-MINUS followed by a SPACE may appear only in character names or
> named UCS sequence identifiers.
> EXAMPLE 1 Each of the following two character names contains a
> consecutive SPACE and HYPHEN-MINUS:
> TIBETAN LETTER -A
> TIBETAN MARK BKA- SHOG YIG MGO
>
> Andrew
>
>
>
>
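For anyone who wants to try names against the 24.2 text quoted above, here is a
rough Python sketch of those rules as they apply to character names. It assumes
the quoted wording is complete, ignores collection names (so FULL STOP is not
accepted), and the name `is_valid_character_name` is my own invention, not
anything from the standard.

```python
import re
import string

# Characters permitted by the quoted 24.2 text for non-collection names:
# Latin capital letters, digits, SPACE and HYPHEN-MINUS.
_ALLOWED = re.compile(r"[A-Z0-9 \-]+")


def is_valid_character_name(name: str) -> bool:
    """Apply the quoted 24.2 rules to a character name (illustrative sketch)."""
    if not _ALLOWED.fullmatch(name):
        # Empty name or a character outside the permitted set.
        return False
    if name[0] not in string.ascii_uppercase:
        # The first character shall be a Latin capital letter.
        return False
    if name[-1] not in string.ascii_uppercase + string.digits:
        # The last character shall be a Latin capital letter or a digit.
        return False
    if "  " in name or "--" in name:
        # No consecutive SPACE or consecutive HYPHEN-MINUS characters.
        return False
    # SPACE + HYPHEN-MINUS and HYPHEN-MINUS + SPACE are allowed in
    # character names, so no further check is made for them.
    return True
```

With this sketch the two Tibetan examples from the quoted text are accepted,
while the hypothetical TIBETAN LETTER A- is rejected solely by the
last-character rule.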