Re: Public Review Issue Update: #100, "Giving U+00B7 MIDDLE DOT the ID_Continue Property"

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Jan 16 2007 - 16:47:55 CST


From: "Kent Karlsson" <kent.karlsson14@comhem.se>
> Kenneth Whistler wrote:
>> However, the particular concern for Catalan arises because of
>> the *canonical* equivalence involving MIDDLE DOT for the
>> characters U+013F LATIN CAPITAL LETTER L WITH MIDDLE DOT
>> and U+0140 LATIN SMALL LETTER L WITH MIDDLE DOT.
>
> Which canonical equivalence? There is a *compatibility* equivalence
> involving these characters, though...

It is *near canonical* because many Catalan texts are encoded in ISO-8859-1, where L WITH MIDDLE DOT does not exist and MIDDLE DOT is used after L instead. With the simple mapping from ISO-8859-1 to Unicode, such Catalan text appears with a MIDDLE DOT acting as a spacing diacritic.
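
For reference, the canonical versus compatibility distinction discussed above can be checked against the Unicode Character Database. The following is only a minimal sketch using Python's standard unicodedata module; the helper decomposition_kind and the sample word are illustrative choices, not part of the original message, and the results depend on the Unicode data bundled with the interpreter.

```python
import unicodedata

L_DOT_CAPITAL = "\u013F"  # LATIN CAPITAL LETTER L WITH MIDDLE DOT
L_DOT_SMALL = "\u0140"    # LATIN SMALL LETTER L WITH MIDDLE DOT


def decomposition_kind(ch):
    """Classify the decomposition mapping of ch (hypothetical helper).

    In the Unicode Character Database a compatibility mapping is
    prefixed with a tag such as "<compat>"; a canonical mapping has
    no tag; an empty string means there is no decomposition at all.
    """
    mapping = unicodedata.decomposition(ch)
    if not mapping:
        return "none"
    return "compatibility" if mapping.startswith("<") else "canonical"


# A Catalan word as stored in legacy ISO-8859-1 (byte 0xB7 is MIDDLE DOT),
# decoded with the simple one-to-one mapping to Unicode.
legacy_bytes = b"col\xb7lecci\xf3"
from_latin1 = legacy_bytes.decode("iso-8859-1")

# The same word written with the single character U+0140.
with_precomposed = "co\u0140lecci\u00f3"

for ch in (L_DOT_CAPITAL, L_DOT_SMALL):
    print(f"U+{ord(ch):04X}", unicodedata.decomposition(ch), decomposition_kind(ch))

# Canonical normalization (NFC) keeps the two spellings distinct;
# only compatibility normalization (NFKC) makes them compare equal.
print(unicodedata.normalize("NFC", with_precomposed) == from_latin1)
print(unicodedata.normalize("NFKC", with_precomposed) == from_latin1)
```

In other words, the two spellings are unified only by the compatibility forms (NFKC/NFKD), which is why text converted from ISO-8859-1 and text using U+0140 do not match under canonical normalization alone.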

But there are other spacing diacritics (or letter modifiers) which may be part of orthographies, including:

U+005E ^ (ASCII circumflex, high caret symbol) -> preferably use U+02C6 (spacing circumflex)
U+0060 ` (ASCII backquote symbol, or spacing grave accent)
U+007E ~ (ASCII low tilde, symbol) -> preferably use U+02DC (spacing high tilde)
U+00A8 ¨ (spacing diaeresis)
U+00AF ¯ (spacing macron above) -> preferably use U+02C9 (macron letter modifier)
U+00B4 ´ (spacing acute)
U+00B8 ¸ (spacing cedilla)
U+02C7 ˇ (spacing caron)
etc...
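
To compare how the characters listed above are currently classified, one can print their General_Category values next to that of the middle dot. This is just a sketch under the assumption that Python's standard unicodedata module is available; the CANDIDATES list and the describe helper are names made up for this illustration.

```python
import unicodedata

# MIDDLE DOT plus the spacing diacritics / modifier letters listed above.
CANDIDATES = [
    0x00B7,  # middle dot
    0x005E, 0x0060, 0x007E,          # ASCII circumflex, grave, tilde
    0x00A8, 0x00AF, 0x00B4, 0x00B8,  # Latin-1 diaeresis, macron, acute, cedilla
    0x02C6, 0x02C7, 0x02C9, 0x02DC,  # spacing modifier letters block
]


def describe(cp):
    """Return (code point label, General_Category, character name)."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.category(ch), unicodedata.name(ch))


for row in (describe(cp) for cp in CANDIDATES):
    print(*row)
```

The middle dot is reported as punctuation (Po), while the other characters fall into symbol or modifier-letter categories, so the listed characters do not form a single class as far as the character database is concerned.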

So why make something specific for the middle dot, when it is just one particular spacing diacritic or letter modifier, and not for the spacing backquote, the ASCII tilde, and so on?

And why not the apostrophes? They are part of the orthography of unbreakable words. For example, "aujourd’hui" is now a single word in French that can't be broken at the apostrophe (by contrast, "d’une" is two words with a required elision, but it is also unbreakable at the apostrophe).

What about the backward apostrophe used in some romanized languages, where it now transliterates an alef or a similar letter of a previous script such as Arabic?



This archive was generated by hypermail 2.1.5 : Thu Jan 18 2007 - 15:55:40 CST