From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Apr 15 2004 - 14:16:33 EDT
From: "Patrick Andries" <Patrick.Andries@xcential.com>
> António Martins-Tuválkin wrote:
> >>However I advise removal of the note "Catalan" under U+0140 and
> >>U+013F, and perhaps replacement of the whole note with «for Catalan
> >>use U+006C U+00B7» (resp. U+004C).
> >>
> Did you get an answer on this? Why is there no decomposition associated
> with this character?
>
> Also, did someone mention why U+0140 is even in Unicode, since it could
> be considered (by ignorami like me) as a precomposed character (l +
> middle dot)? Is it due to the polysemy of the middle dot?
I thought this had already been answered on this list by a Catalan-speaking
contributor: the sequence L + middle dot in Catalan is NOT a combining sequence.
The middle dot in Catalan plays a role similar to a hyphen between syllables:
it distinguishes the word from spellings where a plain double L would suggest
a different reading. The dot indicates that each L must be read distinctly (or
read as a long or emphatic L).
In French, for example, we have words like "maille", read as /maj/, and the
same written "-ill-" diphthongs after another vowel occur in Catalan. But French
does not write "-ill-" between two vowels where the two L's must keep the sound
L (when this occurs in French, only one L is written, and the emphatic/long
sound is not marked). Catalan does have this orthography, and writes the
emphatic/long L distinctly, so it needs a symbol for that. The middle dot is
then considered in Catalan as a letter that occurs in the middle of words.
I don't know whether the middle dot can serve in Catalan as a candidate position
for a line break with hyphenation: if so, is it kept before the hyphen, is the
middle dot used alone, or is it replaced by a regular hyphen? I don't know. But
if the middle dot must be replaced by a hyphen, then it behaves as punctuation
(similar to the hyphens used in compound words).
But in Catalan, the middle dot should not be kerned into the preceding uppercase
L, as it would appear if the sequence were considered equivalent to the
precomposed <L with middle dot>. Catalan has no use for such a decomposition, and
if a canonical decomposition had existed, it would have been into L + a combining
left middle dot, not into L + the spacing middle dot.
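
For anyone who wants to check what the character database actually records for
U+013F and U+0140, here is a minimal sketch using Python's standard unicodedata
module. The helper name decomposition_of is only an illustration (it is not part
of any library), and the results depend on the Unicode version bundled with the
interpreter.

```python
import unicodedata

def decomposition_of(ch):
    """Return (tag, [code points]) from the decomposition field, or None.

    Hypothetical helper: the tag is "canonical" when the field carries no
    <...> compatibility tag.
    """
    field = unicodedata.decomposition(ch)
    if not field:
        return None
    parts = field.split()
    if parts[0].startswith("<"):
        tag, cps = parts[0], parts[1:]
    else:
        tag, cps = "canonical", parts
    return tag, [int(p, 16) for p in cps]

def code_points(s):
    """Format a string as a list of U+XXXX labels."""
    return [f"U+{ord(c):04X}" for c in s]

for ch in ("\u013F", "\u0140"):
    print(f"U+{ord(ch):04X}", decomposition_of(ch))
    print("  NFC :", code_points(unicodedata.normalize("NFC", ch)))
    print("  NFKC:", code_points(unicodedata.normalize("NFKC", ch)))
```

As far as I can tell, the database lists only a compatibility mapping (tagged
<compat>) to the base letter followed by U+00B7, so NFC leaves both characters
untouched while NFKC replaces them with the two-character sequence.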
If there's something really missing for Catalan, it's a middle-dot letter with
general category "Lo" and combining class 0 (i.e. NOT combining). It's
unfortunate that almost all legacy Catalan text transcoded to Unicode is based
on the middle dot character (the one mapped in ISO-8859-1 and ISO-8859-15), which
Unicode treats not as a letter (Lo) but only as punctuation (Po).
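
The properties discussed above can be inspected the same way. This is only a
sketch, assuming Python's standard unicodedata module; describe and
is_all_letters are hypothetical helper names, and "col·legi" is used purely as a
sample Catalan word.

```python
import unicodedata

def describe(ch):
    """Collect the name, general category and combining class of one character."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        "combining": unicodedata.combining(ch),
    }

def is_all_letters(word):
    """True if every character has a general category starting with 'L'."""
    return all(unicodedata.category(c).startswith("L") for c in word)

# U+00B7, the character found in text transcoded from ISO-8859-1/-15.
print(describe("\u00B7"))

# A word spelled with L + U+00B7 + L: the dot is not counted as a letter,
# so a naive "letters only" test rejects the word.
print(is_all_letters("col\u00B7legi"))

# The same word spelled with the precomposed U+0140 passes that test.
print(is_all_letters("co\u0140legi"))
```

A consequence is that any process relying on general category alone (a simple
word tokenizer, for instance) may split such a word at the dot unless it makes a
special case for U+00B7.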