Re: UAX #29 beta update (text breaks): apostrophe ./. H

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon Oct 27 2003 - 17:54:14 CST


----- Original Message -----
From: "Peter Kirk" <peterkirk@qaya.org>
To: "Philippe Verdy" <verdy_p@wanadoo.fr>
Cc: <unicode@unicode.org>
Sent: Tuesday, October 28, 2003 12:16 AM
Subject: Re: UAX #29 beta update (text breaks): apostrophe ./. H

> On 27/10/2003 13:34, Philippe Verdy wrote:
>
> >The proposed update to UAX#29 contains this text:
> >
> >Apostrophe is another tricky case. Usually considered part of one word
> >("can't", "aujourd'hui") ...
> >
>
> >...
> >
> >So in French we also have the additional word break rule:
> >
> > hyphens ÷ LatinLetterH
> >
> >This case is not documented...
> >
> >
> >
> This new rule also fails with the original case under consideration,
> aujourd'hui. But then maybe it should fail, as this is presumably a
> contraction of au jour de hui. So, unless we have other cases in French
> of an apostrophe followed by a consonant other than h which should not
> be taken as a word break, the best simple rule is to count apostrophe as
> always a word break. Or maybe a better rule (for French, not Italian)
> would relate to the number of letters in the word before the apostrophe,
> which is always one in the word break examples cited.

I replied with this exception in another message.
"Aujourd'hui" is really an exception. Maybe there are a few others, but I
can't remember any other example.

I will not consider the number of letters before the apostrophe, simply
because there are other examples of word contractions marked by an
apostrophe, notably in the written form of the popular language, where the
apostrophe replaces the end of a word without creating a single compound
word with the next word.

There are a few common cases like:

- "lorsqu'il" (two words), contraction of (incorrect) "lorsque il".
- "qu'il" (two words), contraction of (incorrect) "que il".
- "puisqu'il" (two words), contraction of (incorrect) "puisque il".
- "puisqu'elles" (two words), contraction of (incorrect) "puisque elles".
- "just'assez" (two words), rare contraction (used in the written form of
the spoken language) of "juste assez" (the normal way to write it), but also
written sometimes as "just' assez" with an explicit space

But here are a few other exceptions:

- "presqu'île" (one word), feminine noun, contraction of adjective+noun
"presque-île"
- "entr'apercevoir" (one word), verb, contraction of prefix+verb
"entre-apercevoir", the prefix replacing the adverb "entre" followed by an
implied object, and meaning "apercevoir (quelque chose) entre (deux choses)";
this is sometimes written just as "entrapercevoir" without the apostrophe.
- "entr'acte" (old form, contraction of "entre-acte", with reference to the
period that occurs "entre deux actes"), now most often written "entracte"
without the apostrophe.
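
To make the two lists above concrete, here is a rough Python sketch of the
behaviour I have in mind: break after an apostrophe by default, unless the
whole token is one of the known one-word exceptions. This is only an
illustration, not anything proposed in UAX #29. The function name
split_french_words and the ONE_WORD_EXCEPTIONS table are hypothetical, the
table contains only the words cited in this message (it is certainly not
complete), and keeping the apostrophe attached to the elided word is my own
assumption.

```python
import re
import unicodedata

# Hypothetical exception table: only the one-word cases cited in this
# message. A real tailoring would need a much longer, curated list.
ONE_WORD_EXCEPTIONS = {
    "aujourd'hui",
    "presqu'île",
    "entr'apercevoir",
    "entr'acte",
}

# A token is a run of letters, possibly joined by ASCII or typographic
# apostrophes. [^\W\d_] matches any Unicode letter.
_TOKEN = re.compile(r"[^\W\d_]+(?:['\u2019][^\W\d_]+)*")

# Inside a token, a piece is a run of non-apostrophes plus the apostrophe
# that follows it, if any.
_PIECE = re.compile(r"[^'\u2019]+['\u2019]?")


def split_french_words(text):
    """Split text into words, breaking after apostrophes except in
    the listed exceptions. Punctuation and spaces are dropped."""
    # Compose accents first so that "î" is a single letter.
    text = unicodedata.normalize("NFC", text)
    words = []
    for match in _TOKEN.finditer(text):
        token = match.group(0)
        key = token.lower().replace("\u2019", "'")
        if key in ONE_WORD_EXCEPTIONS:
            # Exception: the apostrophe is word-internal.
            words.append(token)
        else:
            # Default: the apostrophe ends the elided word (assumption:
            # it stays attached to that word).
            words.extend(_PIECE.findall(token))
    return words


print(split_french_words("Aujourd'hui, lorsqu'il voit la presqu'île"))
```

With this sketch, "lorsqu'il" and "just'assez" come out as two words each,
while "aujourd'hui" and "presqu'île" stay whole. Note that a rule based on
the number of letters before the apostrophe would not separate these cases,
which is why a word list is used here instead.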


