Combining character example

Philippe Verdy verdy_p at
Thu Apr 16 07:11:37 CDT 2015

2015-04-16 11:32 GMT+02:00 "Jörg Knappen" <jknappen at>:

> Digraphs (e.g. the "ou" in words borrowed from French) also have either a
> single line
> under the whole digraph or (this happens rarely) a single dot in the
> middle of the
> digraph.

In Standard French the digraph "ou" (or "oû") /u/ is never long, or at
least its length is not significant (it was significant in Old French and
remains significant in some regional variants of French such as Acadian
French).

If we need the single dot, it is not really to represent the /u/ vowel but
the /w/ semi-vowel. For example, compare:
- "mouette" /mwɛt/ with a single phonetic syllable, we would note the dot
below the "ou" digraph to indicate this is a half-vowel /w/
- "brouette" /bʀu.ɛt/ with two phonetic syllables, where we would note the
single line below the "ou" digraph to indicate this is a /u/ vowel
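
As a rough sketch of how the two annotations above could be encoded in
plain text, here is a small Python example. It is my own illustration, not
an established convention: the line below the digraph is encoded with
U+035F COMBINING DOUBLE MACRON BELOW placed between the two letters, and,
since I am not aware of a double-width counterpart of the dot below, the
dot is assumed to be U+0323 COMBINING DOT BELOW attached to the first
letter only. The helper name `mark_digraph` is hypothetical.

```python
import unicodedata

DOT_BELOW = "\u0323"            # COMBINING DOT BELOW (assumed stand-in for a centered dot)
DOUBLE_MACRON_BELOW = "\u035F"  # COMBINING DOUBLE MACRON BELOW (spans two letters)

def mark_digraph(word, digraph, mark):
    """Insert `mark` after the first letter of the first `digraph` in `word`.

    A double diacritic such as U+035F is encoded between the two base
    letters; an ordinary mark placed there applies to the first letter only.
    """
    i = word.find(digraph)
    if len(digraph) != 2 or i < 0:
        return word  # nothing to annotate
    return word[:i] + digraph[0] + mark + digraph[1] + word[i + 2:]

semivowel = mark_digraph("mouette", "ou", DOT_BELOW)             # /w/
vowel = mark_digraph("brouette", "ou", DOUBLE_MACRON_BELOW)      # /u/

for s in (semivowel, vowel):
    # List the combining marks actually present in each annotated word.
    print(s, [f"U+{ord(c):04X}" for c in s if unicodedata.combining(c)])
```

Whether the double macron is drawn under both letters depends on the font
and the renderer, so the displayed result may differ from the intent.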

And for such use, the distinction between /u/ and /w/ is definitely NOT
"rare" in French (for each one of its variants) !


There's another French digraph using the /w/ half-vowel: "oi" (or "oî" in a
few words like "boîte", where the circumflex denotes an old etymological "s"
that is now completely mute; such a circumflex over "i" or "u" is now
optional in most words, except where it orthographically disambiguates
homophones, such as "du" vs. "dû"). The "oi" digraph is now read as the
diphthong /wa/ in Standard French (but as the diphthong /wɛ/ or just the
vowel /ɛ/ in Old French or in some regional variants). Where it was used as
a verbal desinence, it is now consistently written "ai" and pronounced as a
single vowel /ɛ/ without a diphthong.

In all cases, Standard French no longer has any phonetic distinction of
vowel length; it also no longer has any distinction of stress or tone.
And orthographically, vowel length, stress, and tone are never written
(there's no standard diacritic for them).

So the spoken language can freely alter these phonetic features without
changing the meaning (e.g. in poetry or songs, where this gives much more
freedom to authors or performers), except for emphasis purposes or, in
extremely rare cases, for the spoken language only.
In the written form, if needed, these variations are marked using
typographic styles (you could use bold or underlining for emphasis), or by
using separators or punctuation. To distinguish a single digraph from a
pair of vowels, the diaeresis diacritic is used orthographically over one
of the two vowels: traditionally the diaeresis ("tréma" in French) was
carried by the second vowel (but it could be over the first vowel if the
second vowel already had another diacritic such as an acute accent), and in
the reformed orthography it is the first vowel that consistently carries
the diaeresis.
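
To check which vowel carries the diaeresis in a given spelling, one can
decompose the word (NFD) and look at the base letter preceding U+0308
COMBINING DIAERESIS. The sketch below is only an illustration; the function
name `diaeresis_carriers` is hypothetical, and the sample pair
"aiguë"/"aigüe" (traditional vs. reformed spelling) is my own choice of
example.

```python
import unicodedata

DIAERESIS = "\u0308"  # COMBINING DIAERESIS

def diaeresis_carriers(word):
    """Return (letter_index, base_letter) for each letter carrying a diaeresis.

    The word is decomposed with NFD so that precomposed letters such as
    "ë" become a base letter followed by U+0308.
    """
    carriers = []
    base_index = -1
    base = ""
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch) == 0:
            # A new base letter starts here.
            base_index += 1
            base = ch
        elif ch == DIAERESIS:
            carriers.append((base_index, base))
    return carriers

traditional = diaeresis_carriers("aigu\u00EB")  # "aiguë"
reformed = diaeresis_carriers("aig\u00FCe")     # "aigüe"
print(traditional, reformed)
```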


Another possible usage of the "dot below" diacritic in French text with the
German Duden notation would be to denote the "unaspirated h" (which is
completely mute in all contexts, allows liaisons and contractions, and
sometimes even allows diphthongs to appear with a preceding vowel in fast
speech by merging two syllables, e.g. "cohabiter" / is possibly muted to
/ in fast speech).

The "low line" (or "low macron"?) below "h" with the German Duden
notation would denote the "aspirated" h, which is now *also* mute in
Standard French (except that it prohibits all phonetic "liaisons" with the
final consonants of a previous word, as well as contractions of a previous
article or preposition). The "aspirated h" may however be emphatically
pronounced /h/ (and this is still the norm in regional variants of French).
But traditionally, French dictionaries denote the "aspirated h" (which
only exists at the start of words) with a leading asterisk symbol (or with a
similar symbol such as a bullet) before the orthographic word entry; very
few use the low line (or low macron) diacritic, which is not visible enough.

For such use, the dot below would definitely not be "rare" (even if it
won't be in the middle of a digraph but below a single mute "h" letter).
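
For the single letter "h", both marks happen to have precomposed forms in
Unicode, so the annotated letter survives NFC normalization as one code
point. A minimal sketch, assuming that the "low line" is encoded as U+0331
COMBINING MACRON BELOW (that choice is mine, U+0332 COMBINING LOW LINE
would be another candidate); the helper `mark_initial_h` and the sample
words "habiter" and "haricot" are only illustrations.

```python
import unicodedata

DOT_BELOW = "\u0323"     # COMBINING DOT BELOW: assumed mark for the mute "h"
MACRON_BELOW = "\u0331"  # COMBINING MACRON BELOW: assumed mark for the "aspirated h"

def mark_initial_h(word, aspirated):
    """Attach the chosen mark to a word-initial "h" and normalize to NFC."""
    if not word or word[0] not in "hH":
        return word  # no initial "h" to annotate
    mark = MACRON_BELOW if aspirated else DOT_BELOW
    return unicodedata.normalize("NFC", word[0] + mark + word[1:])

mute = mark_initial_h("habiter", aspirated=False)
aspirated = mark_initial_h("haricot", aspirated=True)

# Show the annotated word and the name of its first character.
print(mute, unicodedata.name(mute[0]))
print(aspirated, unicodedata.name(aspirated[0]))
```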


You could also note with these diacritics the main difference in the
pronunciation of "ch":
- /k/, traditionally for most words with Greek etymology (such as
"choriste"), would use a dot below the mute "h", or
- /ʃ/, for other words (like "machine" in French and English, but compare
with the Latin expression "Deus ex machina", where "machina" is still
pronounced with /k/ as in Greek!), would use a line below the whole digraph
(in both cases, it does not denote tone, stress, or length/gemination of
the consonant).

Length/gemination of consonants in Standard French is no longer
significant orally; it just persists orthographically (or in some regional
variants), and speakers can freely alter it if they want for marking
emphasis; in written form, they would use typographical styles (such as
bold, underlining, or capitals in the middle of words, or bigger font
sizes) or would insert additional separators such as hyphens or middle dots.

Some words in French still hesitate between the two main pronunciations
/k/ and /ʃ/ of "ch" (e.g. "chorizo" borrowed directly from Spanish into
French, where it means the same kind of dried hot-spiced sausage).

A few words borrowed from English are also rarely pronounced with /tʃ/ but
more often with /ʃ/, notably those words that English itself borrowed from
French with minor orthographic changes before they came back again to
French. /tʃ/ is just for some "purists" who want to maintain the English
distinction, but for most speakers it is not incorrect, and even
recommended, to revert it to the standard French /ʃ/ (including for English
personal names such as "Prince Charles", or for English toponyms like
"Chicago"). If an author wanted to annotate a French text to mark where
"ch" should be pronounced /tʃ/, he could insert an additional "t" letter
between parentheses or in superscript, or another custom diacritic over the
"ch" digraph or one of its letters (there's no orthographic standard for such

The rare French words where this phonetic mutation of /tʃ/ to /ʃ/ is
prohibited are written explicitly with the trigraph "tch" (e.g. "Tchad",
the African country or lake; or the interjection "Tchin !", to contrast it
phonetically with "Chine", the East-Asian country, or "chine", a verbal
form of "chiner", both never pronounced with /tʃ/ in French).