Re: Orthographies using ZWNJ (was: Displaying control characters)

From: Asmus Freytag
Date: Mon Jul 23 2007 - 22:06:41 CDT

    On 7/23/2007 7:20 PM, Philippe Verdy wrote:
    > Karl Pentzlin wrote:
    >> Am Sonntag, 22. Juli 2007 um 20:47 schrieb Philippe Verdy:
    >> PV> SHY is a perfect example of an explicit syllable break.
    >> No. There may be a significant correlation between syllable breaks and
    >> the places where SHY is applicable in several orthographies, but it is
    >> not 100%, e.g. for the pre-reform German spelling (e.g. for "Liste"
    >> (list) the syllables are "Lis/te" but using SHY you had to spell
    >> "Li<SHY>ste").
    > This does not make any difference,
    This makes all the difference in the definition of the character in
    > for any Unicode application, if some
    > language has such tricks that it allows hyphenation on places that are not
    > phonetic/morphologic syllable breaks; they are only concerned by the place
    > where such hyphenation occurs.
    This sounds very much like you are trying to redefine reality here.
    Unicode is about mapping reality to character coding.

    > For Unicode applications like renderers,
    > hyphenation/line breaking candidate places are the only thing important, and
    > they will treat SHY as an explicit syllable break in all cases (even if it
    > is misplaced by the author of the text), exactly like they treat SPACE as a
    > word separator.
    A SHY is not a syllable delimiter. Unicode applications that treat it as
    one do so without backing from the standard.
    > Unicode does not have to handle spelling errors.
    Following standard German orthography is not a spelling error. The
    particular example has changed in the recent reform, but that's actually
    sad news in a way: it's a caving in to badly written software. Just like
    the Spanish retreat from their traditional ordering.

    The important point that Karl makes is that syllable breaks and
    allowable word-break points are not identical.

    In addition to the situation where they occur in different locations,
    there are syllable breaks that should not be used as line breaks,
    whether for aesthetic or other reasons.

    1) it is often undesirable to separate single-letter syllables from the
    rest of the word, as in "i-dealistisch". The preference for avoiding
    such cases may not always rise to a formal orthographic level, but some
    cases are also formally proscribed.

    2) there are some places where a word separation will lead the reader
    astray. For example, the word "In-stinkt" could be split as indicated,
    but the word "Urinstinkt" cannot be split "Urin-stinkt" ... ;-) (The
    second half of the German word would be misread as the present tense of
    a verb for which the English equivalent is very similar in spelling, and
    the first part is equally smelly in both languages, which should allow
    English readers to get the problem.)
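
To make the two points above concrete, here is a minimal Python sketch of how a hyphenator might filter syllable breaks down to acceptable hyphenation points before inserting SHY. Everything in it is an assumption for illustration: the function names `allowed_breaks` and `insert_shy` are hypothetical, the syllable offsets are supplied by hand rather than computed, and the "forbidden" positions stand in for whatever exception list a real orthography would need.

```python
SHY = "\u00AD"  # SOFT HYPHEN

def allowed_breaks(word, syllable_breaks, forbidden=()):
    """Keep only the syllable-break offsets that are usable as hyphenation points.

    syllable_breaks: offsets into `word` where a syllable boundary falls
                     (assumed input, not computed here).
    forbidden:       offsets excluded because the split would mislead the reader.
    """
    allowed = []
    for pos in syllable_breaks:
        # Point 1: do not strand a single letter at either end of the word.
        if pos < 2 or len(word) - pos < 2:
            continue
        # Point 2: skip breaks known to produce a misleading split.
        if pos in forbidden:
            continue
        allowed.append(pos)
    return allowed

def insert_shy(word, positions):
    """Insert SHY at each of the given offsets."""
    pieces = []
    prev = 0
    for pos in sorted(positions):
        pieces.append(word[prev:pos])
        prev = pos
    pieces.append(word[prev:])
    return SHY.join(pieces)

# "Ur-in-stinkt": the break after "Urin" (offset 4) is excluded by hand.
print(allowed_breaks("Urinstinkt", [2, 4], forbidden={4}))
# "i-de-a-lis-tisch": the break after the initial "i" (offset 1) is dropped.
print(allowed_breaks("idealistisch", [1, 3, 4, 7]))
```

The sketch only shows that the set of SHY positions is derived from, but not equal to, the set of syllable breaks; it says nothing about how either set is obtained in practice.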
    >> PV> I saw this concern when replying to the message sent by Karl Pentzlin
    >> PV> speaking about the compound word "Schilfinsel" (i.e. "Schilf" +
    >> "Insel"
    >> PV> without a "fi" ligature), that he wants to encode as
    >> "Schilf<ZWNJ>insel",
    >> PV> where the absence of ligature is expected to really mark the internal
    >> PV> syllable break.
    >> The absence of ligature is plainly wrong at most of the syllable breaks
    >> (if ligatures are used at all). E.g., "offen" (open) requires the "ff"
    >> ligature although the syllable separation is of-fen. Thus, the spelling
    >> "of<ZWNJ>fen" would be an orthographic error (while "of<SHY>fen" would be
    >> correct).
    >> (Ligatures are not to be used at the part borders of compound words,
    >> with some exceptions, and for some grammatical suffixes, also with some
    >> exceptions.)
    > I also know all this. I already said that SHY did not prohibit ligatures
    > like in "...f<SHY>f...". (Maybe you have not read it).
    I think Karl read quite well - Philippe's initial posts were highly
    unclear, so giving an explicit example here is helpful. Syllable breaks
    and ligatures are unrelated.
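
For readers who want to try the examples from this thread, the following Python sketch expands the <SHY>, <ZWNJ> and <WJ> notation used in these messages into the actual characters and prints their Unicode names. The helper `expand` and the `MARKERS` table are hypothetical names introduced here; only the code points and the standard-library `unicodedata` module are given facts.

```python
import unicodedata

# Notation used in this thread mapped to the actual code points.
MARKERS = {
    "<SHY>": "\u00AD",   # SOFT HYPHEN
    "<ZWNJ>": "\u200C",  # ZERO WIDTH NON-JOINER
    "<WJ>": "\u2060",    # WORD JOINER
}

def expand(text):
    """Replace each <NAME> marker with the character it stands for."""
    for tag, ch in MARKERS.items():
        text = text.replace(tag, ch)
    return text

for tag, ch in MARKERS.items():
    # All three are format characters (general category Cf).
    print(tag, f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

hyphenation_hint = expand("of<SHY>fen")      # break opportunity, ligature still wanted
no_ligature = expand("Schilf<ZWNJ>insel")    # ligature suppressed at the compound boundary
```

The three characters share a general category but carry different meanings, which is why one cannot be substituted for another.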
    > Still nothing explained about my true question: since the beginning I am
    > interested in the effective difference of ZWNJ and WJ, and I still cannot
    > see any difference.
    > And in your case, "Schilf<ZWNJ>insel", I am still not convinced that ZWNJ is the
    > appropriate character to insert, given that this is a compound word where a
    > ligature should not be drawn between "f" and "i". I expect that ZWNJ will be
    > used even in the middle of a syllable (or even in the middle of a grapheme
    > cluster). And I am wondering if WJ would not be more correct.
    Rest assured, the WJ would be quite incorrect. The fact that you keep
    repeating this indicates that you did not read the standard or any of my
    other posts.
    > I cited SHY in the middle of this discussion because of its possible
    > interactions, but also because SHY does not prohibit a ligature (it has no
    > visible effect except when a line break effectively occurs in the middle of
    > a word where SHY is placed), unlike ZWNJ (and WJ?).
    > My first question remains. ZWNJ or WJ ?
    Never WJ.

    Neither SHY nor WJ "should" affect ligatures, although poor
    implementations might forget to filter either one so they end up
    preventing ligatures by causing the lookup to fail.
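
A toy illustration of that filtering step, in Python. This is a sketch under stated assumptions, not how any particular renderer works: real ligature formation happens on glyphs inside the font machinery, whereas this model substitutes the Unicode presentation-form characters, handles only two-letter ligatures, and assumes no line break is taken at the SHY. The names `shape`, `TRANSPARENT` and `LIGATURES` are hypothetical.

```python
SHY, ZWNJ, WJ = "\u00AD", "\u200C", "\u2060"

# Characters that must not influence ligature lookup.
TRANSPARENT = {SHY, WJ}

# Toy ligature table using presentation forms as stand-ins for glyphs.
LIGATURES = {"ff": "\uFB00", "fi": "\uFB01", "fl": "\uFB02"}

def shape(text):
    """Return a stand-in 'glyph string' for text set on a single line."""
    # Filter SHY and WJ first so they cannot make the pair lookup fail.
    filtered = [c for c in text if c not in TRANSPARENT]
    glyphs = []
    i = 0
    while i < len(filtered):
        pair = "".join(filtered[i:i + 2])
        if pair in LIGATURES:
            glyphs.append(LIGATURES[pair])
            i += 2
        else:
            # ZWNJ stays in the stream, so it separates the pair,
            # but it produces no visible glyph itself.
            if filtered[i] != ZWNJ:
                glyphs.append(filtered[i])
            i += 1
    return "".join(glyphs)

print(shape("of" + SHY + "fen") == shape("offen"))            # SHY does not block "ff"
print(shape("Schilf" + ZWNJ + "insel") == "Schilfinsel")      # ZWNJ blocks "fi"
```

An implementation that skipped the filtering line would look up "f" followed by SHY (or WJ), find no ligature, and so suppress it by accident, which is the failure described above.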

    We covered that already, no need to try to restart the issue.
