RE: Tentative Definition of Casefolding

From: Keutgen, Walter (walter.keutgen@be.unisys.com)
Date: Wed Jun 14 2006 - 11:11:57 CDT

    Richard, Jeroen,

    _Title Casing_

    In continental European languages, title casing, if it is possible at all, requires knowledge of the grammatical status of each word in the title.

    Words in a surname that begin with a lower-case letter would not have that letter capitalized, except when it is the first letter of a title that forms a sentence of its own. You mentioned "ffrench"; there are also all the Dutch names with "van" and similar words, such as "van der Biest". Belgian Dutch names include separate lower-case words only if these words are nobiliary particles, as in French and German ("van", "de", "von").

    In Dutch, a few names begin with an apostrophe followed by a lower case letter. The Dutch and Flemish obviously consider that the _apostrophe_ is capitalized. Examples: "'t Kindt" (Belgian surname), "'s-Gravenhage" (better known as "Den Haag" = "The Hague", capital of the Netherlands), "'s-Hertogenbosch" (= "Duke Town", better known as "Den Bosch", province capital of North Brabant)
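
    A minimal Python sketch of the two rules above, under stated assumptions: the particle list is hypothetical and holds only the words quoted in this message, and the name title_case_name is made up for illustration. It contrasts a "dumb" routine (Python's built-in str.title) with one that leaves lower-case particles and apostrophe words as typed.

```python
# Hypothetical particle list: only the words quoted in this message.
PARTICLES = {"van", "der", "de", "von"}

def title_case_name(text):
    """Capitalize words, but keep particles and apostrophe words as typed."""
    out = []
    for i, word in enumerate(text.split(" ")):
        if word.startswith("'"):
            # "'t Kindt", "'s-Gravenhage": the apostrophe counts as the capital.
            out.append(word)
        elif word in PARTICLES and i > 0:
            # A particle typed in lower case inside the name stays as typed.
            out.append(word)
        else:
            # Everything else, including a particle that opens the title.
            out.append(word[:1].upper() + word[1:])
    return " ".join(out)

# What a context-free routine does:
print("van der biest".title())             # Van Der Biest
print("'s-gravenhage".title())             # 'S-Gravenhage

# What the sketch does:
print(title_case_name("jan van der biest"))  # Jan van der Biest
print(title_case_name("van der Biest"))      # Van der Biest
print(title_case_name("'t Kindt"))           # 't Kindt
```

    Note that the sketch still cannot tell a nobiliary "van" from an ordinary one; it only avoids touching what was typed in lower case.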

    Is it really the aim of Unicode to cover all this for the benefit of some universal routine? I doubt it.

    Meanwhile we get maps published on the web where dumb title casing is applied to our street names, whilst it would be so simple to keep them as typed in. But this seems to be linked to the fact that we will need to get accustomed to using miles, inches and °F again instead of km, cm and °C.
    --------------

    _Upper Casing_

    More interesting is upper casing. In real titles, the nobiliary particles would be capitalized; in directories, one could opt to leave them as they are, to show the noble status of the person.

    What about "ff" "Mc" and such?
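
    The directory option could be sketched in Python as follows. This is only an illustration: the particle list and the name directory_upper are hypothetical, and nothing here answers the "ff" question, since a plain upper-casing no longer shows that the name was written with a lower-case initial.

```python
# Hypothetical particle list: only the words quoted in this message.
PARTICLES = {"van", "der", "de", "von"}

def directory_upper(name):
    """Upper-case a name, leaving lower-case particles as typed."""
    return " ".join(
        word if word in PARTICLES else word.upper()
        for word in name.split(" ")
    )

print(directory_upper("van der Biest"))  # van der BIEST
print("van der Biest".upper())           # VAN DER BIEST
print("ffrench".upper())                 # FFRENCH
```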
    --------------

    _Dutch "IJ"_

    This title case is not linked to proper names. The word "ijzer" (= iron) is written "IJzer" at the beginning of a sentence, see: http://nl.wikipedia.org/wiki/IJzer_%28element%29. I regret that Wikipedia uses title case for the entries.
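
    As a rough Python illustration of the rule (the function name capitalize_dutch is made up, and the sketch assumes that a leading "ij" is always the digraph), compared with what a generic routine produces:

```python
def capitalize_dutch(word):
    """Capitalize a Dutch word, treating a leading 'ij' as one letter."""
    if word[:2].lower() == "ij":
        return "IJ" + word[2:]
    return word[:1].upper() + word[1:]

print("ijzer".capitalize())        # Ijzer  (generic routine)
print(capitalize_dutch("ijzer"))   # IJzer
print(capitalize_dutch("ijssel"))  # IJssel
```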

    If I remember correctly what I heard from a now-retired Dutch colleague and read elsewhere, before the spelling reform of 1946-1947 the ligature "ij" [ɛi] was a letter of its own, between "i" and "j" in the collation sequence. The ligatures exist in Unicode: U+0132 (IJ) and U+0133 (ij). Like the French "Œ" and "œ", they were not present on the typewriter. The decision was taken to write "ij" and "IJ" henceforth. The "IJ" instead of "Ij" in title case could be the result of a victory of the traditionalists (like the ending "isch" instead of "is").

    Of course, in fine printing with proportionally spaced fonts, "ij" will be ligatured anyway in any language, but for "IJ" I would expect it only in Dutch, not in Flemish. The road signs are the same: in Flanders "IJzer", in the Netherlands "IJssel" (not the same rivers). The fonts used are usually of the Swiss type (upright, proportionally spaced, without serifs), and the Dutch "IJ" looks nicely like a "U" with a gap at the bottom of the left vertical stroke.

    Note that in Flanders geographical names underwent the entire reform, whereas in the Netherlands they did not ("Den Bosch" instead of "Den Bos"). Surnames had their ligatures split. Note: when the Dutch and the Flemish use each other's geographical names, they keep them in their original spelling.
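
    For what it is worth, the two code points U+0132 and U+0133 behave as a single cased letter in software. A small Python check using only the standard unicodedata module (the variable names are mine):

```python
import unicodedata

IJ_CAPITAL = "\u0132"
IJ_SMALL = "\u0133"

print(unicodedata.name(IJ_CAPITAL))    # LATIN CAPITAL LIGATURE IJ
print(unicodedata.name(IJ_SMALL))      # LATIN SMALL LIGATURE IJ

# The ligature is one letter, so generic casing gets the Dutch rule for free.
word = IJ_SMALL + "zer"
print(IJ_SMALL.upper() == IJ_CAPITAL)        # True
print(word.title() == IJ_CAPITAL + "zer")    # True

# Compatibility normalization splits it back into two plain letters.
print(unicodedata.normalize("NFKC", word.title()))  # IJzer
```

    So text typed with the ligature title-cases correctly without any language-specific rule; the difficulty discussed here only arises because nearly everyone types two separate letters.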

    When writing in another language I would not apply this Dutch title case rule except for surnames, at least where their spelling has a legal value.
    --------------

    _Slang_ (from another contribution)

    Of course Unicode can be used to write slang; so could the typewriter :-).

    Best regards

    Walter Keutgen
    Unisys Belgium nv-sa

    -----Original Message-----
    From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of Jeroen Ruigrok/asmodai
    Sent: Monday, 12 June 2006 07:48
    To: Richard Wordingham
    Cc: unicode@unicode.org
    Subject: Re: Tentative Definition of Casefolding

    -On [20060611 22:38], Richard Wordingham (richard.wordingham@ntlworld.com) wrote:
    >'ffrench', 'ffinch' and 'ffife' are all English surnames. Just google for
    >them! I didn't mean to cause confusion. I'm not aware of this phenomenon
    >in any other language, though surnames beginning with an uncapitalised
    >grammatical word are common enough.

    In Dutch, names starting with ij are written IJ and not Ij. Say a city like
    Capelle aan den IJssel, or a name like IJzersmid, et cetera.
    Not sure if this also causes problems for what you are dissecting.

    Realistically we should be using the special glyph, but almost everyone I know
    doesn't even realise we have a specialised glyph for this.

    -- 
    Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
    イェルーン ラウフロック ヴァン デル ウェルヴェン
    http://www.in-nomine.org/
    Only in sleep can one find salvation that resembles Death...
    

