Re: Missing capital H from Unicode range (see 1E96)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon Aug 15 2005 - 13:31:06 CDT

    From: "Andreas Prilop" <nhtcapri@rrzn-user.uni-hannover.de>
    > On Wed, 6 Jul 2005, Jukka K. Korpela wrote:
    >
    >> As far as I know, the only documented usage for 1E96 is in some
    >> transliteration systems for Semitic languages, such as transliteration of
    >> Arabic according to ISO 233. Although Arabic does not make case
    >> distinction, it is customary and normal to use mixed case in
    >> transliterated words and texts, using e.g. a capital letter at the start
    >> of a proper noun. Thus, I too find it strange that the corresponding
    >> capital letter has no code position in Unicode and that 1E96 has no
    >> uppercase mapping.
    >
    > The ISO and DIN transliteration of Arabic U+062E is U+1E2B
    > "h with breve below", which has an upper-case form U+1E2A.
    >
    > U+1E96 occurs in Anglo-American transliterations - but only in
    > combinations like "kh with line below", which has the upper-case
    > form "Kh with line below". Therefore no "capital H with line below".

    Correct. For example, I have seen all of these transliterations of this
    common Arabic first name:

    (1) Pedantic variants using modified base H:

    - Ḫaled, ḪALED (the most correct transcription, using capital H with breve
    below at U+1E2A, and there's also a lowercase form at U+1E2B)

    - Haled, HALED (an ASCII-only variant of the previous where the diacritic is
    lost)

    (2) Variants using base K only:

    - Ḳaled, ḲALED (variant with U+1E32, capital K with dot below, seen
    sometimes)

    - Ḳaled, ḲALED (canonical equivalent of the previous, using a decomposition
    with combining dot below at U+0323; seen more often because it displays
    correctly with the Windows core fonts, without requiring Arial Unicode MS
    from a licensed copy of Office).

    [[*** Notice about the previous line: Windows incorrectly renders the dot
    below the FOLLOWING letter (A) instead of the previous one (K) if you use
    the "Verdana" font instead of "Arial" or "Times New Roman". It really looks
    like a BUG in the latest version of the Verdana font for Windows XP, where
    U+0323 seems to be handled as if it were mapped to an RTL dot below (the
    Hebrew meteg?). So test this message with your email reader and select an
    alternate font if this happens to you (my general preference for email or
    the web is Verdana). ***]]

    - Ḵaled, ḴALED (acceptable transliteration, with U+1E34, capital K with line
    below, sometimes seen; there's also a lowercase k with macron below)

    - Kaled, KALED (same, but the macron or dot below is not marked; this
    spelling is more common in practice, even though many Europeans forget to
    voice this K correctly, ignoring its original sound).
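
To make the canonical equivalence mentioned in (2) concrete, here is a minimal Python sketch using the standard `unicodedata` module. The variable names are only illustrative; the check is that the precomposed U+1E32 and the sequence <K, U+0323> normalize to one another.

```python
import unicodedata

precomposed = "\u1E32aled"   # U+1E32 LATIN CAPITAL LETTER K WITH DOT BELOW
decomposed = "K\u0323aled"   # K followed by U+0323 COMBINING DOT BELOW

# A plain comparison works code point by code point, so the strings differ.
same_raw = precomposed == decomposed

# Normalizing both to the same form shows they are canonically equivalent.
same_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
same_nfd = unicodedata.normalize("NFD", precomposed) == decomposed

print(same_raw, same_nfc, same_nfd)
```

Which of the two forms displays correctly still depends on the font and renderer, as noted above; normalization only affects the stored code points.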

    (3) Variants using digraph with H after K:

    - Kẖaled, KH̱ALED (incorrect transliteration, using small h with macron
    below at U+1E96, impossible to put in uppercase without decomposition as <H>
    and U+0331).

    - Kḥaled, KḤALED (acceptable transliteration, using small or capital H with
    dot below at U+1E25 or U+1E24)

    - Khaled, KHALED (ASCII only; in French or English, this is the typical
    spelling of this first name). This spelling assumes that the reader knows
    the sound associated with the digraph, or can at least distinguish it from
    K, H, or R. Otherwise the following variants may be used:

    (4) Other transcriptions:

    - Chaled, CHALED (sometimes seen in German, where CH is pronounced as a
    sharp R).

    - Raled, RALED (sometimes seen in France, where R can be pronounced sharp).

    - H'aled, H'ALED (uses apostrophe after H to denote short unvoiced H as a
    good approximation).
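
As an illustration of the casing point made in (3), the following Python sketch compares the uppercase mappings of U+1E96 and U+1E2B. It assumes a Python 3 interpreter, whose `str.upper()` applies the full case mappings from the Unicode data files; the helper `codepoints` is a hypothetical name introduced here only for display.

```python
import unicodedata

h_line_below = "\u1E96"    # LATIN SMALL LETTER H WITH LINE BELOW
h_breve_below = "\u1E2B"   # LATIN SMALL LETTER H WITH BREVE BELOW

def codepoints(s):
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

upper_line = h_line_below.upper()     # no single capital code point exists
upper_breve = h_breve_below.upper()   # maps to the precomposed capital

print(codepoints(upper_line))
print(codepoints(upper_breve))

# NFC has nothing to compose <H, U+0331> into, so it stays two code points.
nfc = unicodedata.normalize("NFC", upper_line)
print(len(nfc))
```

The first mapping yields the sequence <H, U+0331> rather than one character, which is the decomposition mentioned for KH̱ALED, while U+1E2B maps directly to U+1E2A.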



    This archive was generated by hypermail 2.1.5 : Mon Aug 15 2005 - 13:33:16 CDT