Re: i with macron over an e - Do U+0365 and U+2071 lose their dot when accented like U+0069?

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Fri Feb 22 2008 - 18:51:48 CST


    Until 5.1.0 is published, the most up-to-date list of characters with
    the Soft_Dotted property is
    http://www.unicode.org/Public/5.0.0/ucd/PropList.txt

    In it you will find 45 characters listed that have the soft-dotted
    behavior, but they do not include the combining i. The reason for that
    is that in Unicode, you can't apply a diacritic to a diacritic; you can
    only apply a diacritic to a sequence, that is, to a base character
    together with any marks already applied to it.
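
    A minimal sketch of how one might check a code point against the
    Soft_Dotted list, assuming the usual PropList.txt line layout
    ("range ; property # comment"). The SAMPLE text is a short
    illustrative excerpt, not a copy of the file, and the function name
    is hypothetical.

```python
# Illustrative lines in the PropList.txt layout; NOT the full file.
SAMPLE = """\
# Illustrative excerpt only
0041..0046    ; Hex_Digit # L&   [6] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER F
0069..006A    ; Soft_Dotted # L&   [2] LATIN SMALL LETTER I..LATIN SMALL LETTER J
012F          ; Soft_Dotted # L&       LATIN SMALL LETTER I WITH OGONEK
"""


def code_points_with_property(text, prop):
    """Collect the code points that `text` assigns to property `prop`."""
    found = set()
    for line in text.splitlines():
        # Drop the trailing comment, then skip blank or comment-only lines.
        data = line.split("#", 1)[0].strip()
        if not data:
            continue
        rng, name = (part.strip() for part in data.split(";", 1))
        if name != prop:
            continue
        # A field is either a single code point or a "start..end" range.
        start, _, end = rng.partition("..")
        lo = int(start, 16)
        hi = int(end, 16) if end else lo
        found.update(range(lo, hi + 1))
    return found


soft_dotted = code_points_with_property(SAMPLE, "Soft_Dotted")
print(0x0069 in soft_dotted)  # U+0069 LATIN SMALL LETTER I
print(0x0365 in soft_dotted)  # U+0365 COMBINING LATIN SMALL LETTER I
```

    Run against the real file instead of SAMPLE, the same lookup would
    show whether U+0365 is listed.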

    Diacritics may then interact typographically, but ordinarily, if you
    apply a macron to a character with a diacritic the macron is not applied
    to that diacritic. So, it would seem that if an 'i with macron'
    diacritic is needed, that is something that is not encoded yet.

    A macron applied to the sequence <e, combining dotless i> should be
    rendered as if it applied to the whole. If what the notation tries to
    express is the modification of an 'e' by an i-macron, then the
    combining character that is missing would be the i-macron.
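
    As a rough illustration of the encoding model described here, the
    sketch below inspects the sequence U+0065 U+0365 U+0304 from the
    original question with Python's standard unicodedata module. It only
    shows how the characters are classified and normalized; it says
    nothing about how a given font or renderer will draw them.

```python
import unicodedata

# The sequence from the question: e, combining small letter i, combining macron.
seq = "\u0065\u0365\u0304"

for ch in seq:
    # combining() returns the canonical combining class (0 for a base letter).
    print(f"U+{ord(ch):04X}  ccc={unicodedata.combining(ch):3d}  "
          f"{unicodedata.name(ch)}")

# Both marks share the same combining class, so normalization neither
# reorders them nor composes the macron with the 'e' across the
# intervening combining i.
nfc = unicodedata.normalize("NFC", seq)
unchanged = nfc == seq

# For contrast: with no mark in between, e + macron composes to U+0113.
composed = unicodedata.normalize("NFC", "\u0065\u0304")
```

    In the character data both marks are simply nonspacing marks that
    follow the base 'e'; nothing in the sequence records that the macron
    is meant to modify the combining i rather than the whole.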

    As to whether that character should be encoded, I'm neutral - you would
    need to demonstrate that its use matches the type of precedent for which
    the other superscripted combining letters were encoded.

    It appears that phonetic notations are about the most demanding when it
    comes to character encoding. Even mathematics pales in comparison ;-)

    A./

    On 2/22/2008 3:45 PM, Karl Pentzlin wrote:
    > On p. 228 of the printed TUS 5.0 it is stated:
    > "Diacritics on i and j. A dotted (normal) i or j followed by a
    > nonspacing mark above loses the dot in rendering."
    >
    > As this paragraph does not explicitly refer to
    > U+0069 LATIN SMALL LETTER I
    > is the conclusion correct that it applies to all Latin small i's,
    > especially to
    > U+0365 COMBINING LATIN SMALL LETTER I
    > U+2071 SUPERSCRIPT LATIN SMALL LETTER I
    >
    > In other words, is it correct to encode the entity marked in red in
    > the attached scan (showing a dotless i with macron over an e) as
    > U+0065 U+0365 U+0304?
    >
    > Or, if such things should be representable in Unicode, is there
    > a COMBINING LATIN SMALL LETTER DOTLESS I to be proposed?
    >
    > (The scan is from:
    > Hotzenköcherle, Rudolf
    > Einführung in den Sprachatlas der Deutschen Schweiz
    > Einführungsband A
    > Bern 1962, p.53)
    >
    > - Karl Pentzlin
    >


