From: Kent Karlsson (kent.karlsson14@comhem.se)
Date: Sun Jan 06 2008 - 06:06:24 CST
Andreas Stötzner wrote:
...
> medieval/early-printing ligatures and abbreviations. Extensive research
> has been done on it in the past few years, see e.g.
> http://gandalf.aksis.uib.no/mufi/ ; see the Latin-Ext.-D block of the
> UCS.
The document you referred to there in turn refers to
http://www.mufi.info/specs/MUFI-Alphabetic-2-0.pdf.
That document unfortunately does not seem up to speed with Unicode,
even though it (inaccurately) states "Compliant with the Unicode Standard
version 5.0".
For instance, it allocates to the PUA characters that are already encoded
in non-PUA, albeit as combining sequences rather than single characters.
Just to mention a few (there are MANY more in that document):
LATIN SMALL LETTER A WITH WITH OGONEK AND ACUTE (ignoring the double WITH...)
can and should be represented by
0105;LATIN SMALL LETTER A WITH OGONEK followed by 0301;COMBINING ACUTE ACCENT
(or a sequence canonically equivalent to that).
LATIN SMALL LETTER A WITH DOUBLE ACUTE can and should be represented by
0061;LATIN SMALL LETTER A followed by 030B;COMBINING DOUBLE ACUTE ACCENT.
(etc. for quite a few more accented letters).
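
As a rough illustration of the two examples above, the following sketch uses Python's standard unicodedata module to check that alternative spellings are canonically equivalent to the sequences given, and that normalization keeps them as sequences. The helper name canonically_equivalent and the variable names are mine, chosen for illustration only.

```python
import unicodedata

def canonically_equivalent(a: str, b: str) -> bool:
    # Hypothetical helper: two strings are canonically equivalent
    # when their NFD forms are identical.
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# a with ogonek and acute: U+0105 followed by U+0301
a_ogonek_acute = "\u0105\u0301"
# a with double acute: U+0061 followed by U+030B
a_double_acute = "\u0061\u030B"

# Two other spellings of "a with ogonek and acute"
alt1 = "\u0061\u0328\u0301"   # a + combining ogonek + combining acute
alt2 = "\u00E1\u0328"         # a with acute + combining ogonek

print(canonically_equivalent(a_ogonek_acute, alt1))  # True
print(canonically_equivalent(a_ogonek_acute, alt2))  # True

# NFC leaves the double-acute sequence untouched; it stays a
# two-code-point combining sequence.
print(unicodedata.normalize("NFC", a_double_acute) == a_double_acute)  # True
```

Neither sequence needs a PUA code point; a font can render each as a single glyph.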
In some cases there aren't even combining characters involved: e.g.
LATIN SMALL LIGATURE F O WITH DIAERESIS
This should be represented simply by LATIN SMALL LETTER F followed
by LATIN SMALL LETTER O WITH DIAERESIS (or a sequence canonically
equivalent to that). The font should form the ligature from that
sequence, and should do so by default if the font is suitable for
texts with "fö" in them and there would otherwise be a typographic
overlap (or extra spacing) without the ligature. (Note that other ligatures
aren't like that, but are akin to (e.g.) the oe ligature, like for
instance the aa ligature. These should be represented by characters
of their own.)
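
A small sketch of the distinction, again with Python's unicodedata module: the "fö" case is plain text made of ordinary characters (ligation is left to the font), while the oe ligature is a character of its own with no decomposition. The helper is_private_use is a hypothetical name used only for this illustration.

```python
import unicodedata

def is_private_use(ch: str) -> bool:
    # Hypothetical helper: general category "Co" marks private-use code points.
    return unicodedata.category(ch) == "Co"

# "fö" as LATIN SMALL LETTER F + LATIN SMALL LETTER O WITH DIAERESIS
fo_diaeresis = "f\u00F6"
# A canonically equivalent spelling: f + o + COMBINING DIAERESIS
decomposed = "fo\u0308"

print(unicodedata.normalize("NFC", decomposed) == fo_diaeresis)  # True
print(any(is_private_use(ch) for ch in fo_diaeresis))            # False

# The oe ligature is an encoded character in its own right and has
# no decomposition into "o" + "e".
oe = "\u0153"
print(unicodedata.name(oe))                  # LATIN SMALL LIGATURE OE
print(repr(unicodedata.decomposition(oe)))   # ''
```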
They also change the formal name of some characters (for various reasons),
e.g. LATIN ABBREVIATION SIGN SMALL ET for TIRONIAN SIGN ET, which is not
helpful. Even if the name is "wrong" (which does not seem to be the case
here), the formal name is used for reference and is not to be changed.
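
To illustrate why the formal name matters as a reference key, here is a minimal sketch using Python's unicodedata module, which looks characters up by their formal names. The helper is_formal_name is a hypothetical name of mine.

```python
import unicodedata

def is_formal_name(name: str) -> bool:
    # Hypothetical helper: True if `name` resolves to a character
    # in the Unicode data shipped with this Python.
    try:
        unicodedata.lookup(name)
        return True
    except KeyError:
        return False

tironian_et = "\u204A"
print(unicodedata.name(tironian_et))                       # TIRONIAN SIGN ET
print(is_formal_name("TIRONIAN SIGN ET"))                  # True
print(is_formal_name("LATIN ABBREVIATION SIGN SMALL ET"))  # False
```

Software that identifies characters by name will only find the character under its formal name, so a renamed entry cannot be matched against the standard.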
/kent k