Kenneth Whistler wrote:
> The raw figures are posted below.
Thanks.
> These constitute the lumped sums from both the MUMS Books database and
> the JACKPHY database, containing 12,421,528 instances of characters with
> diacritics, out of a total of 1,492,948,727 Latin characters.
BTW, the JACKPHY database (IIRC) is bibliographic information (in Latin
alphabet transliteration) for books written in non-Latin scripts.
So it represents "non-native" uses of diacritics.
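For scale, the quoted totals work out to a bit under 1% of Latin characters carrying a diacritic. A quick sketch of the arithmetic (the variable names are mine; the two counts are the ones quoted above):

```python
# Counts quoted above for the combined MUMS Books and JACKPHY databases.
with_diacritics = 12_421_528
total_latin = 1_492_948_727

# Fraction of Latin characters that carry a diacritic.
share = with_diacritics / total_latin

print(f"{share:.2%} of Latin characters carry a diacritic")
print(f"roughly 1 in {round(1 / share)}")
```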
An interesting point about ANSEL is that it treats u-horn and o-horn
as unique letters like eth and ae, rather than as u and o with a
COMBINING HORN as Unicode does. Since HORN is not applied to any
other letters, I wonder why it was analyzed out by the Unicode
designers (it saved only 3 codepoints).
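For anyone who wants to check how the horn letters come out in practice, here is a minimal sketch using Python's standard unicodedata module. As far as I can tell, the character database carries the four horn letters as precomposed characters too, each with a canonical decomposition to a base letter plus COMBINING HORN (U+031B). The names HORN_LETTERS and horn_decomposition are made up for this example.

```python
import unicodedata

# The four precomposed horn letters: O/o WITH HORN and U/u WITH HORN.
HORN_LETTERS = "\u01a0\u01a1\u01af\u01b0"

def horn_decomposition(ch):
    """Return the character's name and the names of its NFD components."""
    decomposed = unicodedata.normalize("NFD", ch)
    return unicodedata.name(ch), [unicodedata.name(c) for c in decomposed]

for letter in HORN_LETTERS:
    name, parts = horn_decomposition(letter)
    print(f"U+{ord(letter):04X} {name} -> {' + '.join(parts)}")
```

Each of the four should decompose to its plain base letter followed by the same single combining mark, which is the "u and o with a COMBINING HORN" analysis in question.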
--
Schlingt dreifach einen Kreis vom dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:59 EDT