Re: mixed-script writing systems

From: Jim Allan
Date: Fri Nov 15 2002 - 15:59:11 EST

    Peter Constable posted on Wakhi:

    > So, the question is this: Should we say that this writing system is
    > completely Latin (keeping the norm that orthographic writing systems use a
    > single script) and apply the principle of unification -- across languages
    > but not across scripts -- to imply that we need to encode new characters,
    > Latin delta, Latin theta and Latin yeru? Or, do we say that this writing
    > system is only *mostly* Latin-based, and that it mixes in a few characters
    > from other scripts?

    There are quite a number of fonts available that present Latin letters,
    Greek letters and Cyrillic letters in matching styles.

    If the extra characters were encoded separately, would they be available
    in as many fonts? If not (and certainly very few Latin letter fonts
    encode all the Latin letters in Unicode), I would expect that those
    entering Wakhi text would continue to use the Greek and Cyrillic characters
    instead of the supposedly proper ones because of the greater number of
    fonts that could be used.

    There is little purpose in encoding Latin letter clones of Greek and
    Cyrillic characters if in practice the Greek and Cyrillic originals will
    continue to be the ones normally used.

    Of course, Ezh and even Schwa are missing from most Latin letter fonts,
    and Dze is missing from most sets of Cyrillic characters included in
    fonts containing basic Latin, Greek and Cyrillic letters.

    Yet I note the schwa used in the sample does not match the other vowel
    letters in style or width, apparently here borrowed from a different font.

    This brings up the question of whether the sample is in other ways a
    typographical compromise. I have seen popular books of linguistics that
    for typographical reasons used a Greek gamma in place of the IPA gamma
    and a delta in place of the symbol.

    Could this be happening here?

    Even if so, if a typographical compromise has occurred often enough, it
    could have been forgotten over time that it was originally a compromise,
    and the substituted symbols might now be thought to be the correct ones.
    In that case, they indeed now are the correct ones.

    Since seemingly all the characters needed for Wakhi are already encoded
    in Unicode, though some must be taken from non-Latin scripts, I would
    think the matter best left pending information from actual users of the
    Wakhi writing system as to their own desires.
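
    As a rough illustration of what "mixed-script" means at the character
    level, the following Python sketch guesses each letter's script from the
    first word of its Unicode character name. This is a heuristic assumed
    here for convenience (Python's standard library does not expose the
    Script property), and the sample string is a made-up sequence of the
    letters mentioned in this thread, not actual Wakhi text.

```python
import unicodedata


def script_guess(ch):
    """Guess a character's script from the first word of its Unicode name.

    Heuristic only: the real Script property is not available in the
    standard library, so names such as 'GREEK SMALL LETTER DELTA' are
    used as a stand-in.
    """
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def scripts_used(text):
    """Return the set of guessed scripts for the letters in text."""
    return {script_guess(ch) for ch in text if ch.isalpha()}


# Hypothetical sample: Latin a, schwa, ezh; Greek delta, theta; Cyrillic yeru.
sample = "a\u0259\u0292 \u03b4\u03b8 \u044b"
print(sorted(scripts_used(sample)))
```

    A result containing more than one script name indicates that the text
    draws on letters from several scripts, as the Wakhi sample appears to.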

    Note that the traditional Latin letter transliteration of Avestan
    includes the Greek letters theta and chi.

    A reason for adding clones of all Greek letters used in mostly Latin
    character sets is that, stylistically, Wakhi, Avestan and IPA text will
    appear incorrect if an application is using a font in which the Greek
    lowercase letters are in an "italic" style, as is sometimes the case.

    But Unicode generally avoids considering such stylistic matters, noting
    only that not all fonts are suitable for all uses.

    As to sorting, that is also something the Unicode Standard usually
    claims is outside its purview, other than that it provides an optional
    default sort order which might be useful for characters that fall
    outside those covered by particular sort conventions.
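
    To make the sorting point concrete, here is a small Python sketch of a
    language-specific sort layered over a fallback. The alphabet order below
    is invented purely for illustration (it is not the actual Wakhi order),
    and plain code point comparison stands in for an untailored sort; it is
    not the Unicode default collation.

```python
def make_sort_key(alphabet):
    """Build a sort key from an explicit alphabet order.

    Letters not in the alphabet sort after all listed letters,
    falling back to code point order among themselves.
    """
    rank = {ch: i for i, ch in enumerate(alphabet)}
    fallback = len(alphabet)

    def key(word):
        return [(rank.get(ch, fallback), ord(ch)) for ch in word]

    return key


# HYPOTHETICAL order: Greek delta placed directly after Latin d.
alphabet = "abcd\u03b4efghijklmnopqrstuvwxyz"
words = ["e", "\u03b4a", "da"]

by_code_point = sorted(words)                            # untailored
by_alphabet = sorted(words, key=make_sort_key(alphabet)) # tailored

print(by_code_point)
print(by_alphabet)
```

    Without tailoring, the word beginning with Greek delta sorts after all
    the Latin-initial words; with the explicit order it falls between the
    d and e words.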

    Jim Allan
