Re: Exemplar Characters

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue Nov 15 2005 - 10:51:59 CST

    From: "Chris Harvey" <chris@languagegeek.com>
    >> CLDR uses the correct orientation for apostrophes. It also contains
    >> mapping information so that someone wanting to use or allow fallback
    >> characters such as the ASCII apostrophe can do so. So {c’h} would be
    >> used.
    >
    > A quick question on apostrophes

    This raises another question: the apostrophe is needed in Breton both as
    part of a necessary alphabetic letter, and in its apheretic role, as in
    French (from which Breton imports many words) and English. The same Unicode
    character then has two distinct roles.

    This means that listing {c’h} covers only the first role, forming the
    letter, but not the second role, where the apostrophe is not considered
    part of the exemplar or auxiliary characters (the French and English
    exemplar and auxiliary character sets do not list the apostrophe, even
    though it has a real grammatical role and is needed for correct
    orthography).

    How can such ambiguity be resolved? The current definition of exemplar and
    auxiliary characters does not clearly state the role of this kind of
    grammatical, non-alphabetic character. How can an application (for example,
    a plain-text search indexer) account for the various ways of encoding the
    grammatical and alphabetic apostrophes, which are typically written with
    the same set of alternatives?
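
    To make the indexer case concrete, here is a minimal sketch in Python of
    one possible approach: fold every apostrophe-like character to a single
    code point before building index keys. The variant list and the names
    APOSTROPHE_VARIANTS, fold_apostrophes and index_key are assumptions made
    for illustration; they are not taken from CLDR.

```python
# Hypothetical fold table: which variants to include is an assumption,
# not a list published by CLDR.
APOSTROPHE_VARIANTS = {
    "\u0027": "\u2019",  # ASCII apostrophe / single quote
    "\u2018": "\u2019",  # left single quotation mark
    "\u02BC": "\u2019",  # modifier letter apostrophe
}
_FOLD = str.maketrans(APOSTROPHE_VARIANTS)


def fold_apostrophes(text: str) -> str:
    """Map every apostrophe-like character to U+2019."""
    return text.translate(_FOLD)


def index_key(word: str) -> str:
    """Build a search key that ignores case and apostrophe encoding."""
    return fold_apostrophes(word).casefold()
```

    With such a fold, index_key("c'h") and index_key("c’h") give the same
    key, and so do "l'eau" and "l’eau". Note that the fold deliberately
    discards the distinction between the alphabetic and the grammatical role,
    so it helps matching but does not resolve the ambiguity itself.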

    Note that the apostrophe also has a third function as a quoting punctuation
    mark, which is neither alphabetic nor grammatical (true at least in French,
    English and many other European languages...).

    Most often, plain text makes no difference in interpretation between a
    quote and an apostrophe (even though the apostrophe is normally preferred,
    it is used much less than the ASCII single quote). The difference really
    appears only in non-plain text such as program source, where the quote is
    an upper-layer syntactic delimiter, specific to the source language, that
    separates the source from plain-text character sequences or strings.

    In plain text, the difference between ASCII single quotes, apostrophe
    letters, grammatical apostrophes, and quotation marks is most often glyphic
    only (the ASCII quote is the most ambiguous, but the curly apostrophe is
    ambiguous as well: it only eliminates the left/right subdistinction in the
    punctuation role and does not eliminate the distinction between the three
    roles).
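
    As a rough illustration of how little the character itself says about its
    role, the following Python sketch prints the Unicode general category and
    name of some apostrophe-like characters. The choice of candidates is an
    assumption (U+02BC in particular is not discussed above), and describe is
    a hypothetical helper name.

```python
import unicodedata

# Candidate code points, chosen for illustration only.
CANDIDATES = ["\u0027", "\u2018", "\u2019", "\u02BC"]


def describe(ch: str) -> tuple[str, str, str]:
    """Return (code point label, general category, character name)."""
    return (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))


for ch in CANDIDATES:
    print(*describe(ch))
```

    U+0027 and U+2019 are both classified as punctuation, so their properties
    cannot tell a letter apostrophe from a grammatical one or from a closing
    quote; only U+02BC is classified as a (modifier) letter.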


