Re: VOWEL, CONSONANT, ...: allow recognition of shorter names?

From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Fri Apr 11 2008 - 07:18:02 CDT

    Hello Henrik,

    you have written:
    > When writing a character name recognition algorithm, I would like to
    > let the user be as concise as possible, yet without violating Unicode
    > rules, and without being in potential conflict with upcoming versions
    > of Unicode.

    Here is an idea for a different approach that may be used
    independently from, or in addition to, yours.

    You could allow your users to provide a context for
    the search. If your algorithm knows that the user is
    looking for, e. g., a Khmer character, you could
    - consider only characters used in Khmer;
    - supply the common name constituents “KHMER LETTER”,
       “KHMER SYMBOL”, “KHMER INDEPENDENT VOWEL”, etc.,
       and try all of them, concatenated with the user’s input.
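
    The prefix idea above could be sketched roughly as follows, in
    Python, using the standard unicodedata module. The function name
    lookup_in_context and the constant KHMER_PREFIXES are made up
    for illustration, and the prefix list holds only the three
    constituents quoted above; a real tool would need a fuller list
    per script.

```python
import unicodedata

# Only the prefixes quoted in the message; the full list per script is an
# assumption left to the implementer.
KHMER_PREFIXES = ("KHMER LETTER", "KHMER SYMBOL", "KHMER INDEPENDENT VOWEL")


def lookup_in_context(user_input, prefixes=KHMER_PREFIXES):
    """Return (name, character) pairs for each prefix + input that names a character."""
    fragment = user_input.strip().upper()
    hits = []
    for prefix in prefixes:
        candidate = f"{prefix} {fragment}"
        try:
            ch = unicodedata.lookup(candidate)
        except KeyError:
            continue  # no character of this name in the local database
        hits.append((unicodedata.name(ch), ch))
    return hits
```

    With this sketch, lookup_in_context("ka") tries “KHMER LETTER KA”,
    “KHMER SYMBOL KA” and “KHMER INDEPENDENT VOWEL KA” in turn and
    keeps whichever of them exist.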

    If the algorithm is to be used interactively, you could
    match the user’s pattern against all character names
    eligible in the given context (see above), and let the user
    choose among the hits.
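
    A minimal sketch of such an interactive search, again in Python:
    it assumes that the “context” is given as code-point ranges (here
    the Khmer and Khmer Symbols blocks) and that patterns use
    shell-style wildcards. The names KHMER_RANGES, names_in_context
    and find_candidates are hypothetical.

```python
import unicodedata
from fnmatch import fnmatchcase

# Assumed context: the Khmer block and the Khmer Symbols block.
KHMER_RANGES = ((0x1780, 0x17FF), (0x19E0, 0x19FF))


def names_in_context(ranges=KHMER_RANGES):
    """Yield (name, character) for every named character in the given ranges."""
    for first, last in ranges:
        for cp in range(first, last + 1):
            name = unicodedata.name(chr(cp), None)
            if name is not None:  # skip unassigned code points
                yield name, chr(cp)


def find_candidates(pattern, ranges=KHMER_RANGES):
    """Return all (name, character) pairs whose name matches the wildcard pattern."""
    pattern = pattern.upper()
    return [(n, c) for n, c in names_in_context(ranges) if fnmatchcase(n, pattern)]
```

    An interactive front end would then display, say,
    find_candidates("*vowel sign a*") as a numbered list and let the
    user pick one entry.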

    If, however, your algorithm is meant as an API (to be used
    from scripts or programs), this pattern-matching approach is
    less suitable, as a pattern that selects a unique name
    today may become ambiguous with a future version of the
    standard.
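
    To illustrate the risk, here is a small Python sketch of a strict
    resolver that refuses ambiguous fragments. The character names
    are invented placeholders (there is no “SCRIPTX” in Unicode), and
    resolve_unique and AmbiguousNameError are hypothetical names.

```python
class AmbiguousNameError(LookupError):
    """Raised when a name fragment matches more than one character name."""


def resolve_unique(fragment, names):
    """Return the single name containing fragment, or raise if none or several match."""
    hits = [n for n in names if fragment in n]
    if not hits:
        raise KeyError(fragment)
    if len(hits) > 1:
        raise AmbiguousNameError(f"{fragment!r} matches {hits}")
    return hits[0]


# Invented name lists standing in for two versions of the standard.
NAMES_TODAY = ["SCRIPTX LETTER KA", "SCRIPTX LETTER GA"]
NAMES_LATER = NAMES_TODAY + ["SCRIPTX LETTER KAA"]  # hypothetical later addition
```

    Here resolve_unique("LETTER KA", NAMES_TODAY) succeeds, while the
    same call against NAMES_LATER raises AmbiguousNameError: a script
    written against the earlier list would break after the update,
    although no existing name has changed.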

    Best wishes,
       Otto Stolz


