RE: Generic base characters

From: Michael Maxwell (mmaxwell@casl.umd.edu)
Date: Mon Jul 16 2007 - 18:49:34 CDT


    I wrote:
    > Because when we are entering Indic script text
    > (for example), I have found it very helpful to
    > have something obvious appear on the screen that
    > indicates I made a mistake at the level of the script.

    To which Kent Karlsson replied:
    > There is no error at "the level of the script".

    I thought my meaning would be obvious, but apparently I was wrong.

    By "error at the level of the script", I meant having a dependent character without any character for it to be dependent on: for example, a diacritic not preceded by a base character, or a Bengali (etc.) dependent vowel sign not preceded by a Bengali consonant.

    One can imagine living in a parallel universe in which Unicode (and ISCII) represented Bengali (etc.) vowel signs and vowel letters as alternative glyphs of a single character/code point. In that case, I suppose a sequence of Bengali characters MA + O + O + O would be rendered as Bengali 'M' with the 'O' vowel sign to its left and right, followed by two 'O' vowel letter glyphs. (That's just a guess on my part of what the appropriate behavior would be, based on other vowel sequences I've seen in Bengali--which are typed as vowel sign followed by vowel letter.)

    But I don't (and I suspect you don't) live in that universe. I live in a universe in which vowel signs and vowel letters are encoded in Unicode as distinct code points. And so in my universe a sequence of vowel signs is just as bad as a diacritic without a base character, and it doesn't require a spell checker to know that. Hence an error at the level of the script (OK, to be technical, the script as implemented in Unicode/ISCII).

    Putting it differently, a sequence of vowel signs would be just as bad in any other language using the Bengali (etc.) script--Assamese, say--whereas a spell checker would be particular to a certain language (and probably to a single writing system for that language).
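
    To make the script-level check concrete, here is a minimal Python sketch that flags Bengali dependent vowel signs with no consonant to attach to. It needs no dictionary and no knowledge of which language is being written. The function names and the code point ranges are assumptions chosen for illustration (read off the Unicode Bengali block), not anything prescribed by Unicode or ISCII, and the check deliberately ignores other ill-formed sequences, such as a vowel sign after a virama.

```python
import unicodedata

NUKTA = "\u09bc"  # BENGALI SIGN NUKTA, may sit between consonant and vowel sign


def is_consonant(ch):
    # Assumed range: Bengali consonants KA..HA (U+0995..U+09B9).
    # The category test skips the unassigned code points inside the range.
    return "\u0995" <= ch <= "\u09b9" and unicodedata.category(ch) == "Lo"


def is_vowel_sign(ch):
    # Assumed set: dependent vowel signs U+09BE..U+09CC plus U+09E2 and U+09E3.
    in_range = "\u09be" <= ch <= "\u09cc" or ch in "\u09e2\u09e3"
    return in_range and unicodedata.category(ch).startswith("M")


def find_orphan_vowel_signs(text):
    """Return indexes (into the NFC form of text) of vowel signs that are
    not immediately preceded by a consonant, optionally followed by nukta."""
    # NFC joins two-part vowels typed as separate pieces, e.g. E + AA -> O,
    # so that such a pair is not reported as a sequence of two vowel signs.
    text = unicodedata.normalize("NFC", text)
    orphans = []
    for i, ch in enumerate(text):
        if not is_vowel_sign(ch):
            continue
        j = i - 1
        if j >= 0 and text[j] == NUKTA:
            j -= 1  # look past the nukta to the character it modifies
        if j < 0 or not is_consonant(text[j]):
            orphans.append(i)
    return orphans


MA, O_SIGN = "\u09ae", "\u09cb"
print(find_orphan_vowel_signs(MA + O_SIGN))      # well-formed: MA + O
print(find_orphan_vowel_signs(MA + O_SIGN * 3))  # MA + O + O + O
```

    For the MA + O + O + O sequence discussed above, the second and third vowel signs are reported, and the result would be the same whether the text is meant to be Bengali, Assamese, or any other language written in this script.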

      Mike Maxwell
      CASL/ U MD


