From: Kent Karlsson (kent.karlsson14@comhem.se)
Date: Tue Jul 17 2007 - 04:33:40 CDT
Michael Maxwell wrote:
I wrote:
> Because when we are entering Indic script text
> (for example), I have found it very helpful to
> have something obvious appear on the screen that
> indicates I made a mistake at the level of the script.
To which Kent Karlsson replied:
> There is no error at "the level of the script".
I thought my meaning would be obvious, but apparently I was wrong.
By "error at the level of the script", I meant having a dependent character without any character for it to be dependent on.
In that case there is an implicit NBSP base.
A diacritic not preceded by a base character, or a Bengali (etc.) dependent vowel sign not preceded by a Bengali consonant.
(Assuming this case is meant to be disjoint from the previous case:) But there is still a base character for it. The base needn't be
in the Bengali script. Or there may be other combining characters (Bengali or not) between the base and the considered instance of a
combining character.
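To make the distinction concrete, here is a minimal Python sketch of the check Unicode itself would support: finding combining marks that have no base character at all in front of them. The function names are made up for illustration, and the "base character" test is a simplification based on General Category only (letters, numbers, punctuation, symbols and space separators count as bases), so treat it as an assumption-laden sketch rather than a conformant implementation.

```python
import unicodedata

def is_mark(ch):
    # General Category Mn, Mc or Me: a combining mark.
    return unicodedata.category(ch).startswith("M")

def is_base(ch):
    # Simplified assumption: graphic characters that are not marks act as bases.
    cat = unicodedata.category(ch)
    return cat[0] in "LNPS" or cat == "Zs"

def isolated_marks(text):
    """Return indices of combining marks with no base character before them."""
    hits = []
    have_base = False
    for i, ch in enumerate(text):
        if is_mark(ch):
            if not have_base:
                hits.append(i)
        else:
            # A non-mark either starts a new combining sequence (if it is a
            # base) or breaks it (controls, line separators and the like).
            have_base = is_base(ch)
    return hits
```

Under these assumptions, MA followed by three vowel signs O ("\u09AE\u09CB\u09CB\u09CB") yields no hits, because every mark still has MA as its base, and a Bengali vowel sign after a Latin letter plus acute accent ("a\u0301\u09CB") yields none either. Only a mark at the very start of the text, or right after something like a line break, is reported.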
One can imagine living in a parallel universe in which Unicode (and ISCII) represented Bengali (etc.) vowel signs and vowel letters
as alternative glyphs of a single character/code point.
That would be just plain wrong. The combining ("dependent") vowels and the independent vowels look different, and behave
differently. It is NOT a matter of glyph variation.
In that case, I suppose a sequence of Bengali characters MA + O + O + O would be rendered as Bengali 'M' with the 'O' vowel sign to
its left and right, followed by two 'O' vowel letter glyphs.
That is a completely different kind of character string than the ones we are talking about.
(That's just a guess on my part of what the appropriate behavior would be, based on other vowel sequences I've seen in
Bengali--which are typed as vowel sign followed by vowel letter.)
But I don't (and I suspect you don't) live in that universe. I live in a universe in which vowel signs and vowel letters are
distinguished in Unicode as distinct code points. And so in my universe a sequence of vowel signs is just as bad as a diacritic
without a base character,
In that case there is an implicit NBSP base. Note that you can have multiple diacritics applied to a base character.
and it doesn't require a spell checker to know that. Hence an error at the level of the script (OK, to be technical, the script as
implemented in Unicode/ISCII).
Deviating from the most common (or official) application of a script does not constitute an "error at the level of the script". If
I write moooose, I deviate from the common (official) application of the Latin script (and you can detect that without using a spell
checker). That does not make it an error "at the level of the script". That argument does not change just because the vowel
characters are combining characters.
/kent k.
Putting it differently, a sequence of vowel signs would be just as bad in any other language using the Bengali (etc.)
script--Assamese, say. Whereas a spell checker would be particular to a certain language (and probably to a single writing system
for that language).
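The stricter, language-independent check described here (flagging a vowel sign that directly follows another vowel sign) could be sketched roughly as follows. This is only an illustration of that convention, not something Unicode defines as an error; the function names are hypothetical, and identifying vowel signs by their character names is a shortcut chosen for brevity.

```python
import unicodedata

def is_bengali_vowel_sign(ch):
    # Relies on the Unicode character name, e.g. "BENGALI VOWEL SIGN O".
    return unicodedata.name(ch, "").startswith("BENGALI VOWEL SIGN")

def stacked_vowel_signs(text):
    """Return indices of Bengali vowel signs that directly follow another one."""
    return [
        i
        for i in range(1, len(text))
        if is_bengali_vowel_sign(text[i]) and is_bengali_vowel_sign(text[i - 1])
    ]
```

For MA + vowel sign O three times ("\u09AE\u09CB\u09CB\u09CB") this reports indices 2 and 3, while MA + vowel sign O + vowel letter O ("\u09AE\u09CB\u0993") reports nothing. No word list is involved, so the same check applies to any language written in the Bengali script.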
Mike Maxwell
CASL/ U MD