From: Ted Hopp (ted@newslate.com)
Date: Thu Jul 31 2003 - 18:02:04 EDT
On Thursday, July 31, 2003 4:56 PM, John Cowan wrote:
> Unicode allows any combining character to be attached to any base character
> whatsoever. However, putting a dagesh into a DEVANAGARI KA, or placing a
> circumflex over an ARABIC MEEM, is pretty certain to cause bad rendering, and
> may screw up other text processes such as syllabication.
From Unicode 3.2, Chapter 8 [regarding shin and sin dot]:
"The two dots are mutually exclusive. The base letter shin can also have
dagesh, a vowel, and other diacritics. Use of the two dots with any other
base character is an error."
Sometimes, doing something that's allowed can still be an error.
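To make the quoted rule concrete, here is a minimal Python sketch of a check for it. The function name misplaced_dots and the approach are my own illustration, not anything defined by the standard; it treats any character whose general category starts with "M" as a mark attached to the preceding non-mark character, which is a simplification.

```python
import unicodedata

SHIN = "\u05E9"      # HEBREW LETTER SHIN
SHIN_DOT = "\u05C1"  # HEBREW POINT SHIN DOT
SIN_DOT = "\u05C2"   # HEBREW POINT SIN DOT
DOTS = {SHIN_DOT, SIN_DOT}


def misplaced_dots(text):
    """Return the indexes of shin/sin dots that break the quoted rule.

    Hypothetical helper: a dot is flagged when its base is not SHIN,
    or when the other dot already sits on the same base.
    """
    problems = []
    base = None           # most recent non-mark character
    dots_on_base = set()  # dots already seen on that base
    for i, ch in enumerate(text):
        if not unicodedata.category(ch).startswith("M"):
            base = ch
            dots_on_base = set()
            continue
        if ch in DOTS:
            other = DOTS - {ch}
            if base != SHIN or (dots_on_base & other):
                problems.append(i)
            dots_on_base.add(ch)
    return problems
```

Shin with dagesh and shin dot passes, while a shin dot on DEVANAGARI KA (U+0915) is flagged even though the sequence is perfectly legal Unicode.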
> > Would FB4B continue to decompose into 05D5 05B9?
>
> Yes. Normalization stability requires it.
That's what I thought.
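For anyone who wants to see the decomposition for themselves, a small sketch using Python's standard unicodedata module (the variable and function names are mine):

```python
import unicodedata

vav_holam = "\uFB4B"  # HEBREW LETTER VAV WITH HOLAM


def code_points(s):
    """Format a string as a list of four-digit hex code points."""
    return [f"{ord(c):04X}" for c in s]


# Canonical decomposition yields VAV followed by HOLAM.
nfd = code_points(unicodedata.normalize("NFD", vav_holam))

# FB4B is excluded from composition, so NFC does not recompose it.
nfc = code_points(unicodedata.normalize("NFC", vav_holam))

print(nfd, nfc)
```

Both forms come out as 05D5 05B9, so any change to what that sequence means would also affect existing text that contains FB4B.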
> > It seems to me that either I'm misinterpreting things, or most people in
> > this discussion would prefer a new combining character to a new base
> > character. If this is so, I'd appreciate an explanation of why, because I
> > don't understand it.
>
> Assertions of the form "Mark X is only used with base form Y" have proven to
> be false too often in the past.
All the more reason to avoid introducing more marks.
Ted
Ted Hopp, Ph.D.
ZigZag, Inc.
ted@newSLATE.com
+1-301-990-7453
newSLATE is your personal learning workspace
...on the web at http://www.newSLATE.com/