Due to network problems, I can read mail at cowan@ccil.org, but
can't post/reply/send from there. Please direct all replies to
cowan@ccil.org, not the HotMail address. Thanks.
Here's yet another proposal for making the case mapping of the
various "i" characters consistent. This one is subtly different
from those proposed before, AFAIK. It still has the problem that
conversion from ISO 8859-9 requires knowing whether the text
being converted is Turkish or non-Turkish.
Introduced are two new characters, LATIN CAPITAL LETTER DOTLESS I
and LATIN SMALL LETTER I WITH DOT ABOVE. The uppercase mapping
for LATIN SMALL LETTER DOTLESS I is changed to the new CAPITAL
DOTLESS I. SMALL I WITH DOT ABOVE is a compatibility character:
its canonical decomposition is LATIN SMALL LETTER I plus
COMBINING DOT ABOVE (not DOTLESS I plus DOT ABOVE).
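
For comparison, here is a minimal Python sketch of how the existing U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE decomposes today, next to the decomposition proposed for the new small letter. The name `PROPOSED_SMALL_DECOMPOSITION` is only an illustration: the proposed character has no code point, so only its intended decomposed form can be written down.

```python
import unicodedata

CAPITAL_I_WITH_DOT_ABOVE = "\u0130"
SMALL_DOTLESS_I = "\u0131"
COMBINING_DOT_ABOVE = "\u0307"

# Assumption for illustration: the decomposed form the proposal gives to
# SMALL I WITH DOT ABOVE, i.e. SMALL I (not DOTLESS I) plus DOT ABOVE.
PROPOSED_SMALL_DECOMPOSITION = "i" + COMBINING_DOT_ABOVE


def canonical_parts(text):
    """Return the Unicode names of the characters in the NFD form of text."""
    return [unicodedata.name(ch) for ch in unicodedata.normalize("NFD", text)]


print(canonical_parts(CAPITAL_I_WITH_DOT_ABOVE))
print(canonical_parts(PROPOSED_SMALL_DECOMPOSITION))
# The existing dotless i has no decomposition, so NFD leaves it alone.
print(canonical_parts(SMALL_DOTLESS_I))
```

The existing capital already decomposes to a plain base letter plus the combining dot, so the proposed small letter would simply mirror it.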
The new CAPITAL DOTLESS I has the same glyphic representation as
CAPITAL I, but is made distinct because of its different
lowercase mapping. There is precedent for this: U+00D0
LATIN CAPITAL LETTER ETH, U+0110 LATIN CAPITAL LETTER D WITH
STROKE, and U+0189 LATIN CAPITAL LETTER AFRICAN D all have
the same glyph, viz. "D" with a horizontal stroke through the
left vertical, but are distinct because of their distinct
lowercase forms.
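
The precedent can be checked against the character database that ships with Python. This is only a quick sketch; the helper name `lowercase_names` is made up for the illustration.

```python
import unicodedata

# The three look-alike capitals cited as precedent.
PRECEDENT_CAPITALS = ["\u00D0", "\u0110", "\u0189"]


def lowercase_names(capitals):
    """Map each capital's Unicode name to the name of its lowercase form."""
    return {unicodedata.name(c): unicodedata.name(c.lower()) for c in capitals}


for upper_name, lower_name in lowercase_names(PRECEDENT_CAPITALS).items():
    print(f"{upper_name} -> {lower_name}")
```

Each capital lowercases to a different character (U+00F0, U+0111, and U+0256 respectively), which is the same reason given here for separating CAPITAL DOTLESS I from CAPITAL I.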
SMALL I WITH DOT ABOVE follows the standard rules for combining
characters after SMALL I: the native dot of SMALL I is dropped.
Glyphically it looks like SMALL I, but its decomposition structure
is different.
Now we get the following casing behaviors:
    Non-Turkish "i":      SMALL I
    Non-Turkish "I":      CAPITAL I
    Turkish dotless "i":  SMALL DOTLESS I
    Turkish dotless "I":  CAPITAL DOTLESS I
    Turkish dotted "i":   SMALL I WITH DOT ABOVE (precomposed or decomposed)
    Turkish dotted "I":   CAPITAL I WITH DOT ABOVE (ditto)
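
The pairs above can be sketched as a small locale-independent case mapper in Python. Everything here rests on assumptions: the two proposed characters have no code points, so the Private Use Area values U+E000 and U+E001 stand in for them, and the function names (`to_upper`, `to_lower`, `from_8859_9`) are invented for the illustration. The 8859-9 converter reflects one reading of the proposal, in which Turkish "I" and "i" map to the new characters, and it still has to be told whether the text is Turkish.

```python
# Hypothetical placeholders: Private Use Area code points stand in for the
# two proposed characters, which have no assigned code points.
CAPITAL_DOTLESS_I = "\uE000"        # proposed LATIN CAPITAL LETTER DOTLESS I
SMALL_I_WITH_DOT_ABOVE = "\uE001"   # proposed LATIN SMALL LETTER I WITH DOT ABOVE

# Existing characters.
SMALL_I = "i"                       # U+0069
CAPITAL_I = "I"                     # U+0049
SMALL_DOTLESS_I = "\u0131"
CAPITAL_I_WITH_DOT_ABOVE = "\u0130"

# One-to-one case pairs under the proposal.
TO_UPPER = {
    SMALL_I: CAPITAL_I,
    SMALL_DOTLESS_I: CAPITAL_DOTLESS_I,
    SMALL_I_WITH_DOT_ABOVE: CAPITAL_I_WITH_DOT_ABOVE,
}
TO_LOWER = {upper: lower for lower, upper in TO_UPPER.items()}


def to_upper(text):
    """Uppercase text, using the proposed pairs for the "i" characters."""
    return "".join(TO_UPPER.get(ch, ch.upper()) for ch in text)


def to_lower(text):
    """Lowercase text, using the proposed pairs for the "i" characters."""
    return "".join(TO_LOWER.get(ch, ch.lower()) for ch in text)


def from_8859_9(data, turkish):
    """Decode ISO 8859-9 bytes; the caller must say whether it is Turkish."""
    text = data.decode("iso8859_9")
    if not turkish:
        return text
    # Assumed mapping: in Turkish text, plain "I" is the dotless capital
    # and plain "i" is the dotted small letter.
    return text.translate({
        ord(CAPITAL_I): CAPITAL_DOTLESS_I,
        ord(SMALL_I): SMALL_I_WITH_DOT_ABOVE,
    })


turkish_word = from_8859_9(b"\xfd\xfe\xfdk", turkish=True)  # dotless-i word
print(to_lower(to_upper(turkish_word)) == turkish_word)
```

Because every pair is one-to-one, the mapper needs no locale switch. The decomposed forms also work character by character: SMALL I plus COMBINING DOT ABOVE uppercases to CAPITAL I plus COMBINING DOT ABOVE.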
Comments?
--
John Cowan cowan@ccil.org
Please do not use "Reply"
e'osai ko sarji la lojban.