Jonathan Coxhead scripsit:
> I also believe that the case mapping in Turkish is exactly the same as
> anywhere else in the world---it's just that a Turkish uppercase i is
> displayed with a dot. They also have 2 more characters, lowercase dotless i
> and uppercase dotless i, which are *not* displayed with a dot.
This description, while defensible (and I have defended it several
times), breaks down on practical grounds. The fact is that there
is too much data coded in ISO 8859-9 (with 0xDD = LATIN CAPITAL
LETTER I WITH DOT ABOVE and 0xFD = LATIN SMALL LETTER DOTLESS I) which
contains both Turkish and non-Turkish text. Transcoding this
data to Unicode would be intolerably difficult if it all had to
be tagged first to sort out which 0x49 characters are ordinary
"I" and which are CAPITAL LETTER DOTLESS I. Better to accept
the compromise and get on with moving to Unicode. *sigh*
(Similar things were done with the Thai block, which is essentially
a transcription of TIS 620, even though that standard is
highly non-Unicode in flavor.)
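To make the 8859-9 point concrete, here is a minimal sketch in Python, assuming its built-in "iso8859_9" codec. It shows that 0x49 transcodes to plain U+0049 with no tagging needed, and that the Turkish behavior then has to live in the case mapping. The helpers tr_upper and tr_lower are hypothetical names of my own for illustration, not part of any standard library.

```python
# Four bytes of ISO 8859-9: I, I WITH DOT ABOVE, i, DOTLESS I.
raw = bytes([0x49, 0xDD, 0x69, 0xFD])

# Transcoding is a plain table lookup; no language tagging is required.
text = raw.decode("iso8859_9")
codepoints = [f"U+{ord(c):04X}" for c in text]
print(codepoints)  # ['U+0049', 'U+0130', 'U+0069', 'U+0131']

# The mapping is reversible, so mixed Turkish and non-Turkish data survives.
assert text.encode("iso8859_9") == raw


def tr_upper(s: str) -> str:
    """Hypothetical helper: uppercase using the Turkish i/I pairing."""
    # Handle the two Turkish pairs first, then fall back to the default mapping.
    return s.replace("i", "\u0130").replace("\u0131", "I").upper()


def tr_lower(s: str) -> str:
    """Hypothetical helper: lowercase using the Turkish i/I pairing."""
    return s.replace("I", "\u0131").replace("\u0130", "i").lower()


# Default (language-neutral) mapping pairs I with i ...
default_lower = "I".lower()
# ... while the Turkish mapping pairs I with dotless i, and i with dotted I.
turkish_lower = tr_lower("I")
turkish_upper = tr_upper("i")
```

The price of the compromise is visible in the last three lines: the same U+0049 lowercases differently depending on whether the text is treated as Turkish, so the language has to be known at case-mapping time rather than at transcoding time.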
-- 
John Cowan   http://www.ccil.org/~cowan   cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn. You spigotty
anglease? Nnn. You phonio saxo? Nnnn. Clear all so! 'Tis a Jute....
(Finnegans Wake 16.5)