Re: diaeresis/umlaut

From: John Cowan (cowan@locke.ccil.org)
Date: Sat Jun 12 1999 - 21:20:47 EDT

Next message: John Cowan: "FYI: J.R.R. Tolkien on the importance of l14n"
Previous message: John Cowan: "Re: diaeresis/umlaut"
Maybe in reply to: Figge, Donald: "diaeresis/umlaut"
Next in thread: Asmus Freytag: "Re: diaeresis/umlaut"

> > I think there are other non-spacing characters (diacritics) that have the same
> > Unicode character code value but different meanings in different scripts. And
> > like Mr. Figge I begin to wonder why these two meanings are not treated
> > differently, like Latin A, Greek Alpha and Cyrillic A have different code
> > values. Maybe someone can clarify this.
>
> I believe the main reason that these were kept separate is for round-trip
> convertibility with existing standards.

There's another reason: the search problem. If you search a multilingual
document for "ABC", you do not want Cyrillic A-Ve-Es to be found too.
It's bad enough that Fullwidth A-B-C will be missed by a naive
search algorithm, but at least those are compatibility equivalents.
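
To make the search point concrete, here is a minimal sketch in Python. The
names naive_find and compat_find are invented for this illustration, and
folding with NFKC is only one assumed way a search might treat compatibility
equivalents, not a description of any particular search tool.

```python
import unicodedata

LATIN = "\u0041\u0042\u0043"      # LATIN CAPITAL LETTERS A, B, C
CYRILLIC = "\u0410\u0412\u0421"   # CYRILLIC CAPITAL LETTERS A, VE, ES
FULLWIDTH = "\uFF21\uFF22\uFF23"  # FULLWIDTH LATIN CAPITAL LETTERS A, B, C


def naive_find(text: str, needle: str) -> bool:
    """Plain code-point comparison, with no normalization at all."""
    return needle in text


def compat_find(text: str, needle: str) -> bool:
    """Hypothetical helper: fold both strings with NFKC before comparing,
    so compatibility equivalents such as fullwidth letters match."""
    def fold(s: str) -> str:
        return unicodedata.normalize("NFKC", s)
    return fold(needle) in fold(text)


# Separate code values keep the Cyrillic look-alikes out of the results.
print(naive_find(CYRILLIC, LATIN))
# The naive search also misses the fullwidth forms.
print(naive_find(FULLWIDTH, LATIN))
# Folding compatibility equivalents recovers the fullwidth forms...
print(compat_find(FULLWIDTH, LATIN))
# ...while the Cyrillic letters still do not match.
print(compat_find(CYRILLIC, LATIN))
```

Had Latin A and Cyrillic A shared one code value, no amount of normalization
could tell the two words apart after the fact.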

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:46 EDT