On Wed, Feb 13, 2002 at 07:03:40PM -0800, Asmus Freytag wrote:
> This has been attempted for some sets of latin based languages. I don't
> have a link to one of the documents that do that. Main problem is that
> many *more* characters are actually used (and used quite commonly) by users
> of these languages, than acknowledged by the makers of these lists.
What do you mean? I've done work for Project Gutenberg, and looked at a
number of books with thoughts of reducing them to ASCII. In my opinion,
Windows-1252 has every character that most English books will need,
except for books on language (completely unpredictable repertoire) or
books that use math (and need mathematical characters). The exceptions usually need
Greek, or characters from Latin Extended-A. (Possibly Hebrew, for
Biblical materials.) The exceptions to that, well, DOLLAR SIGN WITH STAR
OVERSTRUCK (used in an obscure roleplaying game) isn't in Unicode, and
probably will never be.
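
A rough way to check this claim against a given text is to list every
character that fails to encode as Windows-1252. The sketch below is only
an illustration: the function names are made up for this example, and it
assumes Python's built-in "cp1252" codec is an acceptable stand-in for
the Windows-1252 repertoire.

```python
import unicodedata
from collections import Counter


def chars_outside_cp1252(text):
    """Count the characters in `text` that cannot be encoded as Windows-1252."""
    missing = Counter()
    for ch in text:
        try:
            ch.encode("cp1252")
        except UnicodeEncodeError:
            missing[ch] += 1
    return missing


def report(text):
    """Print each unencodable character with its code point, name, and count."""
    for ch, count in chars_outside_cp1252(text).most_common():
        name = unicodedata.name(ch, "<unnamed>")
        print(f"U+{ord(ch):04X} {name}: {count}")
```

Running report() over the full text of a book would show whether the
leftovers are the occasional Greek or Latin Extended-A letter, or
something less predictable.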
Other languages probably fall to similar analysis, albeit with possibly
more complex language borrowings.
-- 
David Starner / Давид Старнэр - starner@okstate.edu
Pointless website: http://dvdeug.dhis.org
What we've got is a blue-light special on truth. It's the hottest thing
with the youth. -- Information Society, "Peace and Love, Inc."
This archive was generated by hypermail 2.1.2 : Wed Feb 13 2002 - 22:50:55 EST