Re: Which languages are supported in basic latin

From: Antoine Leca (Antoine.Leca@renault.fr)
Date: Thu Aug 10 2000 - 12:13:50 EDT


Halldor G. Gestsson wrote:
>
> Can I find a list where all languages supported in the basic latin
> (0x0000-0x00FF)?
> E.g:
> Basic Latin support
> English
> German
> Spanish
> French
> Danish
> Icelandic
> ... etc. etc.

What is your definition of "language"?

For example, do you mean French as printed with any usual typewriter
(in which case U+0000 - U+00FF qualifies), or do you believe that it
should have all the characters needed to properly write the 500 most
used French words? In the latter case, you need to add U+0153, œ
(U+0152 and U+0178 are also seen, but are much rarer).
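
To make the second reading concrete, here is a minimal Python sketch
that reports which characters of a word fall outside U+0000 - U+00FF.
The helper name outside_latin1 and the sample words are assumptions
chosen for illustration; they are not a real word-frequency list.

```python
import unicodedata

def outside_latin1(text):
    """Return the characters of text that lie above U+00FF.

    The text is normalized to NFC first, so that an accented letter
    typed as base letter + combining mark is counted as one character.
    """
    composed = unicodedata.normalize("NFC", text)
    return sorted({ch for ch in composed if ord(ch) > 0xFF})

# Sample words, picked by hand for illustration only.
for word in ["été", "garçon", "cœur", "sœur"]:
    missing = outside_latin1(word)
    if missing:
        codes = ", ".join(f"U+{ord(ch):04X}" for ch in missing)
        print(f"{word}: needs {codes}")
    else:
        print(f"{word}: fits in U+0000 - U+00FF")
```

With these samples, the accented words fit, while the two words
containing œ are reported as needing U+0153.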

Punctuation may also need to be considered. First the apostrophe,
which Unicode prefers to see encoded as U+2019 (IIRC); the same
character is also used in Spain as the decimal separator. Next come
the German quotes, or the dashes (em-dash) that French, among others,
uses to introduce list items, and so on.

On the other hand, are you interested in "javanais", a form of
Parisian French slang used (among others) in the 1950s and then again
in the mid-1970s, which consists of systematically adding "av" between
the consonant(s) and the vowel (so "jamais" becomes "javamavais",
etc.)? Since in Parisian French "œ" is pronounced more or less the
same as "eu", and since the orthography of "javanais" is not fixed, it
may well qualify...
;-)
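
The "av" rule can be sketched in a few lines of Python. This is a
simplified reading of the rule as stated above (insert "av" between each
run of consonants and the vowel that follows), assuming unaccented
lower-case input; the function name to_javanais is hypothetical, and real
usage varies since the spelling is not fixed.

```python
import re

# One or more consonants followed by a vowel. Accented letters are not
# handled in this simplified sketch.
_CONSONANTS_THEN_VOWEL = re.compile(r"([bcdfghjklmnpqrstvwxz]+)([aeiouy])")

def to_javanais(word):
    """Insert "av" between each consonant run and the following vowel."""
    return _CONSONANTS_THEN_VOWEL.sub(r"\1av\2", word.lower())

print(to_javanais("jamais"))  # javamavais
```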

On a similar point, Russian pioneers designed a system that allowed
them to read and write Cyrillic on ASCII-only terminals, using KOI-7
as the character set. It features a rough transliteration scheme (the
switch from Latin to Cyrillic is marked by inverting the case, so
what reads as upper-case is in fact lower-case Cyrillic). Does Russian
qualify?
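
The case inversion can be observed with a small Python sketch. It does
not use a KOI-7 table directly: as a stand-in, it encodes the text with
Python's koi8_r codec and clears the high bit of each byte, which
produces the same kind of Latin reading. Whether this matches the exact
KOI-7 variant meant above is an assumption, and the function name
strip_high_bit is hypothetical.

```python
def strip_high_bit(text):
    """Encode text as KOI8-R, clear bit 7 of every byte, read as ASCII.

    Cyrillic letters land on the Latin letters they roughly
    transliterate to, with the case inverted. Plain ASCII input passes
    through unchanged, so mixed text becomes ambiguous.
    """
    raw = text.encode("koi8_r")
    return bytes(b & 0x7F for b in raw).decode("ascii")

print(strip_high_bit("привет"))  # PRIWET
print(strip_high_bit("Привет"))  # pRIWET
```

Lower-case Cyrillic comes out as upper-case Latin, and the capital
letter as lower-case, which is the inversion described above.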

I believe that the whole Unicode effort is here to show us that
restricted character sets are *not* a solution for writing a given
language, at least not without some restriction upon its use
(which you were not talking about). This is particularly true of a
character set as restricted as ISO-8859-1.

Antoine


