Re: French encoding [Was: Chapter on character sets]

From: Alain LaBonté
Date: Thu Jun 15 2000 - 14:37:03 EDT

At 09:52 2000-06-15 -0800, [...] wrote:
>Since Latin-1 was the
>encoding of choice prior to Latin-9 (and still is in many situations),
>there really is much more data encoded as 8859-1 than as 8859-15. To state
>that a subset of Latin-1 is not Latin-1 is incorrect

[Alain] The idea behind standardizing Latin-9 was to provide a standard
interchange character set for all platforms limited to 8-bit character sets,
one that supports the EURO sign and covers French and Finnish completely.

EBCDIC can't support more than 191 graphic characters and therefore can't
be extended to support the MS-1252 character set, in which most French and
Finnish PC data is encoded. Yet this data does need to be interchanged with
other platforms. Even though the œ|Œ (oe|OE) is not available on many
keyboards, WinWord has been generating it automatically in data for a while
(with the Canadian keyboard I type it directly myself).

A standard, straightforward interchange encoding for this was missing. It now
exists.

Most pure Latin-1-encoded data is in practice already valid Latin-9 as well:
the differences between Latin-1 and Latin-9 are confined to mostly unused and
almost insignificant characters (standalone accents; the broken bar, which
can have no syntactic effect even under DOS; and a subset of vulgar fractions
too limited to be of much use, since most of Europe uses the decimal system
anyway, and certainly the French speakers and the Finns).
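
To make the differences concrete, here is a minimal Python sketch that lists
the byte values at which ISO 8859-1 (Latin-1) and ISO 8859-15 (Latin-9) decode
to different characters. It assumes Python's built-in codecs named
"iso8859-1" and "iso8859-15"; the function name differing_positions is
made up for illustration.

```python
def differing_positions(enc_a="iso8859-1", enc_b="iso8859-15"):
    """Return (byte, char_in_a, char_in_b) for every byte that decodes differently."""
    diffs = []
    # Only the upper half (0xA0-0xFF) holds graphic characters that could differ.
    for byte in range(0xA0, 0x100):
        raw = bytes([byte])
        a = raw.decode(enc_a)
        b = raw.decode(enc_b)
        if a != b:
            diffs.append((byte, a, b))
    return diffs


if __name__ == "__main__":
    for byte, latin1_char, latin9_char in differing_positions():
        print(f"0x{byte:02X}: Latin-1 {latin1_char!r} -> Latin-9 {latin9_char!r}")
```

With these codecs the two sets differ at eight positions only; any Latin-1
text that avoids those eight characters has the same bytes in Latin-9.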

Latin-9 was required as the missing standard link for French and Finnish
between EBCDIC, PCs, Macs, UNIX-based systems, and Unicode, to convey
significant data back and forth between all these platforms without loss.
That was the intent. It still makes sense.
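
As a rough illustration of that lossless interchange, the sketch below
transcodes a French sample from MS-1252 to Latin-9 and back through Unicode,
and checks that the same text cannot be encoded in Latin-1. It assumes
Python's built-in "cp1252", "iso8859-15" and "iso8859-1" codecs; the sample
sentence and the helper name transcode are hypothetical. No EBCDIC code page
is exercised here.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from one character set to another via Unicode."""
    return data.decode(src).encode(dst)


sample = "Le cœur de l'œuvre coûte 5 €"

# Typical PC data: œ and € live in the 0x80-0x9F range of MS-1252.
pc_bytes = sample.encode("cp1252")

# Interchange form: every character of the sample has a Latin-9 position.
latin9_bytes = transcode(pc_bytes, "cp1252", "iso8859-15")

# Back to the PC encoding; the round trip should be byte-for-byte identical.
round_trip = transcode(latin9_bytes, "iso8859-15", "cp1252")

# Latin-1 has neither œ nor €, so strict encoding fails there.
try:
    sample.encode("iso8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

if __name__ == "__main__":
    print("round trip intact:", round_trip == pc_bytes)
    print("encodable in Latin-1:", latin1_ok)
```

The round trip only holds for text whose characters exist in both sets;
MS-1252 characters outside Latin-9 (curly quotes, for example) would still
raise an encoding error in this sketch.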

Alain LaBonté
Charlesbourg (offline from now on for 48 hours at least)
