From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Jun 27 2010 - 03:33:36 CDT
Everything said so far about ISO 8859 is true, but if the Euro symbol
has had the success it has (and it works remarkably well), it is
because Windows is used on a lot of PCs:
Microsoft modified all its Windows code pages (informally named
"ANSI" after the legacy Win16 APIs, which were also ported to
Win32) used in Europe to include the Euro symbol in position 0x80
(which was not used in those code pages).
There are still unused positions in Windows codepages, but most of
them were built on top of ISO 8859 by dropping all C1 controls (not
needed for Windows and not even for DOS compatibility), freeing the 32
positions 0x80-0x9F for some commonly used punctuation signs, then the
euro.
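
As a small illustration of the byte layout described above, here is a
minimal Python sketch. It assumes the codec names that ship with
CPython (cp1250, cp1252, cp1253, cp1254, cp1257, latin-1) and only
samples a handful of code pages, so it is not a statement about every
Windows code page.

```python
# Byte 0x80 is a C1 control in ISO 8859-1 but the euro sign in
# several Windows "ANSI" code pages (a sample, not an exhaustive list).
SAMPLE_CODE_PAGES = ["cp1250", "cp1252", "cp1253", "cp1254", "cp1257"]


def decode_0x80(codec):
    """Decode the single byte 0x80 with the given codec name."""
    return b"\x80".decode(codec)


# Character found at 0x80 in each sampled Windows code page.
euro_at_0x80 = {cp: decode_0x80(cp) for cp in SAMPLE_CODE_PAGES}

# The same byte interpreted as ISO 8859-1 (a C1 control code).
c1_in_latin1 = decode_0x80("latin-1")

for cp, ch in euro_at_0x80.items():
    print(f"{cp}: U+{ord(ch):04X}")
print(f"latin-1: U+{ord(c1_in_latin1):04X}")
```

The same byte therefore means two different things depending on which
8-bit encoding the receiver assumes.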
Microsoft could still decide to repeat it for the codepages used in
India. But even there, Windows displays the Indic scripts using Unicode
(and not the ISCII standard).
Microsoft will certainly modify its mapping to Unicode for supporting
the ISCII standard, if a position is allocated there, and other
vendors will follow as well.
When the Euro was added, there was no real need to modify the 8859
pages and this was not done. Microsoft decided to modify its European
Windows "ANSI" codepages only because at that time it was still
supporting older systems that needed compatibility with DOS, and
where Unicode was still not used internally in the system (notably
Win16 and Win32s systems like Windows 3.1x and Windows 95/98/ME, which
still did not really use a true Unicode-enabled kernel and did not
even support the NTFS filesystem used on NT and the newer Windows
2000).
IBM also had to adapt its many codepages used on various systems (but
these systems were already becoming very marginalized). This caused
lots of havoc (not least because there were so many variants of
EBCDIC...)
Apple decided to follow a direction completely opposite to IBM's and
not change anything, given that its legacy Mac codepages were already
being deprecated. (Apple adopted the OS-level use of Unicode probably
much faster than Microsoft: the latter initially reserved it for its
"professional" NT systems, when the former had already decided to stop
maintaining or adding new 8-bit codepages.)
But for the Indian Rupee, there's no need to change anything: all
systems needed for India are already Unicode-enabled (and older
ISCII-based systems are now almost all extinct, so I doubt that there
will ever be any need to change them: these systems will continue to
use the existing usual abbreviations). The Indian government just has
to sponsor its encoding in Unicode.
Let's not repeat the IBM tragedy... India certainly has better places
to put its public (and private) money than reviving and
adapting old and dying national 8-bit encoding standards (which will
still reach the end of their life without the new symbol if they
don't support Unicode).
Today the world is connected to the Internet for almost everything, and
the Internet uses Unicode more than all other encodings combined.
Philippe.
"Erkki I Kolehmainen" <eik@iki.fi> wrote:
> At the time I was the European project team leader for the standardization
> of the euro, and as such I was strongly pushing for the addition of the euro
> sign to Latin-1, which could not be done without adding a new part, which
> then had to be done for the visibility. I fully agree with Ken (as he quite
> well knows, I trust) that no new character encoding standardization should
> have been done for quite a while on anything but the 10646/Unicode. As is,
> the use of any of the 8859 parts can no longer really be justified for
> any purpose, and with 10646/Unicode the euro sign works extremely reliably.
>
> Sincerely, Erkki
> ----
> Kenneth Whistler wrote:
> > On Fri, 25 Jun 2010, I wrote
> >
> > > Even in the year 2010, the euro sign (¤) doesn't work reliably.
> >
> > in both the Unicode list and in the newsgroup de.test.
> >
> > unicode.org shows a euro sign:
> > http://www.unicode.org/mail-arch/unicode-ml/y2010-m06/0372.html
> >
> > groups.google.com shows a currency sign:
> > http://groups.google.co.uk/group/de.test/msg/e027e91e7ef17f62
>
> And as the snark seems to be spreading about this, let's step
> into the Wayback Machine for a moment...
>
> When 8859-15 was originally proposed in 1997 (see SC2/WG3 N388R, for
> those of you with deep document archives), primarily to add the euro
> sign to an 8-bit character set (but also to "fix" 8859-1 for
> French and Finnish), the U.S. NB voted against the subdivision
> of work, claiming in the strongest of terms that the proposal
> was inherently flawed and simply would not work to solve the
> problem(s) it was addressed at.
>
> I'll quote at length from the U.S. NB comments in SC2 N2994,
> dated 1997-11-21, "Summary of Voting on SC 2 N 2910, Proposal for
> Project Subdivision of project JTC 1.02.20: a new part of ISO/IEC
> 8859 for Latin Zero covering the EURO Symbol and Full Support for
> the French and Finnish Language":
>
> ================================================================
>
> The US disapproves a project subdivision for ISO/IEC 8859-15 for
> the following reasons:
>
> 1) It is the US long stated position that additional parts of
> 8859 should not be created, except to capture existing 8-bit
> practice (viz Part 11). Rather than addressing problems with
> particular solutions, which are extremely costly to implement,
> industry efforts should be focused on implementing
> comprehensive solutions via the support of ISO/IEC 10646.
>
> 2) From document WG3 N 388 it is clear that the intent is to
> replace ISO 8859-1, for the same user community. Because of
> the prominent role that 8859-1 has gained as the default
> character set in many internet protocols, introducing a near
> equivalent standard will have disastrous effects. Due to their
> large intersection part 1 and part 15 would appear to inter-operate
> without proper adherence to announcing mechanisms. Were part 15
> accepted and widely implemented, the result would be that no one
> could be sure that ANY character from the non-intersecting part of
> each set can be used reliably. In many ways, this situation is
> reminiscent of the problems that plagued the 7-bit sets of ISO 646.
>
> 3) The adoption of ISO/IEC 10646 by the vendor community is
> making rapid progress, therefore it cannot be argued that a
> flawed solution must be accepted for lack of practical
> alternatives.
>
> ================================================================
>
> It was already clear 13 years ago that 8859-15 wasn't going
> to work. It shouldn't be too surprising that 13 years later
> it still isn't working.
>
> As Mark indicated, the answer here is not to expect distributed
> systems to be able to reliably distinguish 8859-1 and 8859-15,
> when neither labelling nor heuristics for distinguishing them
> are reliable in the first place. The answer for reliable
> representation of the euro sign is to use UTF-8. And that answer
> was already obvious in 1997.
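
The ambiguity described in the quoted comments can be made concrete
with a minimal Python sketch. It assumes CPython's standard codec
names (iso8859-1, iso8859-15, utf-8); the helper differing_bytes is a
hypothetical name introduced only for this illustration.

```python
def differing_bytes(codec_a, codec_b):
    """Return the byte values that the two 8-bit codecs decode differently."""
    return [
        b for b in range(256)
        if bytes([b]).decode(codec_a) != bytes([b]).decode(codec_b)
    ]


# The non-intersecting part of 8859-1 and 8859-15.
diff = differing_bytes("iso8859-1", "iso8859-15")

# One unlabelled byte, two readings: currency sign or euro sign.
as_latin1 = b"\xa4".decode("iso8859-1")
as_latin9 = b"\xa4".decode("iso8859-15")

# In UTF-8 the euro sign is a three-byte sequence of its own.
euro_utf8 = "\u20ac".encode("utf-8")

print([hex(b) for b in diff])
print(f"0xA4 as 8859-1:  U+{ord(as_latin1):04X}")
print(f"0xA4 as 8859-15: U+{ord(as_latin9):04X}")
print(f"euro in UTF-8:   {euro_utf8.hex()}")
```

Because only a few byte values differ, a receiver cannot tell the two
encodings apart from typical text without a reliable label, which is
the failure reported at the top of the quoted thread.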