On 1997-07-18 John Cowan <cowan@ccil.org> wrote:
> When preparing a comparison table of the various 8859-x parts,
> I noticed an odd property of partial consistency between 8859-1-2-3-4.
> Any character encoded by more than one coded character set is
> always encoded at the same codepoint. (8859-9 does not have
> this property; its Turkish letters don't agree with the 8859-3
> encoding.)
>
> [examples]
>
> What I'm wondering is whether this property was carefully designed
> into 8859-1-2-3-4 when they were specified, or whether it is more or
> less an accident of copying.
>
> Does anyone know?
I don't know for certain, but I don't think it's accidental. Consider these
correspondences between ASCII and 8859-1, where each 8859-1 code is the ASCII
code plus 128:
ASCII          8859-1
 32  (space)   160  Non-breaking space
 33  !         161  Inverted exclamation
 35  #         163  Pound sterling
 36  $         164  General currency sign
 45  -         173  Soft hyphen
 48  0         176  Degree sign
 50  2         178  Superscript two
 51  3         179  Superscript three
 63  ?         191  Inverted question mark
(from http://www.pemberley.com/janeinfo/latin1.html)
(My Telnet program does not cut-and-paste 8859-1 properly, so I've deleted
the right-hand column.)
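The pairing in the table above can be checked mechanically. The following is only a minimal sketch: the helper name latin1_partner is made up for illustration, and it relies on the fact that 8859-1 code points coincide with the first 256 Unicode code points, so Python's unicodedata module can supply the names (and, in effect, the deleted right-hand column).

```python
import unicodedata

# ASCII codes taken from the table above.
ASCII_CODES = [32, 33, 35, 36, 45, 48, 50, 51, 63]

def latin1_partner(code):
    """Return the 8859-1 code that shares its lower 7 bits with `code`.

    Hypothetical helper: setting bit 7 is the same as adding 128
    for any 7-bit ASCII code.
    """
    return code | 0x80

for code in ASCII_CODES:
    partner = latin1_partner(code)
    # chr() works here because 8859-1 maps one-to-one onto U+0000..U+00FF.
    print(code, repr(chr(code)), partner, unicodedata.name(chr(partner)))
```

The names printed are the Unicode character names, so they differ slightly in wording from the descriptions in the table.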
Notice, by the way, that the number sign (35) and the pound sterling sign (163)
have the same lower 7 bits. This is probably why I often see amounts of money
written as #19.95 here in the UK (electronically, that is). I think this is a
more likely explanation than the fact that both are called "pound", although
that probably has some effect too.
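To illustrate how a price could turn into #19.95, here is a small sketch that assumes some link in the chain (a mail gateway or terminal, say) clears the eighth bit of every byte. That assumption is only the guess made above, not a confirmed cause, and strip_high_bit is a hypothetical name.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every byte, as an assumed 7-bit-only channel would."""
    return bytes(b & 0x7F for b in data)

# The pound sign is byte 0xA3 (163) in 8859-1.
original = "£19.95".encode("latin-1")

# After stripping, 0xA3 becomes 0x23 (35), which is '#' in ASCII.
mangled = strip_high_bit(original).decode("ascii")
print(mangled)  # prints "#19.95"
```

The digits and the full stop are already 7-bit characters, so only the currency sign changes.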
<sigh>
-- Daniel B
e'ocai ko sarji la lojban