Re: Partial consistency in 8859-x?

From: Daniel B (djad022@uce.ac.uk)
Date: Fri Jul 18 1997 - 14:41:17 EDT


On 1997-07-18 John Cowan <cowan@ccil.org> wrote:

> When preparing a comparison table of the various 8859-x parts,
> I noticed an odd property of partial consistency between 8859-1-2-3-4.
> Any character encoded by more than one coded character set is
> always encoded at the same codepoint. (8859-9 does not have
> this property; its Turkish letters don't agree with the 8859-3
> encoding.)
>
> [examples]
>
> What I'm wondering is whether this property was carefully designed
> into 8859-1-2-3-4 when they were specified, or whether it is more or
> less an accident of copying.
>
> Does anyone know?
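
For anyone who wants to test the property mechanically, here is a minimal
Python sketch. It assumes that Python's bundled codecs (iso8859_1 through
iso8859_4, plus iso8859_9) reflect the published parts; the helper names
decode_table and conflicts are made up for illustration. Only the upper half
(0xA0-0xFF) is compared, since the lower half is ASCII in every part.

```python
import itertools

def decode_table(codec):
    """Map each decodable byte in 0xA0-0xFF to its character in `codec`."""
    table = {}
    for byte in range(0xA0, 0x100):
        try:
            table[byte] = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            pass  # position is unassigned in this part
    return table

def conflicts(codec_a, codec_b):
    """Characters encoded by both parts, but at different codepoints.

    Returns {char: (byte_in_a, byte_in_b)}.
    """
    pos_a = {ch: byte for byte, ch in decode_table(codec_a).items()}
    pos_b = {ch: byte for byte, ch in decode_table(codec_b).items()}
    shared = pos_a.keys() & pos_b.keys()
    return {ch: (pos_a[ch], pos_b[ch]) for ch in shared if pos_a[ch] != pos_b[ch]}

if __name__ == "__main__":
    parts = ["iso8859_1", "iso8859_2", "iso8859_3", "iso8859_4"]
    pairs = list(itertools.combinations(parts, 2)) + [("iso8859_3", "iso8859_9")]
    for a, b in pairs:
        found = conflicts(a, b)
        # Print codepoints rather than glyphs so the output survives 7-bit terminals.
        listing = ", ".join(
            "U+%04X at 0x%02X vs 0x%02X" % (ord(ch), x, y)
            for ch, (x, y) in sorted(found.items())
        )
        print(a, b, listing or "no conflicts")
```

An empty result for a pair means every shared character sits at the same
codepoint in both parts; the 8859-3 / 8859-9 pair is included to show the
Turkish letters mentioned above.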

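I don't know for certain, but I don't think it's accidental. Consider these
correspondences between ASCII and 8859-1, where each 8859-1 code is the
ASCII code plus 128 (that is, the same 7 bits with the high bit set):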

 ASCII        8859-1
 32  (space)  160  Non-breaking space
 33  !        161  Inverted exclamation
 35  #        163  Pound sterling
 36  $        164  General currency sign
 45  -        173  Soft hyphen
 48  0        176  Degree sign
 50  2        178  Superscript two
 51  3        179  Superscript three
 63  ?        191  Inverted question mark

(from http://www.pemberley.com/janeinfo/latin1.html)

(My Telnet program does not cut-and-paste 8859-1 properly, so I've deleted
the right-hand column.)
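
The pairs in the table can be checked with a few lines of Python. This is
only a sketch: the PAIRS list is copied from the table above, the helper
names are hypothetical, and the names printed are the Unicode character
names rather than the ones used on the page cited.

```python
import unicodedata

# (ASCII code, 8859-1 code) pairs taken from the table above.
PAIRS = [(32, 160), (33, 161), (35, 163), (36, 164), (45, 173),
         (48, 176), (50, 178), (51, 179), (63, 191)]

def high_bit_partner(code):
    """Return the code with the eighth bit set (same lower 7 bits)."""
    return code | 0x80

def describe(code):
    """Unicode name of the character that 8859-1 assigns to `code`."""
    return unicodedata.name(bytes([code]).decode("iso8859_1"))

if __name__ == "__main__":
    for ascii_code, latin1_code in PAIRS:
        assert high_bit_partner(ascii_code) == latin1_code
        print(ascii_code, describe(ascii_code), "->",
              latin1_code, describe(latin1_code))
```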

Notice, by the way, that the number sign (35) and the pound sterling sign
(163) share the lower 7 bits, so a pound sign that loses its eighth bit
somewhere in transit comes out as "#". This is probably why I often see
amounts of money written as #19.95 here in the UK (electronically, that is).
I think this is a more likely explanation than the fact that both signs are
called "pound", although that probably has some effect too.
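
To illustrate the bit-stripping explanation, here is a small Python sketch.
It assumes the damage is done by something that simply clears the eighth bit
of every byte (a 7-bit mail path, say); that mechanism is my guess, and
strip_high_bit is a made-up helper, not part of any real mail software.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 8 of every byte, as a 7-bit channel is assumed to do."""
    return bytes(b & 0x7F for b in data)

if __name__ == "__main__":
    # "\u00a3" is the pound sterling sign, byte 0xA3 (163) in 8859-1.
    price = "\u00a319.95".encode("iso8859_1")
    # 0xA3 & 0x7F == 0x23, which is "#" in ASCII; the digits are unchanged.
    print(strip_high_bit(price).decode("ascii"))
```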

<sigh>

-- Daniel B
 
e'ocai ko sarji la lojban


