From: Kenneth Whistler (kenw@sybase.com)
Date: Thu Jan 25 2007 - 16:50:17 CST
Jukka said:
> The identity of U+00BA and U+00AA is somewhat vague. Their names suggest
> very specific usage. If they are meant for more general use, I think notes
> about this should be added, perhaps to the code chart, perhaps to the list
> of misleading character names. If, on the other hand, their intended usage
> is limited as suggested by their names, I think this should be mentioned
> too, in the standard.
Don't lose sight of the fact that U+0020..U+007E, U+00A0..U+00FF
can be considered ISO/IEC 8859-1 compatibility characters.
The identity of U+00BA and U+00AA is *exactly* what they were/are
in Latin-1, because that is where they came from. That is also
where the *names* came from.
ISO/IEC 8859-1 says precisely zilch about U+00BA and U+00AA (and
never has said anything about them), other than what you could
infer from the stated intent for language coverage, to include
Spanish and Portuguese.
So in practice, U+00BA and U+00AA are whatever 0xBA and 0xAA
in ISO/IEC 8859-1 and the corresponding code points in
Windows 1252 (the two most widely implemented 8-bit character
encodings) have been used for over the past 20-some years. (And if
you want, you can dig into the history of character encoding
to find the pre-existing character encodings that 8859-1 itself
got them from.)
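The byte-to-code-point correspondence described above can be checked
with a small sketch. It assumes Python's built-in "iso-8859-1" and
"cp1252" codecs; the helper name decode_byte is made up for
illustration and is not from any standard.

```python
# Sketch: show that bytes 0xAA and 0xBA map to U+00AA and U+00BA
# in both ISO/IEC 8859-1 and Windows 1252.
# decode_byte is a hypothetical helper, named here only for illustration.

EXPECTED = {0xAA: "\u00AA", 0xBA: "\u00BA"}
ENCODINGS = ("iso-8859-1", "cp1252")


def decode_byte(value, encoding):
    """Decode a single byte value using the given 8-bit encoding."""
    return bytes([value]).decode(encoding)


for byte_value, expected in EXPECTED.items():
    for encoding in ENCODINGS:
        ch = decode_byte(byte_value, encoding)
        status = "matches" if ch == expected else "differs"
        print(f"0x{byte_value:02X} in {encoding}: U+{ord(ch):04X} ({status})")
```

Both encodings should agree on these two positions, which is why
text written in either one round-trips to the same two Unicode
characters.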
So unless folks are all hung up and confused about what 0xBA and
0xAA are used for in the continuing widespread usage of
8859-1 and Windows 1252, I don't see any great need for
annotating them further in the Unicode Standard.
I suppose the issue is that there are so many *other* things
to contrast them with in the Unicode Standard. So to get
everything on the table, folks need to be considering not
only:
00AA;FEMININE ORDINAL INDICATOR;Ll;0;L;<super> 0061;;;;N;;;;;
00BA;MASCULINE ORDINAL INDICATOR;Ll;0;L;<super> 006F;;;;N;;;;;
but also:
1D43;MODIFIER LETTER SMALL A;Lm;0;L;<super> 0061;;;;N;;;;;
1D52;MODIFIER LETTER SMALL O;Lm;0;L;<super> 006F;;;;N;;;;;
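To see the overlap between these four entries, here is a rough
sketch using Python's standard unicodedata module. The helper name
describe is an assumption made for illustration. The sketch leaves
out the general category on purpose, since the value a current
library reports for U+00AA and U+00BA may not match the Ll shown in
the listing above.

```python
import unicodedata

# The four code points under discussion.
CODE_POINTS = [0x00AA, 0x00BA, 0x1D43, 0x1D52]


def describe(cp):
    """Return (character name, decomposition field) for a code point.

    describe is a hypothetical helper for this sketch only.
    """
    ch = chr(cp)
    return unicodedata.name(ch), unicodedata.decomposition(ch)


for cp in CODE_POINTS:
    name, decomposition = describe(cp)
    print(f"{cp:04X};{name};{decomposition}")
```

The ordinal indicators and the modifier letters carry the same
<super> compatibility decompositions, so that field alone does not
tell them apart; only the names and the intended usage do.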
If you are writing traditional Spanish and Portuguese
abbreviations, you'd use the same characters that you'd write
the same text with in 8859-1, namely U+00AA and U+00BA.
If you are writing UPA transcriptions, with superscript
modifier letters, and you don't want to be at the mercy of
font design for what style of "a" is used or whether the
"a" or "o" is displayed with underscores, etc., then you
would use U+1D43 and U+1D52.
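One practical consequence of that choice, sketched under a few
assumptions: the sample strings and the helper fits_latin1 are
invented for illustration, and the UPA fragment is not a real
transcription. The ordinal indicators survive a trip through 8859-1,
the modifier letters do not, and compatibility normalization (NFKC)
folds both to a plain "a" or "o".

```python
import unicodedata

# Illustrative strings only (assumptions, not from any real corpus).
ordinal_text = "1.\u00AA"   # abbreviation using FEMININE ORDINAL INDICATOR
upa_text = "t\u1D43"        # made-up fragment using MODIFIER LETTER SMALL A


def fits_latin1(text):
    """Return True if every character of text is encodable in ISO/IEC 8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


for label, text in (("ordinal", ordinal_text), ("UPA", upa_text)):
    folded = unicodedata.normalize("NFKC", text)
    print(f"{label}: fits 8859-1 = {fits_latin1(text)}, NFKC = {folded!r}")
```

So if text has to interchange with 8859-1 or Windows 1252 data,
U+00AA and U+00BA are the only ones of the four that will make it
through, and any process that applies NFKC loses the distinction
altogether.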
--Ken