From: Andrew S (asunic@mail.ru)
Date: Thu Oct 27 2005 - 04:23:17 CST
Kenneth Whistler wrote:
> The problem isn't that existing software would break, but rather
> that it would be then gradually forced (and inconsistently and
> asynchronously) to deal with the addition of these 6 digits
> that behave differently than all those processes are currently
> handling hexadecimal expressions. Most software simply wouldn't
> change, but you would have opened the dike to the drip, drip,
> drip of people wanting to use the new digits because they
> "fix" hexadecimal numbers, and filing bugs and badgering
> customer support because your software doesn't "support"
> Unicode correctly.
In general, this is an argument against Unicode itself. What about, for example, the new apostrophes, dashes, slashes, etc. that might or might not look identical to the preexisting ASCII characters? Do they open the dike to the drip, drip of people who want to use, say, the minus sign instead of hyphen-minus as the minus symbol in their C programs, and who complain when the compiler doesn't support it?
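As a rough illustration of the hyphen-minus case (a Python sketch, not taken from the proposal or from any C compiler): U+002D HYPHEN-MINUS and U+2212 MINUS SIGN are distinct code points, and Python's built-in int() accepts only the ASCII one as a sign. The helper name parse_signed is made up for this example.

```python
import unicodedata

HYPHEN_MINUS = "\u002d"  # the ASCII character used as minus in most languages
MINUS_SIGN = "\u2212"    # the dedicated Unicode minus sign


def parse_signed(text):
    """Return int(text), or None if the built-in parser rejects the string."""
    try:
        return int(text)
    except ValueError:
        return None


# The two characters can look alike but carry different names and code points.
print(unicodedata.name(HYPHEN_MINUS))
print(unicodedata.name(MINUS_SIGN))

# Only the ASCII hyphen-minus is treated as a sign by int().
print(parse_signed(HYPHEN_MINUS + "5"))
print(parse_signed(MINUS_SIGN + "5"))
```

Software that wanted to accept the dedicated minus sign would have to normalize it first, which is the same kind of incremental support burden described in the quoted paragraph.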
> It doesn't make any sense for hexadecimal digits, if they are
> really *numbers*, not letters, to have case pairs.
True. And the proposal does recommend against introducing case pairs for the new digits.
> Now let's say I want to represent the number 43,852 using the
> new characters. That would end up being "@#4$", and wouldn't
> require the "markup" of "0x" mentioned by the OP, because it
> contains only digits, and no letters. (Actually, not even
> that is correct, because in principle it could also be a
> radix 13, 14, or 15 number, as well as a radix 16 number,
> but that aside. ... )
I did acknowledge in my original post that the proposed new digits would serve only to distinguish letters from digits, and that in general numbers would still have to be marked up in order to distinguish radices.
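The radix point can be checked with a short Python sketch (an illustration only; it uses the Latin-letter spelling "AB4C" from the quoted example, since the proposed digits do not exist). The same digit string is a valid number in every radix from 13 through 16, so dedicated digit characters alone would not tell a reader which radix is meant.

```python
DIGITS = "AB4C"

# The smallest usable radix is one more than the largest digit value.
# int(ch, 36) maps "0"-"9" to 0-9 and "A"-"Z" to 10-35.
largest_digit = max(int(ch, 36) for ch in DIGITS)
min_radix = largest_digit + 1

# Interpret the same string in every radix from the minimum up to 16.
readings = {radix: int(DIGITS, radix) for radix in range(min_radix, 17)}

for radix, value in readings.items():
    print(f"radix {radix}: {value}")
```

Only the radix 16 reading gives the 43,852 of the quoted example; the other readings are different numbers spelled with the same digits.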
> The issue now is that I have a
> formatting and display problem that I didn't have before, because
> I need to be able to display "@#4$" as either "AB4C" or
> "ab4c", depending on style.
The proposal is that the number be displayed neither as "AB4C" nor as "ab4c", but as "@#4$". The proposal is for only one display style, not two, for the same reason that U+0030 through U+0039 have only one display style. The new digits are proposed to have their own glyphs rather than be converted to Latin letters prior to display, for the same reason that "0" has its own glyph and isn't converted to "O" prior to display. The proposal cites as one advantage that the new digits can then always be properly monospaced, even in variable-width fonts, just as U+0030 through U+0039 are.
> Either I artificially introduce
> *another* casing distinction into my brand spanking new
> hexadecimal digit characters, or I have introduced a *new*
> style markup problem into my hexadecimal digit display that
> I didn't have before.
That assumes that you intend to preserve the option of two display styles even for the new digits. The proposal is that this option not be preserved for the new digits, since it is simply an unnecessary artifact of the use of Latin letters as digits, in the same way that the option of using "O" or "o" for zero would be an unnecessary artifact today if historically there had been no dedicated "0" digit.
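A small Python sketch of the status quo may help here (an illustration under the current encoding, not part of the proposal): the upper-case and lower-case spellings are two display styles of one value, and the case distinction exists only because the digits above 9 are borrowed Latin letters, which Unicode classifies as letters rather than decimal digits.

```python
import unicodedata

value = 43852

# Two display styles for the same number under the current convention.
upper = format(value, "X")
lower = format(value, "x")
print(upper, lower)

# Both spellings parse back to the same value.
print(int(upper, 16) == int(lower, 16) == value)

# General categories: "Nd" is a decimal digit, "Lu"/"Ll" are cased letters.
for ch in "4Aa":
    print(ch, unicodedata.category(ch))

# Decimal digits have no case mapping, so they have only one style.
print("0123456789".upper() == "0123456789".lower())
```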
> And on and on... I haven't even started on the apoplectic
> fits that would be thrown by security people were Unicode
> to introduce identical-looking clones for 6 ASCII letters,
> claiming that they were *only* hexadecimal digits.
Doesn't this also apply to the new minus sign?
> What we had here was essentially a case of well-intentioned
> but ill-advised systematizing by a rather eccentric proposal
> writer, without a clue as to what the actual impact would
> be on existing systems were anybody to actually attempt to
> support it in any way.
The proposal writer did demonstrate possession of a clue with regard to interaction with contemporary standards, by recommending that the existing Latin letters be compatibility characters (I'm not sure whether this is the correct terminology) for the new hex digits, by correctly identifying various costs of disunification, etc.
The proposal does suffer from various minor grammatical and spelling errors, but it's apparent from the writer's name and email address (and stated place of birth) that his native language probably isn't English, and in any case the proposal is rationally argued and properly presented, and fully and relevantly answers all of the questions on the proposal form.
I don't know in what sense the writer is "eccentric", as the proposal in question is the only proposal (and in fact the only writing at all) of his that I've seen, and I don't see in it any irrationality or anything that I would consider eccentricity.
> Furthermore, it was completely
> unmotivated, because it failed to demonstrate that anybody
> is actually suffering in the handling of hexadecimal numeric
> expressions encoded as they currently are -- and have been
> for decades.
The proposal writer did cite the present problem of unaligned hexadecimal number display in variable-width fonts, and even included screenshots of Microsoft programs actually suffering from this problem. This particular problem is probably not widely considered to be significant, but it is a valid motivation cited by the proposal writer.
In any case your argument also applies to the use of hyphen-minus as a minus symbol.