Date: Fri Jun 04 2010 - 15:26:45 CDT

"Luke-Jr" <luke@dashjr.org>
> To: unicode@unicode.org
> Cc: "John Dlugosz" <JDlugosz@tradestation.com>, "Otto Stolz" <Otto.Stolz@uni-konstanz.de>
> Subject: Re: Hexadecimal digits
>
> On Friday 04 June 2010 11:55:55 am John Dlugosz wrote:
> > Those things really happen when writing in assembly language. I recall
> > having to write "numbers" that only begin with a decimal digit, so "a
> > fish" is a word, and "0ah fish" is a number. In C and C++, "a" is a word
> > and "0xa" is a number.
>
> But I'm not talking about programming languages, just common everyday uses by
> people who have it as their primary (not secondary) system of numbers.

It's true that we don't need new digits for hexadecimal numbers in
programming, given that for such technical use we already have
unambiguous notations using prefixes like "0x" or "$" or suffixes like
"H".
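To make the point concrete, here is a minimal Python sketch of a reader
for those three notations. The function name parse_hex and the decision
to reject anything without a marker are my own assumptions for
illustration, not the rule of any particular assembler or language.

```python
import string

def parse_hex(token: str) -> int:
    """Parse a hexadecimal literal written as 0x..., $..., or ...H.

    Hypothetical helper: a token with no marker is rejected rather than
    guessed at, which is exactly what keeps "10" unambiguous.
    """
    t = token.strip()
    if t[:2].lower() == "0x":        # C-style prefix
        digits = t[2:]
    elif t.startswith("$"):          # "$" prefix
        digits = t[1:]
    elif t[-1:].lower() == "h":      # "H" suffix, as in many assemblers
        digits = t[:-1]
    else:
        raise ValueError(f"no hexadecimal marker in {token!r}")
    # Accept only plain hexadecimal digits after removing the marker.
    if not digits or any(c not in string.hexdigits for c in digits):
        raise ValueError(f"invalid hexadecimal digits in {token!r}")
    return int(digits, 16)
```

With this sketch, parse_hex("0xa"), parse_hex("$A") and parse_hex("0ah")
all give ten, while a bare "10" is refused instead of being read as
sixteen.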

The real need would arise if we started to count, in our everyday
life, in a binary-based system like hexadecimal: we would still need
to use it unambiguously alongside decimal numbers, so that numbers
written like "10" would still be unambiguously interpreted as ten and
not sixteen. To avoid this problem, we would also need another set of
digits for 0-9. Or we would have to use some additional notation,
such as a diacritic (or a prefix/suffix as in programming, but
how are we supposed to pronounce them?).

The other major problem would be linguistic: to make hexadecimal
convenient, we would also need names other than "ten" and
"twenty", unless we keep their meaning but forbid combining them in
sequences like "twenty one", which would still be interpreted in the
decimal system. So we would need new names for powers of 16, even if
we keep the names we have for 0..9 and possibly more (ten, eleven,
twelve are possible in English, thirteen would probably be
disqualified as a unit name; in French we could keep dix, onze, douze,
treize, quatorze, quinze for the hexadecimal units; all other names
for powers of 10 and their multiples would be disqualified in the new
naming as they would not translate easily into the hexadecimal system).

But then people would have to remember at least the order of magnitude
between numbers named in the decimal system and numbers with the same
numeric value in the hexadecimal system (converting numbers from the
decimal to the hexadecimal system is not trivial beyond some small
range; it is even a complex algorithm to implement on computers for
high-precision numbers).
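As an illustration of why the conversion is not trivial, here is a
small Python sketch of the schoolbook method: repeated long division of
the decimal digit string by 16. The name decimal_to_hex is mine, and
this is only one simple approach under the assumption of a non-negative
integer input; each pass walks every digit, so the cost grows roughly
with the square of the number's length.

```python
HEX_DIGITS = "0123456789ABCDEF"

def decimal_to_hex(decimal: str) -> str:
    """Convert a non-negative decimal digit string to hexadecimal.

    Works digit by digit, without relying on big-integer arithmetic:
    each pass divides the whole number by 16 and keeps the remainder
    as the next hexadecimal digit (least significant first).
    """
    if not (decimal.isascii() and decimal.isdigit()):
        raise ValueError(f"not a decimal digit string: {decimal!r}")
    digits = [int(c) for c in decimal]
    out = []
    while any(digits):
        remainder = 0
        quotient = []
        for d in digits:             # one long division by 16
            cur = remainder * 10 + d
            quotient.append(cur // 16)
            remainder = cur % 16
        out.append(HEX_DIGITS[remainder])
        digits = quotient
    return "".join(reversed(out)) or "0"
```

For example, decimal_to_hex("255") gives "FF" and decimal_to_hex("16")
gives "10", the very string that would be confused with ten.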

But let's imagine that such a new naming system exists: it would come
with its own digits, so they would all be encoded as a complete
separate set of 16 digits (for 0..15 inclusive), and they would
certainly have their own distinctive glyphs. There would be no need
to change the technical notation used in computers, which uses the
existing decimal digits 0..9, letters A..F, and prefixes/suffixes
using another letter or symbol.

So my opinion is then that, if digits were added for hexadecimal
notations, they should all be encoded for the full range 0..15, not
just the range 10..15, and in an unbroken sequence.

In this system, the least significant unit digit (in numbers with
multiple hexadecimal digits) may be freely replaceable by a decimal
digit if it's in 0..9, without creating confusion, for practical
reasons (0 and 1 are too commonly used every day and should not
require any conversion between the two numeric systems).

Implementers could then imagine the distinctive glyphs associated with
those digits, but the glyphs should remain simple, as narrow as
existing decimal digits, and clearly distinct from existing digits or
letters in all major scripts already using the existing digits
(otherwise this system will fail immediately and will be even less
convenient than the technical notations like "0x..." used in
programming). The glyphs should also be mnemonic according to their
value. That's why I suggested a glyph shape close to the shape of
decimal digits, possibly using some diacritic like a top bar (which
could connect with the bar of the surrounding digits, so that it is
drawn really fast, but this may create confusion with some existing
notations used in maths or engineering).

Another solution would be to borrow, for use in Latin, the first 16
glyphs of an alphabet from some other very different script (and for
that script to borrow the glyphs of the Latin alphabet), but this
would require two sets of hexadecimal digits that would have to be
encoded separately, and it is unlikely that scripts other than Latin
will borrow Latin letters for that usage, given that they frequently
borrow words written in the Latin script (notably trademarks, and on
official documents like passports, or when transliterating people

The glyphs would also need to be as fast to draw and easy to read as
existing digits and letters (using a single stroke, or two very simple
strokes).

For now, there's no such coherent set of distinct glyphs for
hexadecimal digits, and a hexadecimal system in everyday use is still
very far from reaching an international agreement (imagine the
tremendous consequences and problems for pricing and measurements...).
It would be extremely costly to change our everyday worldwide use of
the positional decimal system (much more difficult than changing
currencies).

There is nonetheless a limited but growing use of the hexadecimal
system, for which new digit shapes could be more convenient than using
disambiguating prefixes/suffixes.

But before that, we would still first need to invent and use new names
for powers of sixteen, and a rational way to name reasonably large
numbers in this system (at least up to 64 bits), including for
fractions of unity. This has already started for the metric units used
in the computing industry, with the adoption of binary-based prefixes
for unit names (kibi, mebi, gibi, ...) instead of the 10-based
prefixes (kilo, mega, giga, ...), and the new recommendation of
abbreviated symbols for these prefixes for multiples/submultiples
(appending a lowercase "i" after the initial: "Ki, Mi, Gi..."
instead of just "k, M, G...").
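The difference between the two families of prefixes can be shown with
a short Python sketch. The function name format_bytes, the two-decimal
rounding and the choice to stop at tebi/tera are assumptions made only
for this example.

```python
def format_bytes(n: int, binary: bool = True) -> str:
    """Format a byte count with binary (Ki, Mi, ...) or decimal (k, M, ...) prefixes."""
    base = 1024 if binary else 1000
    symbols = ["Ki", "Mi", "Gi", "Ti"] if binary else ["k", "M", "G", "T"]
    value = float(n)
    unit = ""
    for symbol in symbols:
        if value < base:             # small enough for the current prefix
            break
        value /= base
        unit = symbol
    return f"{value:.2f} {unit}B"
```

The same quantity of 1024 bytes is then written "1.00 KiB" with the
binary prefix and "1.02 kB" with the decimal one, which is the
ambiguity the new prefixes are meant to remove.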

Note that these binary-based metric prefixes have been documented for
a long time now, and are well known by a growing population. But their
adoption is still far from complete, and we still hear "kilobytes"
instead of "kibibytes": this is a clear sign that the need is not