Re: Hexadecimal digits

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Fri Jun 04 2010 - 15:26:45 CDT

    "Luke-Jr" <luke@dashjr.org>
    > To: unicode@unicode.org
    > Cc: "John Dlugosz" <JDlugosz@tradestation.com>, "Otto Stolz" <Otto.Stolz@uni-konstanz.de>
    > Subject: Re: Hexadecimal digits
    >
    > On Friday 04 June 2010 11:55:55 am John Dlugosz wrote:
    > > Those things really happen when writing in assembly language. I recall
    > > having to write "numbers" that only begin with a decimal digit, so "a
    > > fish" is a word, and "0ah fish" is a number. In C and C++, "a" is a word
    > > and "0xa" is a number.
    >
    > But I'm not talking about programming languages, just common everyday uses by
    > people who have it as their primary (not secondary) system of numbers.

    It's true that we don't need new digits for hexadecimal numbers for
    programming, given that for such technical use, we already have
    unambiguous notations using prefixes like "0x" or "$" or suffixes like
    "H".

    The real need would arise if we started to count, in our everyday
    life, in a binary-based system like hexadecimal: we would still need
    to use it unambiguously alongside decimal numbers, so that numbers
    written like "10" would remain unambiguously interpreted as ten and
    not sixteen. To avoid this problem, we would also need another set of
    digits for 0-9. Or we would have to use some additional notation,
    such as a diacritic (or a prefix/suffix as in programming, but how
    are we supposed to pronounce them?).
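
    To make the ambiguity concrete, here is a minimal Python sketch of
    how the technical notations mentioned in this thread ("0x", "$", and
    an "h"/"H" suffix) settle the question for a program. The function
    name parse_number is hypothetical, and real assemblers and compilers
    each have their own stricter rules:

```python
def parse_number(text: str) -> int:
    """Parse a numeral as decimal unless it is explicitly marked as hexadecimal."""
    t = text.strip()
    if t[:2].lower() == "0x":      # C-style prefix, e.g. "0x10"
        return int(t[2:], 16)
    if t.startswith("$"):          # "$" prefix, e.g. "$10"
        return int(t[1:], 16)
    if t[-1:].lower() == "h":      # "h"/"H" suffix, e.g. "0ah" or "10H"
        return int(t[:-1], 16)
    return int(t, 10)              # unmarked digits: decimal by convention


for s in ["10", "0x10", "$10", "10H", "0ah"]:
    print(s, "->", parse_number(s))
```

    Without a marker, "10" can only be read by convention, which is
    exactly the problem a handwritten or spoken number would have.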

    The other major problem would be linguistic: to make hexadecimal
    convenient, we would also need names other than "ten" and "twenty",
    unless we keep their meaning but forbid combining them in sequences
    like "twenty one", which would still be interpreted in the decimal
    system. So we would need new names for powers of 16, even if we keep
    the names we have for 0..9 and possibly more (ten, eleven and twelve
    are possible in English; thirteen would probably be disqualified as a
    unit name; in French we could keep dix, onze, douze, treize,
    quatorze, quinze for the hexadecimal units; all other names for
    powers of 10 and their multiples would be disqualified in the new
    naming, as they would not translate easily into the hexadecimal
    system).

    But then people would have to remember at least the order of magnitude
    relating numbers named in the decimal system and numbers with the same
    numeric value in the hexadecimal system (converting numbers from
    decimal to hexadecimal is not trivial beyond some small range; it is
    even a complex algorithm to implement on computers for high-precision
    numbers).
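
    As an illustration of what that conversion involves, here is a
    minimal Python sketch of the schoolbook method (repeated division by
    16). The name to_hex is hypothetical. Python integers are
    arbitrary-precision, so the hard part for high-precision numbers
    (the long division itself) is hidden inside divmod here:

```python
HEX_DIGITS = "0123456789ABCDEF"


def to_hex(n: int) -> str:
    """Convert a non-negative integer to hexadecimal by repeated division by 16."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 16)   # peel off the least significant digit
        digits.append(HEX_DIGITS[remainder])
    return "".join(reversed(digits))   # digits were produced in reverse order


print(to_hex(10), to_hex(16), to_hex(255))
```

    Each digit costs one division of the whole number, which is easy for
    a machine but not something people do in their heads beyond a small
    range.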

    But let's imagine that such a new naming system exists and comes
    with its own digits; they would then all be encoded as a complete,
    separate set of 16 digits (for the values 0..15 inclusive), and they
    would certainly have their own distinctive glyphs. There would be no
    need to change the technical notation used in computers, which uses
    the existing decimal digits 0..9, the letters A..F, and
    prefixes/suffixes using another letter or symbol.

    So my opinion is then that, if digits were added for hexadecimal
    notations, they should all be encoded for the full range 0..15, not
    just the range 10..15, and in an unbroken sequence.
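
    One practical reason for an unbroken sequence can be sketched in
    Python. The code points below are purely hypothetical: U+E000..U+E00F
    are Private Use Area positions picked only for illustration, and no
    such digits are encoded in Unicode. The function names are likewise
    made up for this example:

```python
# Hypothetical base code point (Private Use Area), for illustration only.
HYPOTHETICAL_BASE = 0xE000


def value_contiguous(ch: str) -> int:
    """Digit value when all 16 digits sit in one unbroken range: one subtraction."""
    offset = ord(ch) - HYPOTHETICAL_BASE
    if not 0 <= offset <= 15:
        raise ValueError("not one of the 16 hypothetical digits")
    return offset


def value_ascii(ch: str) -> int:
    """Digit value in today's notation: three disjoint ranges must be tested."""
    if "0" <= ch <= "9":
        return ord(ch) - ord("0")
    if "A" <= ch <= "F":
        return ord(ch) - ord("A") + 10
    if "a" <= ch <= "f":
        return ord(ch) - ord("a") + 10
    raise ValueError("not a hexadecimal digit")


print(value_contiguous(chr(0xE00A)), value_ascii("A"))
```

    With a contiguous block, sorting, range checks and value lookup all
    reduce to simple arithmetic on the code point.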

    In this system, the least significant unit digit (in numbers with
    multiple hexadecimal digits) may be freely replaceable by a decimal
    digit if it's in 0..9, without creating confusion, for practical
    reasons (0 and 1 are too commonly used every day and should not
    require any conversion between the two numeric systems).

    Implementers could then imagine the distinctive glyphs associated with
    those digits, but the glyphs should remain simple, as narrow as
    existing decimal digits, and clearly distinct from existing digits or
    letters in all major scripts already using the existing digits
    (otherwise this system would fail immediately and would be even less
    convenient than the technical notations like "0x..." used in
    programming). The glyphs should also be mnemonic of their value
    (that's why I suggested glyph shapes close to the shapes of the
    decimal digits, possibly with some diacritic like a top bar, which
    could connect with the bar of the surrounding digits so that it is
    drawn really fast, although this may create confusion with some
    existing notations used in maths or engineering).

    Another solution would be to borrow for use in Latin the first 16
    glyphs of the alphabet of some very different script (and for that
    script to borrow the glyphs of the Latin alphabet), but this would
    require two sets of hexadecimal digits, which would have to be
    encoded separately, and it is unlikely that scripts other than Latin
    would borrow Latin letters for that usage, given that they frequently
    borrow words written in the Latin script (notably trademarks, and on
    official documents like passports, or when transliterating personal
    names and postal addresses).

    The glyphs would also need to be as fast to draw and easy to read as
    existing digits and letters (using a single stroke, or two very simple
    strokes).

    For now, there is no such coherent set of distinct glyphs for
    hexadecimal numbers. Everyday use of the hexadecimal system is still
    very far from reaching any international agreement (imagine the
    tremendous consequences and problems for pricing and
    measurements...). It would be extremely costly to change our everyday
    worldwide use of the positional decimal system (much more difficult
    than changing currencies).

    But there is in any case a limited but growing use of the hexadecimal
    system, for which new digit shapes could be more convenient than
    disambiguating prefixes/suffixes.

    But before that, we would first need to invent and use new names
    for powers of sixteen, and a rational way to name reasonably large
    numbers in this system (at least up to 64 bits), including
    fractions of unity. Something similar has already started for the
    units used in the computing industry, with the adoption of
    binary-based prefixes for unit names (kibi, mebi, gibi, ...) instead
    of the 10-based prefixes (kilo, mega, giga, ...), and the new
    recommendation of abbreviated symbols for these prefixes (appending
    a lowercase "i" after the initial: "Ki", "Mi", "Gi", ... instead of
    just "k", "M", "G", ...).
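
    The difference between the two families of prefixes can be sketched
    in a few lines of Python. The function name format_bytes and the
    two-decimal output format are assumptions made for this example
    only:

```python
BINARY_PREFIXES = ["", "Ki", "Mi", "Gi", "Ti"]   # successive powers of 1024
DECIMAL_PREFIXES = ["", "k", "M", "G", "T"]      # successive powers of 1000


def format_bytes(n: int, binary: bool = True) -> str:
    """Format a byte count with binary (KiB, MiB, ...) or decimal (kB, MB, ...) prefixes."""
    base = 1024 if binary else 1000
    prefixes = BINARY_PREFIXES if binary else DECIMAL_PREFIXES
    value = float(n)
    index = 0
    # Divide by the base until the value fits under the next prefix.
    while value >= base and index < len(prefixes) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {prefixes[index]}B"


# 0x10000 bytes is a round number in hexadecimal and in binary prefixes,
# but not in decimal ones.
print(format_bytes(0x10000))
print(format_bytes(0x10000, binary=False))
```

    Quantities that are round in hexadecimal stay round with the binary
    prefixes, which is why the two conventions go together naturally.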

    Note that these binary-based prefixes have been documented for a
    long time now, and are well known by a growing population. But their
    adoption is still far from complete, and we still hear "kilobytes"
    instead of "kibibytes": this is a clear sign that the need is not
    urgent, and that adopting new glyphs and digits, after adopting new
    quantity names for powers of 16, is still very premature, because
    this usage is still perceived as technical and limited in its scope
    of application. And for such technical use, most often in computer
    technologies, the existing computer language notation (with prefixes
    like "0x") is quite sufficient and does not require any new set of
    digits.


