Re: [OT] ASCII support in C/C++ (was: doubt)

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Jan 11 2004 - 17:07:26 EST


    ----- Original Message -----
    From: "Hallvard B Furuseth" <h.b.furuseth@usit.uio.no>
    To: "Philippe Verdy" <verdy_p@wanadoo.fr>
    Cc: "Clark Cox" <clarkcox3@mac.com>; "Unicode Mailing List"
    <unicode@unicode.org>
    Sent: Sunday, January 11, 2004 8:18 PM
    Subject: Re: [OT] ASCII support in C/C++ (was: doubt)

    > Philippe Verdy writes:
    > >From: "Clark Cox" <clarkcox3@mac.com>
    > >> Actually, both the C and C++ standards require that the char type be
    > >> at least 8-bits. that is, the signed char type must be able to
    > >> represent the values in the range [-127, 127], and the unsigned char
    > >> type must be able to represent the values in the range [0, 255]. Any C
    > >> or C++ compiler that cannot meet those requirements is non-conformant.
    > >
    > > Yes of course (however this depends on which standard you discuss
    here...
    >
    > No, it doesn't.
    >
    > > The language itself does not require it, just the implementation
    guidelines
    > > for applications on generic OS.
    >
    > The C and C++ languages are defined by the C and C++ standards. As
    > Clark says, the standards do require this. See for example ISO C
    > section 5.2.4.2.1 (Sizes of integer types <limits.h>).
    >
    > > If you look at some C compilers created for microcontrolers or hardware
    > > devices, you'll see that it supports the full core language,
    >
    > If it does, it has 8-bit 'char' or wider. Otherwise it is not a C
    > compiler, however much it might claim to be. It is a compiler for a
    > language _ressembling_ C.

    All this relates to the language as it was standardized, rather late, by
    ISO and initially by ANSI (in collaboration with the initial designers
    Kernighan and Ritchie, who designed the language to write Unix). There is
    still a lot of code that needs support for the K&R C language, which is a
    de facto (rather than de jure) standard, as specified in the first edition
    of "The C Programming Language" by Brian Kernighan & Dennis Ritchie
    (Prentice-Hall, 1978) and translated into other languages (1983 for the
    French edition).

    There are still a lot of systems that support ONLY a K&R C compliant
    compiler (without "void", "signed char", "long long", and function
    prototypes) and not the American ANSI C standard or the later ISO C
    standard. Most of these systems also lack what is required to support
    POSIX. There are also many C++ compilers that were written and used on
    systems long before the ISO C standard was published, and that still do
    not implement the full ANSI C standard.
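
    To make the difference concrete, here is a minimal sketch of the usual way
    a single source file is kept acceptable to both kinds of compiler. It
    assumes that a K&R compiler does not define __STDC__, which is the common
    convention but not something a pre-ANSI compiler guarantees. The names
    PROTO and add_values are hypothetical and chosen only for illustration.

```c
/* PROTO is a hypothetical helper macro, not part of any standard.
   Assumption: only ANSI/ISO compilers define __STDC__. */
#ifdef __STDC__
#define PROTO(args) args   /* ANSI/ISO: keep the full parameter list */
#else
#define PROTO(args) ()     /* K&R: declarations carry no parameter types */
#endif

/* Expands to "int add_values(int a, int b);" or "int add_values();". */
int add_values PROTO((int a, int b));

#ifdef __STDC__
int add_values(int a, int b)   /* prototype-style definition */
#else
int add_values(a, b)           /* K&R-style definition */
    int a;
    int b;
#endif
{
    return a + b;
}
```

    With an ANSI compiler a wrong argument count in a call to add_values() is
    diagnosed at compile time; with the K&R branch it is not checked at all.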

    Likewise, not all platforms fully support IEEE-compliant floating-point
    operations (because there is no FPU, and implementing them fully in
    software would cost too much performance). So the POSIX and ISO C
    requirements cannot be applied to these systems. Note that even on PC
    systems the FPU is not always fully IEEE-compliant; its deficiencies are
    compensated for by the mathematical libraries, or by the underlying OS if
    it can "patch" the code on the fly by emulating the way some instructions
    are computed.
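
    A program can at least ask the compiler what it claims. The sketch below
    assumes a C99 compiler, since the __STDC_IEC_559__ macro does not exist in
    earlier standards; the function names are hypothetical. Note that a
    matching layout says nothing about rounding or exception behaviour.

```c
#include <float.h>

/* Returns 1 if the implementation claims IEC 60559 (IEEE 754) conformance
   through the C99 macro __STDC_IEC_559__, and 0 otherwise. */
int claims_ieee754(void)
{
#ifdef __STDC_IEC_559__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if double has the radix, precision and exponent range of the
   IEEE 754 64-bit binary format, and 0 otherwise. This checks the format
   only, not the behaviour of the arithmetic. */
int double_looks_like_binary64(void)
{
    return FLT_RADIX == 2 && DBL_MANT_DIG == 53 && DBL_MAX_EXP == 1024;
}
```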

    Look at the initial question posted to this list by "Deepak Chand Rathore"
    yesterday: it is wide open, and it asks how a C compiler could affect the
    supposedly complete support of ASCII on various platforms in their default
    working locale. (Not all environments support multiple locales; only a
    default locale is supposed to be present, and this default locale does not
    necessarily mean "ASCII supported" and "US English", as it does on POSIX
    systems, which define the "C" locale.)
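
    Whether the execution character set is ASCII-based can be probed directly.
    This is only a rough sketch: it tests a handful of characters, not the
    whole repertoire, and the function name is hypothetical.

```c
/* Returns 1 if a few characters of the execution character set have their
   ASCII code values, and 0 otherwise (for example on an EBCDIC system). */
int charset_looks_ascii(void)
{
    return 'A' == 65 && 'Z' == 90
        && 'a' == 97 && 'z' == 122
        && '0' == 48 && '9' == 57
        && ' ' == 32;
}
```

    A program that depends on ASCII values can call this once at startup and
    refuse to run, or switch to a translation table, when it returns 0.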

    So the question is related to portability. Portability is possible only on
    platforms supporting at least the same minimum standard. This affects the
    way software is written to handle characters and strings. Adapting the
    software to earlier versions of the standard, or even to the widely
    deployed K&R 1978 de facto standard, is not a stupid question. The question
    is then to know, before writing the software, what kinds of problems can be
    expected in the representation of datatypes across the various systems one
    wishes to support with software written in C.
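
    Some of these representation questions can be answered from <limits.h>,
    provided the platform has an ANSI-style <limits.h> at all (a pure K&R
    system may not). A minimal sketch, with hypothetical function names:

```c
#include <limits.h>

/* Number of bits in a char; ISO C requires at least 8. */
int bits_per_char(void)
{
    return CHAR_BIT;
}

/* Returns 1 if plain char is a signed type on this implementation,
   and 0 if it is unsigned. Both are allowed by the standard. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Width of int in bits, counting any padding bits. */
int bits_per_int(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}
```

    Code that stores character codes above 127 in a plain char behaves
    differently depending on what plain_char_is_signed() reports, which is one
    reason to use unsigned char for such data.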

    It would be nice if every system really had a compiler available that
    supports the minimum standard needed to compile the C source. But too many
    C programs simply do not specify which C standard (de jure like ANSI C or
    ISO C, or de facto like K&R 1978) the code was written for. Many
    programmers just say "it's C language", but in fact there are several C
    languages, one per specification (I count the 1978 K&R C version as a
    language in its own right, with an effective specification).
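
    A source file can at least report which dialect the compiler says it is
    implementing. The sketch below relies on the standard macros __STDC__ and
    __STDC_VERSION__; the function name and the returned strings are
    hypothetical. The first branch is kept for illustration only, since a real
    K&R compiler would already reject "const" and "(void)".

```c
/* Returns a short description of the C dialect announced by the compiler. */
const char *c_dialect(void)
{
#if !defined(__STDC__)
    return "pre-ANSI (K&R) or non-conforming";
#elif !defined(__STDC_VERSION__)
    return "ANSI C89 / ISO C90";
#elif __STDC_VERSION__ >= 199901L
    return "ISO C99 or later";
#else
    return "ISO C90 with Amendment 1";
#endif
}
```

    Stating the expected dialect in the build instructions, and checking it
    with a test like this one, removes much of the guesswork for whoever
    ports the code later.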


