Re: Subfield mark in MARC records -- take 2

From: John Cowan (cowan@locke.ccil.org)
Date: Tue Mar 31 1998 - 09:58:52 EST


John Clews wrote:

> All C0 control characters (previously in ISO/IEC 646, and now from
> ISO/IEC 6429) have glyphs in UCS (ISO/IEC 10646 and Unicode) as this
> excerpt from the draft sorting standard ISO/IEC FCD 14651 shows
> below. Those involved in these discussions may find it useful to note
> these codes, and check the glyph in UCS (ISO/IEC 10646 and Unicode).

To be absolutely pernickety about it (a useful phrase I acquired from
Michael Everson), the Unicode characters U+2400-U+241F (C0 set),
U+2421 (DEL), and U+2424 (NL) don't actually have glyphs associated with
them: which glyph to use is purely implementation-dependent.
See TUS 2.0 (The Unicode Standard, Version 2.0), p. 6-84:

        [O]nly the semantic is encoded in the Unicode
        Standard. This allows a particular application
        to use the graphic representation it prefers. [...]

        The [...] code points in this block are not associated
        with specific glyphs, but rather are available to
        encode *any* desired pictorial representation of the
        given control code. The assumption is that the
        particular pictures used to represent control codes
        are often specific to different systems, and are not
        often the subject of text interchange between systems.

> In my view, the question of which glyph is actually used to represent
> U+001F is up to the implementor - much as it has been in most
> bibliographic systems that use only 8-bit character sets.

This is consistent with always transforming U+001F to U+241F on
output, and expecting the local font system to provide a suitable
glyph.
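
As a rough illustration of that output transformation, here is a
minimal Python sketch. The function name show_controls and the choice
to map every C0 control (not just U+001F) plus DEL are my own
assumptions for the example, not anything prescribed by MARC or by
the Unicode Standard. It relies only on the fact that the control
pictures for the C0 set sit at a fixed offset of 0x2400 from the
controls themselves.

```python
# Hypothetical helper: swap control codes for their "control picture"
# counterparts just before text is displayed or printed.

CONTROL_PICTURES_BASE = 0x2400  # U+2400..U+241F mirror U+0000..U+001F
SYMBOL_FOR_DELETE = "\u2421"    # picture for DEL (U+007F)


def show_controls(text: str) -> str:
    """Return text with C0 controls and DEL replaced by control pictures.

    All other characters pass through unchanged. Which glyph actually
    appears for each picture is left to the local font.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp <= 0x1F:
            # C0 control: shift into the Control Pictures block.
            out.append(chr(CONTROL_PICTURES_BASE + cp))
        elif cp == 0x7F:
            out.append(SYMBOL_FOR_DELETE)
        else:
            out.append(ch)
    return "".join(out)


# Example: a subfield delimiter (U+001F) followed by a subfield code.
field = "\x1faTitle"
visible = show_controls(field)  # begins with U+241F instead of U+001F
```

The stored record is left untouched; only the copy sent to the display
is rewritten. An implementation that wants tabs or line breaks to keep
their usual effect would need to exempt those code points.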

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (FW 16.5)


