John Cowan <cowan@locke.ccil.org> wrote that he will be posting statistics
on the use of accented letters in Library of Congress cataloging data in
the USMARC format, including
> character counts, original EBCDIC forms, and Unicode translations.
Unicode mappings for the USMARC character sets are posted at the Library of
Congress' Web site as
> USMARC to UNIVERSAL CHARACTER SET MAPPINGS
http://lcweb.loc.gov/marc/marc2ucs.html
Note that mappings for some character sets (e.g., Arabic) are not yet
finalized, as indicated by the note "See Appendix 2 for Options".
(For the location of Appendix 2, see below.)
The coding in the USMARC character sets is based on ASCII. EBCDIC values
for the upper range of Latin script characters (ANSEL) vary between library
systems using EBCDIC. The information that John Cowan will be posting
describes the Library of Congress practice.
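
As a rough illustration of the point about ASCII-based versus EBCDIC
coding, here is a minimal Python sketch. It is not the Library of Congress
mapping: ANSEL_SAMPLE is a hypothetical two-entry excerpt, and decode_sample
is a made-up helper name. The two byte values are my understanding of the
ANSEL extended Latin set and should be checked against the posted mapping
tables. Code page 037 is used only as one example of an EBCDIC code page.

```python
# Hypothetical two-entry excerpt of an ANSEL-to-Unicode table, for
# illustration only. Verify every value against the published mapping.
ANSEL_SAMPLE = {
    0xA1: "\u0141",  # LATIN CAPITAL LETTER L WITH STROKE
    0xA2: "\u00D8",  # LATIN CAPITAL LETTER O WITH STROKE
}


def decode_sample(data: bytes) -> str:
    """Decode bytes using ASCII for the lower range and the sample table
    for the upper range. Unmapped upper-range bytes become U+FFFD."""
    out = []
    for byte in data:
        if byte < 0x80:
            out.append(chr(byte))  # lower range: plain ASCII
        else:
            out.append(ANSEL_SAMPLE.get(byte, "\uFFFD"))
    return "".join(out)


# The same basic letter has different byte values in ASCII and in EBCDIC
# (code page 037 is the example EBCDIC code page here).
ascii_a = "A".encode("ascii")
ebcdic_a = "A".encode("cp037")
```

This sketch ignores combining diacritics, which need separate handling,
and it says nothing about how any particular library system assigns EBCDIC
values to the upper range.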
-- Joan Aliprand
Research Libraries Group
Member, Subcommittee on Character Sets, MARBI Committee, American
Library Association
For Appendix 2, see these documents:
97-10: Use of the Universal Code Character Set in USMARC Records (Document)
97-10: Use of the Universal Code Character Set in USMARC Records (Status)
available from:
gopher://marvel.loc.gov/11/services/usmarc/marbipro/marbipro_1997