From: Keutgen, Walter (walter.keutgen@be.unisys.com)
Date: Fri Mar 03 2006 - 13:04:06 CST
Elaine,
in the character set standards one tends to say that a 'character set' is just the list of characters, without numbering or encoding them. Associating numbers or computer bit patterns with them makes an 'encoded character set' of it. This lets one say that the Unicode and GB18030 character sets are the same, or that ISO-8859-x is a subset of Unicode. With the 'codes' attached to the 'characters' this would be impossible. An old example of a coded character set is the 'Morse alphabet', which associates a combination of two durations with each Latin letter (SOS = ... --- ...).
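To make the distinction concrete, here is a small Python sketch. It is only an illustration under my own assumptions: the four sample characters, the helper names and the two-letter Morse table are made up for the example, and Python's "iso8859-1" codec stands in for ISO-8859-x. It checks that the same characters survive a round trip through both UTF-8 and GB18030 while getting different codes, that every ISO-8859-1 byte maps to a distinct Unicode character, and that Morse is a coded character set too.

```python
# A repertoire is just a set of characters, with no numbers attached.
# (Hypothetical sample; any characters would do.)
sample_repertoire = {"A", "é", "א", "中"}


def same_repertoire(chars, codec_a, codec_b):
    """Return True if every character round-trips through both codecs."""
    for ch in chars:
        for codec in (codec_a, codec_b):
            try:
                if ch.encode(codec).decode(codec) != ch:
                    return False
            except UnicodeError:
                # The codec has no code for this character.
                return False
    return True


def latin1_is_subset_of_unicode():
    """Check that all 256 ISO-8859-1 byte values decode to distinct characters."""
    decoded = bytes(range(256)).decode("iso8859-1")
    return len(decoded) == 256 and len(set(decoded)) == 256


# Same characters, but each encoding attaches its own codes to them.
utf8_codes = {ch: ch.encode("utf-8").hex() for ch in sample_repertoire}
gb_codes = {ch: ch.encode("gb18030").hex() for ch in sample_repertoire}

# Morse as a coded character set: only the letters needed for SOS.
MORSE = {"S": "...", "O": "---"}


def to_morse(text):
    """Encode text letter by letter, separating letters with a space."""
    return " ".join(MORSE[letter] for letter in text)


if __name__ == "__main__":
    print(same_repertoire(sample_repertoire, "utf-8", "gb18030"))
    print(latin1_is_subset_of_unicode())
    print(utf8_codes["中"], gb_codes["中"])
    print(to_morse("SOS"))
```

The point of the sketch is that the comparison is made on the characters, not on the bytes: once you compare the codes themselves, the two encodings no longer look like "the same set".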
'Letters' are only a minority of 'characters'. The world existed long before computers. Before computers invaded printing, 'printers' were people who made matrices full of characters and printed books, newspapers and so on by that method. In an argument about Unicode, I read that fine printing in ENGLISH needs about 400 characters. Maybe the English term used by printers for characters is 'types', cf. 'typewriter'. Maybe the 'printers' were 'typesetters'? Were there perhaps 'type sets' in the past?
In everyday IT life one does not make that subtle distinction: characters must be encoded anyway, so one feels no need to add the adjective 'encoded'. IT folks also added 'control characters', e.g. #7 means 'ring the bell'.
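For the curious, a minimal Python sketch of the bell example; the names BEL and is_control are my own, chosen for the illustration. Code 7 has no glyph at all, and Unicode classifies it in the general category "Cc" (control).

```python
import unicodedata

# Code 7 is the "bell" control character; Python also writes it as "\a".
BEL = chr(7)


def is_control(ch):
    """Return True if Unicode classifies ch as a control character (Cc)."""
    return unicodedata.category(ch) == "Cc"


if __name__ == "__main__":
    print(is_control(BEL))   # the bell is a control character
    print(is_control("A"))   # an ordinary letter is not
```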
I fear that the 'Fachwörterliste', a glossary, is rather a common thread that the specialists themselves need in order to be sure that they understand each other, hence the reason to have such a list per project.
A Google search restricted to French-language pages showed that there are more than 6 times as many documents with the plural 'jeu de caractères' (including Wikipedia) as with the singular 'jeu de caractère'.
Walter
-----Original Message-----
From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of E. Keown
Sent: Friday, 3 March 2006 18:11
To: unicode@unicode.org
Subject: (no subject)
March 2006
Hi,
Below, my first definition of a term one MUST know to
understand character set work. Feel very free to
critique this. This definition is for non-geeks or,
at best, semi-geeks.
Character set, a definition :
A character set is a computerized version
of any alphabet (or other writing system).
Each letter, number, symbol, etc. of the
computerized alphabet is assigned a unique
number for the computer to use in software.
There must be 15 core terms needed for a
mini-dictionary for character set work. But which 15?
Marc Kuester of DIN told me that German-language
proposals include what he calls a "Fachwörterliste," a
list of terminology to harmonize usage in all German
technical documents. Great idea!
Translations of the word character set:
le jeu de caractère (is the plural preferred?)
מערכת תווים
Zeichensatz
Codifica dei caratteri
PLEASE send me more translations if you have them!
As you know, the Hebrew language has been written for
3,150 years, at least. There are four living languages
which have been written for over 2,900 years:
Aramaic
Chinese
Greek
Hebrew
Part of what happened with computerizing Hebrew is
that no academic Semitist knew the phrase 'character
set' until maybe 1999.
There are at least a dozen scholarly societies which
concern themselves with Hebrew. But only 1-2 of these
societies have any computational work.
Elaine Keown
in white bread America