Re: Character found in national standard not defined in Unicode?

From: George W Gerrity (g.gerrity@gwg-associates.com.au)
Date: Fri Apr 25 2008 - 10:30:29 CDT

    On 2008-04-25, at 19:47, JFC Morfin wrote:

    > At 02:41 25/04/2008, Asmus Freytag wrote:
    >> If the character doesn't violate a principle in the standard,
    >> there's no reason why it couldn't be encoded; however, if its
    >> presence in the standard is not correlated with it showing up in
    >> actual documents (for example, because of the way systems and fonts
    >> have implemented the standard) then there's perhaps no need to
    >> encode the character based on its presence in a code chart.
    >>
    >> On the other hand, perhaps the standard did base the design on a
    >> real character. If sufficient information can be assembled to
    >> define that character, it would open up an avenue to encode it,
    >> which would be independent of the standard.
    >
    > This is the problem I have already reported: the difference we
    > encounter between the concepts of norm and standard. In French we
    > initially emphasised the norm describing the world, while in
    > English the emphasis was on the standard ruling the world. The
    > problem of globalization means that norms and standards are no
    > longer locally interoperable, with the standard influencing both
    > the way the world is and its normative description; rather, the
    > norm is global and the standard is local.

    To people writing specifications for Programming Languages, the
    difficulty of specifying meaning (or correct behaviour) in a Natural
    Language is well known. That is why specifications for newer
    Programming Languages are written in a meta-language, whose syntax
    and semantics are defined abstractly and Mathematically. SGML is
    powerful enough to act as a meta-language, and that is one reason
    why it has been used as a specification language for HTML, XHTML,
    and XML: there are others.
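
    To make this concrete: an SGML DTD states, in machine-checkable
    form, facts that would take paragraphs of ambiguous prose. The
    fragment below is a simplified illustration in the spirit of the
    HTML DTD, not the actual W3C text:

        <!-- A list contains one or more items. Both tags of UL are
             required (- -), while LI's end tag may be omitted (- O). -->
        <!ELEMENT UL - - (LI)+>
        <!ELEMENT LI - O (#PCDATA)>
        <!ATTLIST UL compact (compact) #IMPLIED>

    A single declaration such as <!ELEMENT LI - O (#PCDATA)> settles,
    once and for all, a question (may the end tag be omitted?) that a
    prose description would leave open to interpretation.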

    While you may have a valid argument that French is a more precise
    language to use for a Standard Specification, so that it may more
    precisely represent the Norm (i.e., the understanding or semantics
    of a specification), the idea of multi-linguistics (or is it multi-
    lateralisation?) is doomed to failure. The Semantics of different
    natural languages simply do not overlap completely: every natural
    language is capable of expressing ideas not expressible in some
    other language without reference to the culture and environment upon
    which it is grounded. The subtleties in one language that are not
    mappable into another simply do not exist for the target language:
    that is always the dilemma of translators, especially when the work
    being translated is a cultural object such as a novel or a religious
    text.

    For instance, colour names in most languages simply do not map well,
    even between Indo-European Languages. Try to map Slavic or Greek
    colour words into English or French. We get around this in Scientific
    circles by using Psycho-Physical Language, in which colour is defined
    by physical measurements of colours whose difference can be perceived
    by normally-sighted humans. We even steal the Greek word for dark
    blue (Green?), κυανος, to name the blue-green colour we
    perceive “cyan”. The linguistic defect is even more obvious when
    trying to agree on colour with a person with Red-Green Confusion
    Colour Blindness. Thus, the best we can hope for in defining a
    standard is to be as precise as possible in conveying the normative
    meaning in whatever language the standard is written. For the cases
    where exact semantics are required, we must provide an algorithmic
    definition, preferably in a well-understood algorithmic language. If
    no universe of discourse is available to specify meaning, then there
    is not even any way to prove that meaning is possible to assign to a
    concept: it is a thought that cannot be uttered.
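
    To see what a psycho-physical definition buys us, consider pinning
    a colour name to measurable coordinates rather than to a word. The
    sketch below is mine and purely illustrative: it maps a linear-light
    sRGB triple to CIE 1931 chromaticity using the standard IEC
    61966-2-1 matrix. The point is that the resulting pair of numbers,
    unlike the word “cyan”, means the same thing in every language:

        def srgb_linear_to_xy(r, g, b):
            """Linear-light sRGB triple -> CIE 1931 (x, y) chromaticity."""
            # Standard sRGB-to-XYZ matrix, D65 white point.
            X = 0.4124 * r + 0.3576 * g + 0.1805 * b
            Y = 0.2126 * r + 0.7152 * g + 0.0722 * b
            Z = 0.0193 * r + 0.1192 * g + 0.9505 * b
            s = X + Y + Z
            return (X / s, Y / s)

        # Full green plus full blue -- what English happens to call "cyan".
        print(srgb_linear_to_xy(0.0, 1.0, 1.0))  # approx. (0.225, 0.329)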

    I repeat: the problem of assigning semantics to a grammatical
    structure is a well-known one, and one that, in the case of semantic
    mapping between natural languages, is known to be insoluble. That is
    why we try to use algorithmic or artificially-defined meta-languages
    in parts of standards where natural language is too imprecise to
    specify semantics. That has always been the approach in the Unicode
    Standard. If the semantics is too subtle to express in an artificial
    language, then it is most certainly impossible to express in every
    natural language.
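
    The Unicode Standard itself shows how far this can be taken: where
    prose would be hopeless, it simply publishes a formula. Hangul
    syllable composition, for example, is defined arithmetically in
    Section 3.12 of the standard. Here is a minimal Python transcription
    of that published formula (the transcription is mine; the constants
    are the standard's):

        SBase, LBase, VBase, TBase = 0xAC00, 0x1100, 0x1161, 0x11A7
        VCount, TCount = 21, 28  # counts of vowel and trailing jamo

        def compose_hangul(L, V, T=TBase):
            """Compose leading jamo L, vowel V, and optional trailing
            jamo T into a single precomposed Hangul syllable."""
            l, v, t = L - LBase, V - VBase, T - TBase
            return chr(SBase + (l * VCount + v) * TCount + t)

        # U+1112, U+1161, U+11AB compose to U+D55C (HAN).
        print(compose_hangul(0x1112, 0x1161, 0x11AB))

    No two readers, whatever their native languages, can disagree about
    what that formula means.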

    As a postscript, I find your arguments and statements confusing,
    especially when you suggest that maybe Chinese is a better language
    for writing standards. My knowledge of Mandarin is pretty shallow,
    but in fact the spoken language is extremely vague in specifying
    classification, using the same sounds and tones for multiple
    meanings, so that modern Mandarin uses compound sounds to clarify
    meaning, although the written characters can be more precise. A
    simple example is the words for “he” and “she”: same sound,
    different characters. However, even the written language has
    perceived weaknesses (to those who speak Indo-European languages)
    because of its complete lack of inflection for both verb and noun
    forms, and indeed, the lack of distinction between verbs and nouns.
    We now perceive similar problems in Modern English because we have
    dropped most of our inflection apparatus. (How many meanings can you
    find in the simple English sentence “Time flies like an arrow”?)
    Note that I am not stating that Mandarin is incapable of great
    subtlety of meaning, but rather that, as we now have to do in Modern
    English, so in Mandarin we have to overload sentence structure,
    particles, and context to provide subtlety in temporal and physical
    descriptions.

    > This has two main consequences:
    > - the alternative between standard internationalisation
    > (interoperability in using the same rules) and normative
    > multilateralisation (interoperability based upon the same
    > understanding).
    > - metalanguage development introducing analysis and often (as you
    > mention) leading to constraints, i.e. complication, to address
    > the resulting complexity.
    >
    > If adding a code is subject to a metalanguage's limitations (for
    > example, because of the way systems and fonts have implemented the
    > standard), this means that Unicode is a Standard and not a Norm.
    > The ambiguity is that it is mostly understood as being both.

    See above. There is never any doubt in the minds of most standards
    users that they are Standards, not necessarily Norms. The intention
    is that the standard be readable by those who understand the
    language in which it is written, so as to convey the Norm (the
    Semantics). If the text of the standard fails to do so, then either
    it is faulty and needs to be revised, or the possible differences in
    meaning are not germane to the universe of discourse to which the
    standard is directed. The problem of which language to use for
    specifying a standard is clearly pragmatic. Choose the language that
    the majority of the educated peoples of the world understand either
    as a first or second language. That language today is English: the
    question of whether or not English is the best language for concise
    expression is at best moot, and in any case, is totally beside the
    point.

    If there is then a need to translate the standard into another
    language, then clearly it should be done by someone who speaks both
    languages fluently and who is also fully conversant with the
    universe of discourse to which the standard is directed. I still
    remember with disdain dropping a class in Scientific Russian taught
    by a Ukrainian with no Scientific background whatsoever, after he
    translated “Parsec” as the distance of a stellar or planetary
    object from the earth when the angle of parallax “under the feet
    of” the radius of the Earth's orbit is one arc second, or some
    such, rather than using the technical terminology in English: the
    angle of parallax subtended by the radius of the Earth's orbit.
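
    The definition is, incidentally, another case where a formula beats
    prose in any language: a parsec is the distance at which one
    astronomical unit (the radius of the Earth's orbit) subtends an
    angle of one arc second. A quick check of the arithmetic in Python:

        import math

        AU_M = 1.495978707e11            # one astronomical unit, in metres
        ARCSEC = math.radians(1 / 3600)  # one arc second, in radians

        parsec_m = AU_M / math.tan(ARCSEC)
        print(parsec_m)  # ~3.086e16 m, about 3.26 light-years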

    Finally, there is the question of whether or not a translation is in
    some sense equal or equivalent to the standard as written in the
    language of its conception. The short answer is obviously no, since
    it is usually the case that the persons proposing the original
    standard, who know all the subtleties, do not all speak the second
    language. That seems to be the position of ISO, and it is a
    reasonable one. Your idea of writing a standard simultaneously in
    more than one language simply isn't practical: you won't be able to
    find enough multilingual people qualified and interested in
    preparing such a standard.

    George
    ------
    Dr George W Gerrity     Ph:   +61 2 156 0286
    GWG Associates          Fax:  +61 2 156 0286
    4 Coral Place           Time: +10 hours (ref GMT)
    Campbell, ACT 2612      PGP RSA Public Key Fingerprint:
    AUSTRALIA               73EF 318A DFF5 EB8A 6810 49AC 0763 AF07



    This archive was generated by hypermail 2.1.5 : Fri Apr 25 2008 - 10:35:10 CDT