From: Hans Aberg (haberg@math.su.se)
Date: Sat Feb 19 2005 - 12:37:22 CST
At 12:39 -0800 2005/02/18, D. Starner wrote:
>> > That's not invariable; there are rare cases where capitalization alter
>> > semantics. For example, Poles and poles are two different things. More
>> > importantly, capitalization alters the meaning of the sentence and
>> > paragraph.
>>
>> It depends: If people cannot parse the sentence "he was a pole", or the
>> sentence in all-caps "HE WAS A POLE".
>
>Try "the pole fell down." or "THE POLE FELL DOWN." versus "The Pole fell down."
>In any case, it doesn't matter; people can frequently parse sentences without
>vowels, ro wth the wordsmisspelled. That doesn't mean that those elements don't
>matter.
The distinction is probably whether capitalization is a convention or a
requirement. If it is the latter, one must have it as a character, but if it
is the former, the need for having the character becomes fuzzier.
But one can add characters so that one can express the differences:
<english><begin><sentence>the pole fell down<end><sentence>
<english><begin><sentence>the <proper noun>pole fell down<end><sentence>
<english><begin><sentence><caps>the pole fell down<end><sentence>
<english><begin><sentence><caps>the <proper noun>pole fell down<end><sentence>
Then a renderer can compute the correct output, because the correct semantic
information has been supplied. In standard rendering, there will be no
difference between the output of the third and fourth sentences above. But a
renderer could still produce a difference, without having to resort to
complicated language analysis.
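A minimal sketch of such a renderer, in Python, might look as follows. The
marker names are the ones used in the four sentences above; the function name
`render`, the tokenizing pattern, and the casing rules (initial capital after
<begin> or <proper noun>, full uppercase after <caps>) are assumptions made
only for illustration, not part of any standard.

```python
import re

# Matches either a marker such as <caps> or <proper noun>, or a plain word.
TOKEN = re.compile(r"<[^<>]+>|[^<>\s]+")

def render(marked_up: str) -> str:
    """Compute display text from semantically marked-up input (assumed rules)."""
    words = []
    all_caps = False         # set by <caps>, cleared at sentence boundaries
    capitalize_next = False  # set by <begin> and by <proper noun>
    for tok in TOKEN.findall(marked_up):
        if tok == "<begin>":
            capitalize_next = True
            all_caps = False
        elif tok == "<end>":
            all_caps = False
        elif tok == "<caps>":
            all_caps = True
        elif tok == "<proper noun>":
            capitalize_next = True
        elif tok.startswith("<"):
            continue  # <english>, <sentence>: no effect on casing in this sketch
        else:
            if all_caps:
                words.append(tok.upper())
            elif capitalize_next:
                words.append(tok[:1].upper() + tok[1:])
            else:
                words.append(tok)
            capitalize_next = False
    return " ".join(words)

print(render("<english><begin><sentence>the <proper noun>pole fell down<end><sentence>"))
```

With these rules the third and fourth sentences both come out in full
uppercase, but the <proper noun> marker is still present in the input, so a
different renderer could choose to show the distinction.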
>> There seems to be at two principles involved: The semantic and the graphic.
>> If drawn to its logical end, one should perhaps have at least two character
>> sets, one for the correct semantic representation, and another, for enabling
>> a correct graphic representation.
>
>I think it more accurately reveals that your analysis isn't useful. When
>you start coming up with a lot of extra complexity, you're doing it wrong.
As for a character set such as Unicode, which seems to be mixing several
principles, the approach would be useful by first decomposing (i.e.,
analyzing) it into its different component principles, and then weighing
them against each other (i.e., synthesizing).
For the original question, a capital German ß (double s), one question is
whether it is used in order to indicate semantics. Probably not. The second
question is whether there is a glyph difference, should the character become
available. Perhaps there might be. Then it might be added to Unicode based on
the second principle. Alternatively, people may feel that a fully capitalized
word is different from one which is not. Then the first principle calls for it
to be added. There might be a third principle involved, namely that one wants
to preserve the writer's intention that it is an all-caps word. Then that
might be used as a principle for adding the character. I am here not
expressing any opinion on whether it should be added, only indicating a way
to analyze it.
>Heading down your direction would involve replacing all the words with
>logograms;
>after all, honor versus honour versus honur really doesn't matter.
And then, in addition, getting people used to writing semantically correct
texts.
>Why identify
>pH as mathematical and chemical letters, missing the real semantic meaning, in
>exchange for etymological analysis of questionable real world accuracy?
One reason is that the rendering might be different. Math letters are
rendered according to principles quite different from natural language
words. Another, more modern, reason might be to enable a search engine to
find the correct words, excluding the wrong ones. But in the end, the most
important deciding factor is whether people will get used to typing
semantically correct input, if given a chance. One can also note that
semantically correct input might simplify the typing, if the renderer is
doing work that the typist now must do.
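To illustrate the search-engine point, here is a small Python sketch. It
assumes, purely hypothetically, that the text is stored as (word, is proper
noun) pairs, so the semantic tag survives even in all-caps text; the function
name `find` and this representation are my own inventions for the example.

```python
def find(tokens, word, proper_noun):
    """Return indices of tokens that match `word` case-insensitively
    and whose proper-noun tag equals `proper_noun`."""
    return [
        i
        for i, (w, is_proper) in enumerate(tokens)
        if w.lower() == word.lower() and is_proper == proper_noun
    ]

# "THE POLE FELL DOWN" (a person) followed by "the pole fell down" (an object).
text = [
    ("THE", False), ("POLE", True), ("FELL", False), ("DOWN", False),
    ("the", False), ("pole", False), ("fell", False), ("down", False),
]

print(find(text, "pole", proper_noun=True))   # only the person
print(find(text, "pole", proper_noun=False))  # only the object
```

A search on the rendered text alone could not separate the two in the
all-caps case; the tag is what makes the exclusion possible.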
Hans Aberg