From: Julian Bradfield (jcb+unicode@inf.ed.ac.uk)
Date: Thu Aug 20 2009 - 05:09:42 CDT
David Starner wrote:
>On Wed, Aug 19, 2009 at 5:47 AM, Julian
>Bradfield<jcb+unicode@inf.ed.ac.uk> wrote:
>> The argument is that IPA and Greek letters are *logically
>> separate* letters, and should therefore be encoded separately, for the
>> sake of data processing on them.
>
>I don't buy it. It is trivial for an algorithm to tell that aβs is
>IPA; it's impossible for an algorithm or even a human to tell whether
>abs is IPA or plain old Latin characters. In my particular version of
Cleanicode, I would disunify IPA from Latin, but in the real world,
>if you're doing data processing on IPA, you've tagged it, either
>explicitly or implicitly. What data processing are you doing that aβs
>is a problem but not abs?
I'm not, yet - but I don't think anybody is doing much of the data
processing that has been appealed to by either side in this argument!
But simple editing (e.g. search and replace) is an example.
It's true that I'm unlikely to use Greek beta and IPA beta in the same
paper, because I don't work on Greek. However, I do deal with
formalizations of phonological theories. In describing grammatical
re-writing rules, it is conventional (among mathematicians and
computer scientists) to use (mathematical) Greek letters to stand for
strings of "letters" from the grammatical alphabet. If I'm giving a
phonological re-writing rule, how am I to distinguish the string
variable β from the phoneme symbol IPA beta?
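To make the search-and-replace problem concrete, here is a minimal Python sketch. The rule and its arrow notation are invented for illustration (they are not taken from any particular formalism); the only assumption that matters is that both the string variable and the phoneme are typed as U+03B2.

```python
# Both roles share one code point: GREEK SMALL LETTER BETA.
GREEK_BETA = "\u03b2"
GREEK_GAMMA = "\u03b3"

# Hypothetical rule: "b" followed by the string variable beta rewrites to
# the fricative (also written beta) followed by the same variable.
rule = "b" + GREEK_BETA + " -> " + GREEK_BETA + GREEK_BETA

# Rename the string variable from beta to gamma with plain search and replace.
renamed = rule.replace(GREEK_BETA, GREEK_GAMMA)

# The phoneme symbol on the right-hand side has been renamed as well,
# because nothing in the plain text tells the two betas apart.
print(renamed)
```

The intended result keeps the phoneme beta on the right-hand side and renames only the variable; the plain replace cannot do that without extra markup.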
Of course, it can be done by markup - but Unicode has gone to all the
trouble of encoding several maths alphabets, because font distinctions
are significant, and Unicode (don't ask me why) thinks people should
be able to write maths in plain text.
This is another case where what looks like a font distinction is a
semantically significant distinction, and should be encoded.
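As an illustration of what separate encoding buys in plain text, the sketch below writes the string variable with U+1D6FD (MATHEMATICAL ITALIC SMALL BETA) and keeps U+03B2 for the phoneme. The rule itself is a made-up example, and using the maths alphabet for the variable is an assumption of this sketch, not an established convention.

```python
import unicodedata

GREEK_BETA = "\u03b2"             # stands for the phoneme in this sketch
MATH_ITALIC_BETA = "\U0001d6fd"   # stands for the string variable
MATH_ITALIC_GAMMA = "\U0001d6fe"

print(unicodedata.name(GREEK_BETA))
print(unicodedata.name(MATH_ITALIC_BETA))

# Hypothetical rule: "b" before the variable rewrites to the fricative.
rule = "b" + MATH_ITALIC_BETA + " -> " + GREEK_BETA + MATH_ITALIC_BETA

# Renaming the variable now leaves the phoneme symbol untouched.
renamed = rule.replace(MATH_ITALIC_BETA, MATH_ITALIC_GAMMA)
print(renamed)

# Compatibility normalization folds the maths letter back onto U+03B2,
# so the distinction survives only if the text is not NFKC-normalized.
folded = unicodedata.normalize("NFKC", rule)
print(folded)
```

The last step is worth noting: any pipeline that applies NFKC or NFKD erases exactly the distinction being relied on here.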
You're right that Latin a vs. IPA a is inconsistent - as I also said,
ideally they would be disunified - but the IPA itself, in a concession
to reality, has explicitly stated that the Latin letters of the IPA
are to be treated as the same as the ASCII characters.