From: Michael Everson (everson@evertype.com)
Date: Mon Aug 17 2009 - 17:43:46 CDT
On 17 Aug 2009, at 22:25, Asmus Freytag wrote:
> Well, for 20 years, more or less, Unicode has explicitly claimed
> they are. Millions of documents exist that use Unicode-encoded IPA.
As many or more millions of documents use ASCII font-hacked IPA.
Millions of documents use ʤ and ʧ, which are now deprecated by the IPA
in favour of dʒ and tʃ. The three characters in question are a
persistent problem for users and for the designers who wish to supply
fonts to them.
Don't think that this is the first time in 20 years that these
characters have been discussed as problematic. In fact, I raised the
issue here in September 1997. In August 2005 I discussed the matter
with John Esling (editor of the Journal of the International Phonetic
Association), who said (about IPA beta, chi, delta, and theta) "As far
as I'm concerned, all 4 are different from Greek."
> An untold number of them uses one or more of these three characters.
> Thousands of users have figured out some way to enter these
> characters, presumably, in many cases, by using Greek keyboard
> layers. Then there are the common tools they use to search, sort and
> otherwise process such data.
And thousands more, Asmus, have had to substitute in a glyph from a
different font than the one they are using for their main text because
their main-text font had Greek letterforms that were not appropriate
for their purposes. Believe me, the problems caused by this mistaken
unification are persistent.
> Because of that, the question is no longer as simple as you state
> it. No longer can you simply focus on the ideal situation, but you
> also have to consider what will happen to all of these existing
> documents, as well as to the existing user base, and the tools they
> use.
Yes, and having considered all of that I still come to the same
conclusion: we need to disunify.
> By the seemingly innocuous fact of adding new character codes, you
> are also changing the identity of the existing character codes.
Not at all. In the old days, we used to pull off the IBM Selectric
ball to change from Latin to Greek. Then we used to change 8-bit fonts
mid-word to get the Greek character we needed. Any encoded text that
has a Greek beta in it will simply be a text with a Greek beta in it.
That doesn't mean that the right thing to do (then as now) isn't to
correct the problem-causing false unification, and move on. Old data
might be updated, or it might not. But Julian has aptly pointed out
that people searching IPA need to know the encoding AND the
orthographic practice for any given document and it is *never* the
same kind of straightforward "spelling". A transcription may be broad
or narrow, and that alone causes ambiguity. I might write Julian's
name as [ʤuːljən] or as [dʒuːliən]. There is no real difference
between this orthographic practice and the "cost" of disunification of
Latin beta from Greek beta.
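
To make the search point concrete, here is a minimal Python sketch. It is an illustration only, not something proposed in this thread: the fold table and the names `fold_ipa` and `ipa_contains` are hypothetical. It assumes a tool has to supply its own mapping from the ligatures to the two-letter spellings.

```python
# Hypothetical fold table: digraph ligatures -> two-letter spellings.
DIGRAPH_FOLD = {
    "\u02a4": "d\u0292",  # ʤ -> dʒ
    "\u02a7": "t\u0283",  # ʧ -> tʃ
}


def fold_ipa(text: str) -> str:
    """Replace each ligature with its two-letter equivalent."""
    return text.translate(str.maketrans(DIGRAPH_FOLD))


def ipa_contains(haystack: str, needle: str) -> bool:
    """Substring search that ignores the ligature/sequence difference."""
    return fold_ipa(needle) in fold_ipa(haystack)


old = "[\u02a4u\u02d0lj\u0259n]"    # [ʤuːljən]
new = "[d\u0292u\u02d0li\u0259n]"   # [dʒuːliən]

print("\u02a4" in new)                 # plain search for ʤ misses the new spelling
print(ipa_contains(new, "\u02a4"))     # folded search finds it
print(fold_ipa(old) == fold_ipa(new))  # still differ: j versus i is a transcription choice
```

Folding the ligatures removes only the encoding difference. The j/i difference between the two transcriptions remains, which is the kind of orthographic variation a searcher has to know about in any case.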
> But habits of 20 years die slowly, so you can expect that some
> significant section of the user community will continue to use the old
> characters for new work (while all the old documents will survive
> unchanged).
A huge proportion of the user community still uses their old 8-bit
Macintosh or PC fonts because they work. This is true for IPA, and for
Cuneiform, and for lots of scripts.
> Your simple question needs to be restated:
> What is more appropriate:
> a) continue as before
Surely not.
> b) a disruptive continuity
You mean, rather, a solid solution with a non-trivial but acceptable
cost.
> c) a less disruptive encoding that embodies a clean fallback
>
> My vote is for "c" (realized via VS).
> In my take, the "latinization" of these characters in the IPA
> context is primarily a typographical issue.
It never was intended to be "merely typographical", and as you see
Esling disagreed with such a view. The problem was the mistaken
unification of all but a few Greek characters.
> There's little else that distinguishes them from ordinary
> alphabetical characters that are simply part of a special notation.
It's pseudo-encoding, and whilst it might be a suitable mechanism for
CJK or maths or unusual finals in Manichaean, I do not believe at all
that it is a suitable mechanism to introduce into something as
important and widespread as the IPA.
> They are not used in contrast with other shapes of the same letters.
I found examples once which got us LATIN SMALL LETTER DELTA encoded.
In my previous posting I mentioned text by Abercrombie in which he
contrasted the Greek forms with the Latin IPA forms. Some of those
letters are disunified, some unified. That's illogical; it was a bad
decision; we need to fix this now, as it continues to be a problem.
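
The mixed state of affairs can be inspected with Python's standard `unicodedata` module. This is a small sketch, assuming the three characters in question are beta, theta and chi. The selection of letters and the helper name `script_of` are illustrative choices, not taken from this thread.

```python
import unicodedata

# Greek-derived phonetic letters; the selection is illustrative.
LETTERS = {
    "gamma": "\u0263",  # has its own Latin code point
    "phi": "\u0278",    # has its own Latin code point
    "delta": "\u1e9f",  # has its own Latin code point
    "beta": "\u03b2",   # Greek code point
    "theta": "\u03b8",  # Greek code point
    "chi": "\u03c7",    # Greek code point
}


def script_of(ch: str) -> str:
    """Return the first word of the character name, e.g. 'LATIN' or 'GREEK'."""
    return unicodedata.name(ch).split()[0]


for label, ch in LETTERS.items():
    print(f"{label:6} U+{ord(ch):04X} {unicodedata.name(ch)}")
```

The first three print names beginning with LATIN, the last three names beginning with GREEK, which is the inconsistency described above.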
> Using a VS allows that to be expressed, retains compatible support
> for all tools, and users who continue in the old ways are "punished"
> by poor typography rather than by failed searches.
No different than searches involving [ʤ] or [dʒ].
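
The search behaviour under the two approaches can be sketched in Python. Everything specific here is an assumption for illustration: no variation sequence for an IPA beta is defined, U+FE00 is used only as a stand-in selector, and the Private Use code point U+E000 stands in for a hypothetical separately encoded letter.

```python
# Variation selectors VS1..VS16 (U+FE00..U+FE0F).
VARIATION_SELECTORS = {chr(cp) for cp in range(0xFE00, 0xFE10)}

GREEK_BETA = "\u03b2"
VS_BETA = GREEK_BETA + "\ufe00"  # hypothetical "IPA beta" variation sequence
NEW_BETA = "\ue000"              # placeholder for a hypothetical new character


def strip_vs(text: str) -> str:
    """Drop variation selectors, as a selector-ignoring search tool might."""
    return "".join(ch for ch in text if ch not in VARIATION_SELECTORS)


def fold_new(text: str) -> str:
    """Map the placeholder new character back to Greek beta."""
    return text.replace(NEW_BETA, GREEK_BETA)


legacy_doc = "[a\u03b2a]"  # old data typed with Greek beta

print(VS_BETA in legacy_doc)                      # raw search with the selector misses
print(strip_vs(VS_BETA) in strip_vs(legacy_doc))  # hits once selectors are ignored
print(NEW_BETA in legacy_doc)                     # raw search with a new code point misses
print(fold_new(NEW_BETA) in fold_new(legacy_doc)) # hits once the tool folds the new letter
```

In this sketch both approaches need some cooperation from the search tool before old and new data match: ignoring selectors in one case, a fold table in the other.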
Michael Everson * http://www.evertype.com/