From: Kenneth Whistler (kenw@sybase.com)
Date: Thu May 29 2003 - 16:35:13 EDT
Kent:
> Others gave references where it in most cases did NOT look at all like the
> empty set symbol.
Gustav Leunbach (1973), Morphological Analysis as a Step in
Automated Syntactic Analysis of a Text.
http://acl.ldc.upenn.edu/C/C73/C73-2022.pdf
uses an empty set symbol to denote a morphological zero.
(See p. 272.) [Typographically, this could arguably
have been taken from a type tray for a Norwegian ø
character, rather than from a mathematical symbol font,
but this is *clearly* not a slashed zero.] And this is
a document typeset the old-fashioned way, with actual type,
in 1973. See:
http://acl.ldc.upenn.edu/C/C73/C73-2000.pdf
bearing the publication logo of Firenze.
A. S. Liberman (1973), Towards a Phonological Algorithm.
http://acl.ldc.upenn.edu/C/C73/C73-1015.pdf
uses an empty set symbol to denote a phonological zero.
(See pp. 196-197 for numerous examples.) These are
clear examples, and show that this is used symbolically,
to indicate a "something which is not there". Look at
the type style. These are included in *italic* word
citations, but the null set symbol (used to denote the
phonological zero), is *not* set in italic.
Harri Jäppinen and Matti Ylilammi (1986), Associative Model
of Morphological Analysis: An Empirical Inquiry.
http://acl.ldc.upenn.edu/J/J86/J86-4001.pdf
Displays a distinctive usage, with an italic epsilon to
denote a morphological zero. (Not the same as the set theory
use of epsilon to denote set membership.)
You can dig further in these archives of old editions of
Computational Linguistics and other journals from the 1970's
to find other instances illustrating the use of the empty
set symbol in linguistics to denote a phonological or
morphological zero.
> From what I've heard on this thread, a slashed zero glyph appears common
> in this situation in linguistics.
See examples cited above.
> A slashed zero is completely
> unrelated to the empty set symbol.
This is nonsense. You have found the correct citations
on the web regarding André Weil's claim to have introduced
the empty set symbol, as part of the Bourbaki group. And
for Weil, the source of the symbol may well be Norwegian ø.
(What the Weil citation doesn't specify is why he chose
a symbol vaguely reminiscent of a zero, while not actually
being a zero, to represent the empty set.) And what I pointed
out earlier is that, in *linguistic* usage, the slashed zero
glyph is clearly an acceptable glyphic variant of the
empty set symbol. So to claim it is "completely unrelated"
is to manifestly ignore actual practice.
> The empty set symbol and slashed zero remain unrelated.
Another bald assertion contradicted by Pullum (1996), who
*does* relate them, in linguistic usage. Nobody is claiming
that in *mathematical* usage they are connected, or would
be acceptable alternative glyphs in a treatise on set theory.
> [The EMPTY SET symbol] does not appear to have wandered
> into linguistics in any way (except by occasional typographic mistake,
> and that does not count), even though there is use of a similar-looking
> symbol.
What you are missing here is that the use of the empty set
symbol in linguistics is associated with structuralist
linguistics, which was in intellectual development roughly
contemporaneously with the Bourbaki group. And structuralist
morphology, in particular, was influenced by formal set
theory, and many morphologists borrowed the kind of formalisms
used by logicians and set theoreticians.
A phonological zero or a morphological zero has nothing to
do with numeric values, nor is it conceived of as part of
a word, per se. It is a pattern gap, an absence, a set with
no elements. And while I can't track this back for you from web
citations to some earliest usage and give you a morphologist
explicitly talking about his notational conventions, without
spending more time at it than I can manage today, I can
assure you that it is perfectly reasonable and expected to
find clear examples of use of the empty set symbol in this
linguistic usage.
> there does not appear to be a history of borrowing the empty set sign
> into linguistics.
This is erroneous.
Your mistake is to assume that this derives from some kind
of transcriptional usage. It does not. It comes from
pattern analysis of structural systems, by structuralist
linguists influenced by mathematical formalism and set theory,
among other things.
> Then use a slashed zero (<DIGIT ZERO, COMBINING LONG SOLIDUS OVERLAY>;
> which seems to be the glyph for which is actually used in print,
> and the problem is solved!
This suggestion is completely bogus. The {slashed-zero} glyph
is an acceptable glyph variant (in different contexts) of
either U+0030 or U+2205. It is not some distinct entity which
needs to be represented by a combining sequence.
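To make the encoding difference concrete, here is a minimal sketch, assuming
Python's standard unicodedata module. The helper names (describe,
canonically_equivalent) are made up for illustration. It only shows that the
proposed combining sequence and U+2205 are different character sequences with
different properties and no canonical equivalence between them; it says
nothing about which glyph a given font draws for either.

```python
import unicodedata

# The sequence proposed in the quoted text: <DIGIT ZERO, COMBINING LONG SOLIDUS OVERLAY>
SLASHED_ZERO_SEQUENCE = "\u0030\u0338"
EMPTY_SET = "\u2205"
DIGIT_ZERO = "\u0030"


def describe(text):
    """Return (code point, character name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
        for ch in text
    ]


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


if __name__ == "__main__":
    for label, text in [
        ("digit zero", DIGIT_ZERO),
        ("empty set", EMPTY_SET),
        ("combining sequence", SLASHED_ZERO_SEQUENCE),
    ]:
        print(label, describe(text))
    # Normalization does not relate the combining sequence to U+2205.
    print(canonically_equivalent(SLASHED_ZERO_SEQUENCE, EMPTY_SET))
```

Since normalization leaves the two apart, text using the combining sequence
would not match text using U+2205 in a plain search or comparison, which is
the practical cost of treating a glyph variant as a separate spelling.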
> I'll get over it when you find a reference (published pre-2003)
The Liberman (1973) citation above beats that by 30 years.
> that explicitly
> (in words!) say that they use the empty set sign for this, and
The typography in that document *clearly* indicates that this
is an empty set symbol, and not a digit or just another letter
of a word's transcription.
The paper doesn't *have* to explain that in words. It isn't about
nomenclature or symbology, but about phonological analysis. The
usage was a *self-evident* convention to Liberman's linguist
readers in 1973.
> preferably also
> show that this is the history of that use.
As you know, it is a lot harder to track down written documents
about the history of symbolic usage in particular instances,
than it is to track down actual usage by people who just use
the symbols.
> Then I promise to be very quiet (and nod ok)! ;-)
Please read Liberman, and then be very quiet and nod ok.
> (I would still quietly wonder why a click letter,
> looking like !,
> and the integral sign letter (small esh), got their own letter (Ll)
> codes...)
Because those *are* transcriptional characters. Each represents
a specific sound, and they are used as letters in transcription.
The click (!), encoded as U+01C3 LATIN LETTER RETROFLEX CLICK, comes
from preexisting Africanist practice. It is, indeed, an intentional
disunification of the character from U+0021, because of character
properties. Since the click *is* part of
words in orthographies that use it, having a separate letter for
it makes implementation sense.
U+0283 LATIN SMALL LETTER ESH is likewise a *letter*
in transcription (and some orthographies). It is derived
historically from U+017F LATIN SMALL LETTER LONG S. I don't
know what the origin of the mathematical integral sign
(U+222B) is, so cannot vouch for whether it is graphologically
connected to the long s or not, but it is clearly distinct
in usage and properties from the IPA esh.
But the phonological/morphological zero is *NOT* a letter
of transcription. It is a symbol which appears in phonological
and morphological analysis. Morphologists also embed other
symbols in such analyses, including juncture symbols such
as "-", "+", "#", "=", and so on. But such practice does
not make those symbols letters, either.
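The letter/symbol split described above is visible in the character
properties themselves. A minimal sketch, again assuming Python's standard
unicodedata module (the table and the is_letter helper are illustrative
names, and the values reflect the Unicode data shipped with the Python
version in use):

```python
import unicodedata

# Characters discussed in this message, keyed by a short description.
CHARACTERS = {
    "click letter": "\u01C3",
    "esh": "\u0283",
    "long s": "\u017F",
    "exclamation mark": "\u0021",
    "integral sign": "\u222B",
    "empty set": "\u2205",
}

# Juncture symbols that morphologists embed in analyses.
JUNCTURE_SYMBOLS = "-+#="


def is_letter(ch):
    """True if the general category is one of the letter categories (L*)."""
    return unicodedata.category(ch).startswith("L")


if __name__ == "__main__":
    for label, ch in CHARACTERS.items():
        print(f"U+{ord(ch):04X} {label}: {unicodedata.category(ch)} letter={is_letter(ch)}")
    for ch in JUNCTURE_SYMBOLS:
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {unicodedata.category(ch)}")
```

The transcriptional characters report letter categories, while U+0021,
U+222B, U+2205, and the juncture symbols report punctuation or symbol
categories.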
--Ken