From: Kenneth Whistler (kenw@sybase.com)
Date: Wed Apr 05 2006 - 16:23:23 CST
Michael,
> If I am interested in searching the internet for all the examples of
> people who are discussing the use of the PLUMED HEAD, I cannot do so
> because there is nothing I can search for. Graphics are not
> searchable, and neither are PUA code points.
Don't head off into speciousness, however. Encoding a set of
Phaistos Disc signs in Unicode isn't going to change any one
of the millions of web citations of "Phaistos" (or however many
tens of thousands of them might actually contain one or more
actual graphic citations of one or more "characters" from the
disc) to make it Unicode-searchable text. And most of whatever
currently exists will *never* change.
All that would happen after a successful encoding of the
symbol set in Unicode would be that your CSUR page and some
other pioneering web pages would convert text over in a few
years, and you would end up with some hundreds (perhaps
eventually some thousands) of web pages which would clone
around from page to page the same single "text" and include some
discussion of it.
It is doubtful, in the long view, whether this represents
any true net gain for scholarship in this particular case,
over the simple ability to search for "Phaistos" and then
create summary sites that bring together links for the
more serious work (while ignoring the multitudinous crackpottery).
> >The "convenience" point is debatable, as I don't see the type of
> >keyboard gymnastics necessary for typing obscure codepoints to be
> >much more convenient than inserting pictures, in most cases.
>
> Encoded characters can be searched for. Pictures (in this sense) cannot.
Your argument would make more sense in the case of large
corpuses. But in the case of the Phaistos Disc, there is one
single document, of just 241 total glyph instances --
and every significant search turns up *exactly* the same text.
> >That's three mentions of web searches. I suspect that this is a
> >specious argument. What would be the point of searching on Phastian
> >(?!) "text", given that it has no agreed meaning? Even when
> >referring to existing academic references, would there be any point
> >to searching for such text?
>
> Not at all. No character or phrase in my own CSUR web page which
> contains the entire Phaistos Text can be searched for. Google ignores
> all of the characters.
This is, of course, erroneous. A search of "Phaistos Everson" not
only turns up the page, it ranks it #1 for that search in Google.
You are right that you can't do an internet search *into* the
Phaistos Text, but like Mike, I maintain that *that* is irrelevant
and pointless in this case. It is a *less* interesting search than
searching on combinations of "Phaistos" plus author/researcher,
for example.
> >>* the Phaistos Disc characters are used at least as much as most of
> >>the 40,000+ CJK-B characters
> >
> >I see no merit in this argument.
>
> I do.
Mike is right about this one, too.
>
> >Chinese is a productive writing system for which Unicode has no
> >productive model, therefore a large number of rarely used characters
> >will be encoded for completeness.
>
> I'm talking about thousands of characters which no one knows, which
> no one uses, and some of them were never real characters used outside
> of dictionaries.
The fact remains that a substantial majority of CJK Ext-B comes
*from* its usage in Chinese dictionaries, as demonstrated by the
55,812 kIRGHanyuDaZidian records in Unihan.txt. Shrugging that off
would be tantamount to shrugging off lexical evidence of English
words cited in the OED.
> >The value is measured not by the usage of a subset of characters,
> >but by the usage of the writing system. Also, since the Extension B
> >characters are genuine new adds, not compatibility characters, slow
> >acceptance is to be expected as fonts, typing methods, and SMP
> >support in general roll out slowly.
>
> No one uses them.
This is just false. They are used in digital editions of Chinese
classics, including, of course, the classic dictionaries.
>
> >Also, I would like to know what you mean by "used" here.
> >Specifically, how much of this is inline? Very little, I suspect.
>
> Our proposal gives some examples. It doesn't take a whole lot of
> imagination to see the utility of encoding these characters. And
> Mike, if you don't want to use Phaistos Disc characters, that's
> really fine with me. It's not an argument against their encoding,
> however.
Nor is imagining the utility of encoding them much of an argument
for their encoding.
Karl Pentzlin got to the heart of the issue, in my opinion:
> btw, I prefer the term "symbols" resp. "symbol set" to "characters"
> resp. "script", as the Phaistos Disc is not yet proven to show
> "characters" of a "script".
> (This again is no counterargument for encoding, as Unicode already
> contains a lot of symbols and symbol sets).
> Speaking of "symbols" makes the "undeciphered" argument irrelevant,
> as symbols stand for themselves.
> Personally, I see the reason for encoding the Phaistos Disc symbol
> set like the reason for encoding Chess symbols (U+2654...U+265F)
> or Tai Xuan Jing symbols (U+1D300...U+1D35F): There is a well-defined
> symbol set which is needed by a wide user community in plain text.
If you and John Jenkins would stop talking about the Phaistos Disc
"script" and the "users of the script", there would be much less
controversy about encoding.
Furthermore, your suggested properties aggravate the problem,
by claiming that these are all "letters" (gc=Lo). That claim
has undesirable implications -- for example, they would automatically
be included in the definition of identifiers.
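To make the identifier point concrete, here is a minimal Python sketch. It assumes Python's own identifier rules (str.isidentifier) as a stand-in for Unicode-based identifier definitions in general, and it uses already-encoded characters only: one chess symbol and one Tai Xuan Jing symbol (both gc=So), plus U+20000 from CJK Ext-B, picked here purely as a gc=Lo comparison. The helper name describe is hypothetical.

```python
import unicodedata

# Sample code points: two symbol sets named in this thread, plus one
# CJK Extension B ideograph chosen here only as a gc=Lo comparison.
SAMPLES = {
    "Chess symbol": "\u2654",
    "Tai Xuan Jing symbol": "\U0001D300",
    "CJK Ext-B ideograph": "\U00020000",
}


def describe(ch):
    """Return (General_Category, usable as a Python identifier?)."""
    return unicodedata.category(ch), ch.isidentifier()


for label, ch in SAMPLES.items():
    gc, is_ident = describe(ch)
    print(f"U+{ord(ch):04X} {label}: gc={gc}, identifier={is_ident}")
```

The two symbols report gc=So and are rejected as identifiers, while the gc=Lo ideograph is accepted. That is the kind of automatic interaction a gc=Lo assignment would pull in.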
The straightforward way to approach a Unicode encoding is to
encode the Phaistos 45 sign list *as* a sign list and be done
with it. 45 symbols, used in discourse about the decipherment
of this thingum from Crete. Add one more oblique stroke symbol
for the people that want to talk about that mark on the basic
45 signs. The encodings of "punctuation" for a sign separator and
start of text are just bogus. Those are artifacts of unrolling
the delineation of text elements from the disc, and are no more
needed encoded as "Phaistos script" punctuation characters than we
needed characters to represent the lines that box out text on
cuneiform tablets.
So no script assignment.
No letters. No punctuation.
45 + 1 signs. gc=So. PHAISTOS DISC SIGN YADDA YADDA
And you'd have an encoding that would sail through.
But what you and John are attempting here, turning an undeciphered
artifact into a Unicode script with letters and punctuation that
text processes implemented in Unicode systems would then
start interacting with, is -- and I say it again -- just bogus.
--Ken