From: Erkki Kolehmainen (erkki.kolehmainen@kotus.fi)
Date: Tue Apr 11 2006 - 02:12:54 CST
I tend to agree with Ken on this one, too.
Regards, Erkki I. Kolehmainen
Kenneth Whistler wrote:
> Michael,
>
>
>>If I am interested in searching the internet for all the examples of
>>people who are discussing the use of the PLUMED HEAD, I cannot do so
>>because there is nothing I can search for. Graphics are not
>>searchable, and neither are PUA code points.
>>
>
> Don't head off into speciousness, however. Encoding a set of
> Phaistos Disc signs in Unicode isn't going to change any one
> of the millions of web citations of "Phaistos" (or however many
> 10's of thousands of them might actually contain one or more
> actual graphic citations of one or more "characters" from the
> disc) to make it Unicode-searchable text. And most of whatever
> currently exists will *never* change.
>
> All that would happen after a successful encoding of the
> symbol set in Unicode would be that your CSUR page and some
> other pioneering web pages would convert text over in a few
> years, and you would end up with some hundreds (perhaps
> eventually some thousands) of web pages which would clone
> around from page to page the same single "text" and include some
> discussion of it.
>
> It is doubtful, in the long view, whether this represents
> any true net gain for scholarship in this particular case,
> over the simple ability to search for "Phaistos" and then
> create summary sites that bring together links for the
> more serious work (and ignoring the multitudinous crackpottery).
>
>
>>>The "convenience" point is debatable, as I don't see the type of
>>>keyboard gymnastics necessary for typing obscure codepoints to be
>>>much more convenient than inserting pictures, in most cases.
>>>
>>Encoded characters can be searched for. Pictures (in this sense) cannot.
>>
>
> Your argument would make more sense in the case of large
> corpuses. But in the case of the Phaistos Disc, there is one
> single document, of just 241 total glyph instances --
> and every significant search turns up *exactly* the same text.
>
>
>
>>>That's three mentions of web searches. I suspect that this is a
>>>specious argument. What would be the point of searching on Phastian
>>>(?!) "text", given that it has no agreed meaning? Even when
>>>referring to existing academic references, would there be any point
>>>to searching for such text?
>>>
>>Not at all. No character or phrase in my own CSUR web page which
>>contains the entire Phaistos Text can be searched for. Google ignores
>>all of the characters.
>>
>
> This is, of course, erroneous. A search of "Phaistos Everson" not
> only turns up the page, it ranks it #1 for that search in Google.
>
> You are right that you can't do an internet search *into* the
> Phaistos Text, but like Mike, I maintain that *that* is irrelevant
> and pointless in this case. It is a *less* interesting search than
> searching on combinations of "Phaistos" plus author/researcher,
> for example.
>
>
>>>>* the Phaistos Disc characters are used at least as much as most of
>>>>the 40,000+ CJK-B characters
>>>>
>>>I see no merit in this argument.
>>>
>>I do.
>>
>
> Mike is right about this one, too.
>
>
>>>Chinese is a productive writing system for which Unicode has no
>>>productive model, therefore a large number of rarely used characters
>>>will be encoded for completeness.
>>>
>>I'm talking about thousands of characters which no one knows, which
>>no one uses, and some of them were never real characters used outside
>>of dictionaries.
>>
>
> The fact remains that a substantial majority of CJK Ext-B comes
> *from* its usage in Chinese dictionaries, as demonstrated by the
> 55,812 kIRGHanyuDaZidian records in Unihan.txt. Shrugging that off
> would be tantamount to shrugging off lexical evidence of English
> words cited in the OED.
>
>
>
>>>The value is measured not by the usage of a subset of characters,
>>>but by the usage of the writing system. Also, since the Extension B
>>>characters are genuine new adds, not compatibility characters, slow
>>>acceptance is to be expected as fonts, typing methods, and SMP
>>>support in general roll out slowly.
>>>
>>No one uses them.
>>
>
> This is just false. They are used in digital editions of Chinese
> classics, including, of course, the classic dictionaries.
>
>
>>>Also, I would like to know what you mean by "used" here.
>>>Specifically, how much of this is inline? Very little, I suspect.
>>>
>>Our proposal gives some examples. It doesn't take a whole lot of
>>imagination to see the utility of encoding these characters. And
>>Mike, if you don't want to use Phaistos Disc characters, that's
>>really fine with me. It's not an argument against their encoding,
>>however.
>>
>
> Nor is imagining the utility of encoding them much of an argument
> for their encoding.
>
> Karl Pentzlin got to the heart of the issue, in my opinion:
>
>
>>btw, I prefer the term "symbols" resp. "symbol set" to "characters"
>>resp. "script", as the Phaistos Disc is not yet proven to show
>>"characters" of a "script".
>>(This again is no counterargument for encoding, as Unicode already
>>contains a lot of symbols and symbol sets).
>>Speaking of "symbols" makes the "undeciphered" argument irrelevant,
>>as symbols stand for themselves.
>>Personally, I see the reason for encoding the Phaistos Disc symbol
>>set like the reason for encoding Chess symbols (U+2654...U+265F)
>>or Tai Xuan Jing symbols (U+1D300...U+1D35F): There is a well-defined
>>symbol set which is needed by a wide user community in plain text.
>>
>
> If you and John Jenkins would stop talking about the Phaistos Disc
> "script" and the "users of the script", there would be much less
> controversy about encoding.
>
> Furthermore, your suggested properties aggravate the problem,
> by claiming that these are all "letters" (gc=Lo). That claim
> has undesirable implications -- for example, they would automatically
> be included in the definition of identifiers.
>
> The straightforward way to approach a Unicode encoding is to
> encode the Phaistos 45 sign list *as* a sign list and be done
> with it. 45 symbols, used in discourse about the decipherment
> of this thingum from Crete. Add one more oblique stroke symbol
> for the people that want to talk about that mark on the basic
> 45 signs. The encodings of "punctuation" for a sign separator and
> start of text are just bogus. Those are artifacts of unrolling
> the delineation of text elements from the disc, and are no more
> needed encoded as "Phaistos script" punctuation characters than we
> needed characters to represent the lines that box out text on
> cuneiform tablets.
>
> So no script assignment.
>
> No letters. No punctuation.
>
> 45 + 1 signs. gc=So. PHAISTOS DISC SIGN YADDA YADDA
>
> And you'd have an encoding that would sail through.
>
> But what you and John are attempting here -- turning an undeciphered
> artifact into a Unicode script, with letters and punctuation that
> text processes implemented in Unicode systems would then
> start interacting with -- is, and I say it again, just bogus.
>
> --Ken
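
The identifier concern raised above (gc=Lo versus gc=So) can be sketched in Python. Python's identifier syntax follows the Unicode identifier recommendations, so it is used here only as one assumed example of a "definition of identifiers"; other systems may differ. The helper name `describe` and the choice of sample code points are illustrative: U+2654 and U+1D300 come from the symbol ranges cited in the thread, and U+20000 is the first CJK Extension B ideograph.

```python
import unicodedata


def describe(ch: str) -> tuple[str, bool]:
    """Return (General_Category, usable as a one-character identifier).

    `describe` is a hypothetical helper for this illustration only.
    """
    return unicodedata.category(ch), ch.isidentifier()


# Sample code points (an illustrative selection, not from any proposal):
samples = [
    "\u2654",      # chess symbol, from the range cited in the thread
    "\U0001D300",  # Tai Xuan Jing symbol, from the range cited in the thread
    "\U00020000",  # first CJK Extension B ideograph
]

for ch in samples:
    gc, ok = describe(ch)
    name = unicodedata.name(ch, "<unnamed>")
    print(f"U+{ord(ch):04X} {name}: gc={gc}, identifier={ok}")
```

In this sketch the two symbols (gc=So) are rejected as identifiers while the ideograph (gc=Lo) is accepted, which is the kind of automatic side effect a gc=Lo assignment would bring.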