From: Mike Ayers (mike.ayers@tumbleweed.com)
Date: Fri May 14 2004 - 13:50:20 CDT
> From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]On
> Behalf Of Peter Kirk
> Sent: Friday, May 14, 2004 6:35 AM
> On 13/05/2004 14:33, Kenneth Whistler wrote:
>
> >Peter Kirk noted:
> >...
> >
> >Mike Ayers is on the right track here, I believe. The scenarios
> >which people are adducing in arguing for interfiling should
> >be addressed instead by appropriately designed normalizations --
> >which can be implemented using fairly easy-to-program,
> >reusable scripts. Then sort on the *normalized* data using
> >a much, much simpler collation table to accomplish what you
> >need.
> Mike Ayers suggested that users should write Perl scripts.
Liar. I never advocate Perl, except as a final, desperate measure.
Nor did I say that anyone needed to write scripts. Normalization is
something that can and should be done by text processing applications -
users should only need to make the normalization tables.
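To illustrate the normalize-then-sort idea, here is a minimal sketch in Python. The table contents, the function names, and the three-letter Phoenician-to-Hebrew mapping are assumptions chosen only for illustration. The one thing a user would supply is the table; the application does the rest, and the sort itself uses nothing fancier than plain code point order on the normalized text.

```python
# Hypothetical user-supplied normalization table: Phoenician -> Hebrew.
# Only three letters are shown; a real table would cover the whole script.
NORMALIZATION_TABLE = {
    "\U00010900": "\u05D0",  # PHOENICIAN LETTER ALF  -> HEBREW LETTER ALEF
    "\U00010901": "\u05D1",  # PHOENICIAN LETTER BET  -> HEBREW LETTER BET
    "\U00010902": "\u05D2",  # PHOENICIAN LETTER GAML -> HEBREW LETTER GIMEL
}


def normalize(text, table=NORMALIZATION_TABLE):
    """Replace every character found in the table; leave the rest alone."""
    return text.translate(str.maketrans(table))


def sort_interfiled(words, table=NORMALIZATION_TABLE):
    """Sort the original words by their normalized form.

    The key is compared by plain code point order, standing in for the
    "much, much simpler collation table"; the stored data is unchanged.
    """
    return sorted(words, key=lambda word: normalize(word, table))


if __name__ == "__main__":
    # Hebrew gimel, Phoenician bet, Hebrew alef.
    words = ["\u05D2", "\U00010901", "\u05D0"]
    for word in sort_interfiled(words):
        print(" ".join(f"U+{ord(ch):04X}" for ch in word))
```

Sorted without the table, the Phoenician word would land after all the Hebrew ones, since its code point is higher; with the table it files between alef and gimel.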
> This is
> something which computer geeks may be able to do, but it is simply
> impossible for the rest of humanity including scholars of ancient
> languages.
This statement is grossly inconsistent with most linguistic scholars
I have met, directly or indirectly, and inconsistent with the part of general
humanity I have met. Have I really been that fortunate?
> Perl is not "God's gift to academic researchers" in general,
> although it may be God's gift to computer geeks.
Perl isn't God's gift to anything - it's Larry's gift to
cyberbrutality (which is, in its own twisted way, a Beautiful Thing).
> The other problem with this is that the large corpora to be searched are
> not necessarily directly available to the users for normalisation. I
> can't normalise the whole Internet before doing a Google search for a
> Coptic or Phoenician word. What I need is a search engine which can (at
> least as a tailoring) collate together Coptic and Greek, Phoenician and
> Hebrew.
Your issues with Google searching are best taken up with the folks
at Google. The default collation properties aren't going to matter there, I
suspect.
> It really would be far better, in the long run, if you said openly that
> anyone who continues to write Phoenician with Hebrew characters after the
> new block is accepted is wrong and breaking the standard, and should
> change their practices immediately.
No, it wouldn't. I know I should add something, but I'm just too
awed by the gap between reality and the perception you seem to have of it on
this issue.
> But then if you said that you would of course add a lot more flame to
> the fire, and you would be forced to consider properly whether such
> proposals as the separate Phoenician script have consensus support from
> the majority of regular professional users of the script.
...implying that a proper job is not being done now? I beg to
differ.
/|/|ike
This archive was generated by hypermail 2.1.5 : Fri May 14 2004 - 13:51:15 CDT