RE: [BULK] - Re: Interleaved collation of related scripts

From: Mike Ayers (mike.ayers@tumbleweed.com)
Date: Fri May 14 2004 - 13:50:20 CDT

    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]On
    > Behalf Of Peter Kirk
    > Sent: Friday, May 14, 2004 6:35 AM

    > On 13/05/2004 14:33, Kenneth Whistler wrote:
    >
    > >Peter Kirk noted:

    > >...
    > >
    > >Mike Ayers is on the right track here, I believe. The scenarios
    > >which people are adducing in arguing for interfiling should
    > >be addressed instead by appropriately designed normalizations --
    > >which can be implemented using fairly easy-to-program,
    > >reusable scripts. Then sort on the *normalized* data using
    > >a much, much simpler collation table to accomplish what you
    > >need.

    > Mike Ayers suggested that users should write Perl scripts.

            Liar. I never advocate Perl, except as a final, desperate measure.
    Nor did I say that anyone needed to write scripts. Normalization is
    something that can and should be done by text processing applications -
    users should only need to make the normalization tables.
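
    To make the table-driven idea concrete, here is a minimal sketch in Python of "normalize, then sort on the normalized data". It is an illustration under stated assumptions, not anyone's actual implementation: the names NORMALIZATION_TABLE, normalize and interfiled_sort are hypothetical, the table covers only three letters, the Phoenician code points assume the U+10900 block, and plain code point order stands in for the "much simpler collation table".

```python
# Hypothetical user-supplied normalization table: Phoenician -> Hebrew.
# Only three letters are shown; a real table would cover the whole script.
NORMALIZATION_TABLE = {
    "\U00010900": "\u05D0",  # PHOENICIAN LETTER ALF  -> HEBREW LETTER ALEF
    "\U00010901": "\u05D1",  # PHOENICIAN LETTER BET  -> HEBREW LETTER BET
    "\U00010902": "\u05D2",  # PHOENICIAN LETTER GAML -> HEBREW LETTER GIMEL
}

# Build the translation map once so the application can reuse it.
_TRANSLATION = str.maketrans(NORMALIZATION_TABLE)


def normalize(text):
    """Fold every character listed in the table onto its target character."""
    return text.translate(_TRANSLATION)


def interfiled_sort(words):
    """Sort the original words by their normalized form.

    Code point order on the normalized key is an assumption standing in
    for a real (but simple) collation table.
    """
    return sorted(words, key=normalize)
```

    The stored text is left untouched; only the sort key is normalized, so the user supplies nothing beyond the table itself.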

    > This is
    > something which computer geeks may be able to do, but it is simply
    > impossible for the rest of humanity including scholars of ancient
    > languages.

            This statement is grossly inconsistent with most linguistic scholars
    I have met, directly or indirectly, and inconsistent with the part of general
    humanity I have met. Have I really been that fortunate?

    > Perl is not "God's gift to academic researchers"
    > in general,
    > although it may be God's gift to computer geeks.

            Perl isn't God's gift to anything - it's Larry's gift to
    cyberbrutality (which is, in its own twisted way, a Beautiful Thing).

    > The other problem with this is that the large corpora to be
    > searched are
    > not necessarily directly available to the users for normalisation. I
    > can't normalise the whole Internet before doing a Google search for a
    > Coptic or Phoenician word. What I need is a search engine
    > which can (at
    > least as a tailoring) collate together Coptic and Greek,
    > Phoenician and
    > Hebrew.

            Your issues with Google searching are best taken up with the folks
    at Google. The default collation properties aren't going to matter there, I
    suspect.

    > It really would be
    > far better,
    > in the long run, if you said openly that anyone who continues
    > to write
    > Phoenician with Hebrew characters after the new block is accepted is
    > wrong and breaking the standard, and should change their practices
    > immediately.

            No, it wouldn't. I know I should add something, but I'm just too
    awed by the gap between reality and the perception you seem to have of it on
    this issue.

    > But then if you said that you would of course add a lot more flame to
    > the fire, and you would be forced to consider properly whether such
    > proposals as the separate Phoenician script have consensus
    > support from
    > the majority of regular professional users of the script.

            ...implying that a proper job is not being done now? I beg to
    differ.

    /|/|ike



    This archive was generated by hypermail 2.1.5 : Fri May 14 2004 - 13:51:15 CDT