Re: PH technical issues (was RE: Why Fraktur is irrelevant

From: Peter Kirk (peterkirk@qaya.org)
Date: Fri May 28 2004 - 15:40:11 CDT


    On 28/05/2004 11:08, Peter Constable wrote:

    > ...
    >
    >>To me the answer to this argument is simple: plain text is intended to
    >>communicate semantic content only...
    >>
    >>
    >
    >The problem with your response is that it bypasses the question of what
    >should reasonably be considered semantic content. The premise of the
    >argument -- that the semantic content is the same -- is not valid as a
    >premise because it is precisely what is at stake. Your argument is
    >therefore circular: PH and sq H should not be considered semantically
    >different because they are not semantically different.
    >
    >

    Well, I understood the semantic content of a text to be the meaning of
    the words, not the indication of which script they are written in. Well,
    maybe it is not that simple, as in Latin script an all capitals word has
    more or less the same meaning as it does in lower case (as long as it is
    not German or with Fraktur glyphs, apparently). But a Hebrew or Moabite
    word has the same meaning whether it is written with Hebrew or
    Phoenician glyphs. That was my argument. Now you may wish to argue that
    plain text is intended to convey more information than that, including
    which script it is written in, but that again begs the question of
    what counts as a script distinction.

    >Obviously, we can choose to decide -- and will be so deciding -- whether
    >encoded characters for PH and sq Heb are considered to have the same or
    >different semantic content -- i.e. whether they are the same encoded
    >characters or different encoded characters. Our decision cannot be based
    >on a premise that they do or do not. It must be based on factors such
    >as:
    >
    >- whether users *perceive* the semantic content to be the same or
    >different (as that will affect their expectations of how IT systems
    >behave)
    >
    >- whether the IT needs of users overall will be best served by
    >considering the semantic content to be the same or different.
    >
    >

    I have argued that their perceptions will be well met and their needs
    will be well served by considering the semantic content to be subtly
    different in some clearly defined way; but they will not be well met and
    well served by considering the semantic content to be totally different
    and unrelated. So I look for a way in which the close semantic
    relationship between plain text Phoenician and plain text Hebrew is
    clearly specified, while accepting that for some purposes it is good to
    make some semantic distinction.

    >...
    >Her "Phoenician words" in this case are probably something like her
    >name, or a transliteration of English words. The suggestion is that they
    >would not be meaningful to her in square Hebrew glyphs, but that
    >suggestion (like your argument) presupposes what is considered semantic
    >content.
    >
    >
    >
    Is it really in the scope of Unicode to encode such trivialities? I have
    a key ring with my name "written" in an Egyptian hieroglyphic
    pseudo-alphabet. Will such abuse of Egyptian hieroglyphs have to be
    taken into account in the possible Unicode proposal for this script?
    Children invent all kinds of alphabets in which to write their names;
    will all of these have to be encoded in Unicode? If they want to go
    beyond what is current and useful in Unicode, let them use the PUA.

    >...
    >
    >
    >>If she wants to control its
    >>appearance, she should use graphics or PDF format, or at least HTML
    >>which will specify the font used on her computer
    >>
    >>
    >
    >So, your rebuttal is the same as David Starner's: use markup or non-text
    >representation. As I said to him, I think that is a greater
    >inconvenience to these users than character folding is to the Semitic
    >paleographers that consider the semantics the same.
    >
    >
    >
    Yes, and I disagree. Sally is almost certainly already using "rich
    text", with the markup hidden from her. So it is no inconvenience to
    include that markup in her e-mail; indeed it is more of an inconvenience
    to remove it deliberately.

    >
    >
    >>The scenario you are looking at is actually a rather unlikely one.
    >>
    >>
    >
    >Fine. So you reject this scenario. Let's move on. As I said, it was just
    >an example that tried to get away from paleography. You and David have
    >said to use markup or don't use text. I've suggested that's probably a
    >greater burden than character folding. I think we should then try to
    >consider whether other more reasonable scenarios would lead us to accept
    >or reject that position.
    >
    >
    >
    Well, if anyone has another scenario to propose, let's see it.

    >
    >
    >
    >>On 28/05/2004 02:56, Christopher Fynn wrote:
    >>
    >>
    >
    >Is there a reason why you have started grouping your responses together?
    >I don't at all care for it.
    >
    >
    >
    I have been asked to restrict my number of postings to this list, but
    not their length. That is why. If you don't like it, ask the list
    moderators to change their policy.

    >
    >
    >>>If this is "trivial" for scholarly users then using a tailoring to
    >>>achieve interleaved collation and / or folding wouldn't be difficult
    >>>for them either.
    >>>
    >>>
    >>I disagree. Tailoring is possible, but it is far more complicated than
    >>adding a script or scribe name tag to a database.
    >>
    >>
    >
    >I know that you use Shoebox or Toolbox for linguistic corpus data. In
    >that context, this amounts to setting up a new language configuration,
    >or a new sort order and character classes for a given language. If you
    >do the former, you will end up doing the latter. So, for you, both are
    >possible and comparable in terms of difficulty.
    >
    >
    >
    Well, I have used Shoebox and Toolbox. I have also used your company's
    products, which at least allow me to add a script name field to my
    database but don't allow me to tailor collations. But I was thinking in
    terms of tailored collation weights for the Unicode collation algorithm.
    These are much more complex than setting up a new language configuration
    for Shoebox or Toolbox.
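
    To illustrate the simpler of the two options, here is a minimal sketch
    of the script-tag approach, assuming the words are stored with the
    existing Hebrew code points and each record carries a script field.
    The Entry type, the field names and the tag values "Hebr" and "Phnx"
    are hypothetical, chosen only for illustration, and sorting by code
    point is only a rough stand-in for real collation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    text: str    # word stored with Hebrew code points (unified encoding)
    script: str  # hypothetical script tag kept in a separate database field


# Sample records; the words avoid final-form letters to keep the sketch simple.
ENTRIES = [
    Entry("\u05d1\u05ea", "Hebr"),  # bet, tav
    Entry("\u05d0\u05d1", "Phnx"),  # alef, bet (tagged as Phoenician)
    Entry("\u05d0\u05dc", "Hebr"),  # alef, lamed
    Entry("\u05d0\u05d1", "Hebr"),  # alef, bet (tagged as Hebrew)
]


def interleaved(entries):
    """Sort by the text itself, using the script tag only as a tie-breaker.

    Because both scripts share one encoding, no collation tailoring is
    needed to make Phoenician and Hebrew words sort together.
    """
    return sorted(entries, key=lambda e: (e.text, e.script))


def only_script(entries, script):
    """Select the records carrying a given script tag."""
    return [e for e in entries if e.script == script]


if __name__ == "__main__":
    for entry in interleaved(ENTRIES):
        print(entry.script, entry.text)
```

    With a unified encoding the interleaved order falls out of an ordinary
    sort, and the script distinction is still recoverable from the tag.
    With separate encodings the same order would need tailored collation
    weights, which is the more complex task I had in mind.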

    >
    >
    >>Anyway, D. Starner's
    >>requirement for detailed script marking will not be met by defining a
    >>separate Phoenician script.
    >>
    >>
    >
    >That's not really relevant. No matter what we decide here, the situation
    >David has described would still require detailed script marking. That is
    >*not* a problem scenario that the PH proposal (or any proposal for *any*
    >script) was intended to solve.
    >
    >
    >
    My point precisely.

    On 28/05/2004 11:21, Mike Ayers wrote:

    > ...
    >
    > > And it would have worked with HTML e-mail, as long as Latisha had the
    > > same fonts installed as Sally. This does not take more work
    > > to prepare.
    >
    > Debatable. However, for mailing lists, there exist strip-bots
    > that remove all HTML and/or RTF markup. With increasing problems with
    > email borne viri and vulnerabilities in/due to HTML, the prospect of
    > increased strip-bot activity is very real. HTML is, as it has
    > always been, not a fix-all.
    >
    >
    Well, there is no such strip-bot on this list. And the scenario
    mentioned nothing about lists. There is a clear user demand to be able
    to send e-mail with font markup. If HTML proves to be an inappropriate
    technology for this, an alternative will surely appear.

    On 28/05/2004 11:28, Mike Ayers wrote:

    > ...
    >
    > Ummm - let me get this right. Some people who are using these
    > characters tell us that they need to fundamentally distinguish them
    > from Hebrew characters, but that's not a good case. A hypothetical
    > situation, however, could convince you? I'm truly baffled.
    >
    >
    A. I need to distinguish Phoenician characters from Hebrew.

    B. OK, let's evaluate your requirement, to see if you really need a
    plain text distinction, or if your need can be met by markup. Please
    give us a sample scenario in which you need to make a distinction.

    A. No. I say that I need a distinction, and you can't prove that I don't.

    B. That is not acceptable. You need to give us a sample scenario, and
    convince us that in this scenario a plain text distinction is needed.
    Otherwise your proposal will not be acceptable.

    A. I'm not talking to you guys any more. Goodbye.

    Do you understand now what I am asking for? Simply evidence in favour of
    the original proposal, something which can be evaluated, instead of a
    lot of personal opinions and hearsay. I am not looking for a
    hypothetical situation. I am looking for a real one, in which someone is
    inconvenienced, or likely to be so in future, by the lack of a plain
    text distinction. The Sally and Latisha scenario is the nearest anyone
    has come to this. Perhaps we can get further. But it would really help
    if this came from the proposer, so that he would not have to be
    second-guessed by others who do not know the situation so well.

    -- 
    Peter Kirk
    peter@qaya.org (personal)
    peterkirk@qaya.org (work)
    http://www.qaya.org/
    

