Re: ZWJ, ZWNJ and VS in Latin and other Greek-derived scripts

From: vunzndi@vfemail.net
Date: Fri Jan 26 2007 - 18:23:01 CST


    No degree of editorial intervention can fully compensate for a
    typist's lack of knowledge of a ligature, though. First one needs to
    resolve the question of how to encode something. Once this is
    resolved, there are many ways to solve the problem of input. Where
    large amounts of literature are involved, it is always worth
    considering OCR (Optical Character Recognition), for which
    blackletter type would seem not to pose any great problems. Of course
    one also needs a way to type, if only to correct OCR mistakes. Many
    ebooks are created in this way: OCR, then correction by hand. With
    literature over 100 years old, copyright is unlikely to be an issue,
    and printed documents are relatively easy to OCR compared to
    handwritten ones. One would need an OCR system that can learn; this
    would take time to set up, but once done it would speed up the whole
    process a great deal.

    John

    Quoting Asmus Freytag <asmusf@ix.netcom.com>:

    >>
    >> Now let's say that I have a text in typical modern German, which I
    >> decide I want to display in blackletter type (noting your accurate
    >> objection to use of the specific term fraktur). What degree of this
    >> conversion should I be able to rely on to be automated, and what
    >> degree will require editorial intervention *in the text*? I don't
    >> know the answer to that question, and I suspect it is something
    >> that could generate a good deal of debate.
    >>
    > The rules for the use of long s, and for ligatures (in German), both
    > require that you know the word boundaries inside a compound word. As
    > has been demonstrated on this list many times, there are cases where
    > even dictionary-based approaches must fail, because the same string of
    > letters can represent two different compound words (with different
    > location of the boundary).
    >
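
To illustrate the quoted point about compound boundaries, here is a minimal Python sketch. It assumes a deliberately simplified long s rule (a lowercase s is long unless it ends a component of the compound) and ignores the further exceptions and ligature rules of real German blackletter typesetting. The function name `apply_long_s` and the example word "Wachstube" are illustrative choices, not taken from the thread.

```python
LONG_S = "\u017f"  # U+017F LATIN SMALL LETTER LONG S


def apply_long_s(parts):
    """Join compound components, applying a simplified long s rule.

    `parts` is a list of components whose boundaries are already known,
    e.g. ["Wach", "stube"]. Simplifying assumption: a lowercase "s" is
    written long unless it is the last letter of its component.
    """
    out = []
    for part in parts:
        chars = list(part)
        for i, ch in enumerate(chars):
            if ch == "s" and i != len(chars) - 1:
                chars[i] = LONG_S
        out.append("".join(chars))
    return "".join(out)


# The same letter string, split at two different boundaries.
guard_room = apply_long_s(["Wach", "stube"])  # Wach + Stube
wax_tube = apply_long_s(["Wachs", "tube"])    # Wachs + Tube

print(guard_room)
print(wax_tube)
```

The two calls start from the identical string "Wachstube" yet produce different spellings, so under this simplified rule the correct form cannot be chosen from the letters alone; the boundary has to come from the editor or from context.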



