Re: [Corpora-List] fast scanner?

From: Sampo Nevalainen (samponev@cc.joensuu.fi)
Date: Thu Mar 13 2003 - 15:05:50 MET

    I suppose that the effectiveness of the software used (both the scanner
    software and the OCR program) is at least as important as, if not more
    important than, the speed of the scanner. I observed that when I changed
    to another OCR program (FineReader OCR system from ABBYY), the total time
    used for scanning decreased significantly. This is not an advertisement,
    but I am quite satisfied with FineReader, which was actually acknowledged
    as one of the three best OCR systems by PC Magazine (January 20, 1998). It
    is quite fast and accurate, and recognizes a couple of hundred languages
    (including those that use the Cyrillic alphabet); there is also spell checking
    available for several languages. I do not know for sure whether FineReader
    supports Arabic script, but at least at
    http://www.translation.net/arabsoft.html it is listed under Arabic
    scanning software. (See, for example, http://tev.itc.it/OCR/Products.html
    for other OCR programs.)

    At 11:15 13.3.2003 +0000, Eric Atwell wrote:
    >A super-fast scanner isn't much use without good OCR software - could I
    >please "piggy-back" on this query: can anyone recommend
    >good OCR (and/or handwritten script recognition?) for Arabic, and other
    >languages written in Arabic script, to use with a fast scanner in corpus
    >collection?
    >
    >thanks
    >
    >eric atwell, Leeds University

    ( : ============================================= : )

    Sampo Nevalainen, M.A.
    Researcher
    University of Joensuu
    Savonlinna School of Translation Studies
    P.O.Box 48
    FIN-57101 Savonlinna
    FINLAND

    tel +358-15-511 70 (operator)
             +358-15-511 7704
    fax +358-15-515 096
    email samponev@cc.joensuu.fi
    http://kvl.joensuu.fi


