Re: [Corpora-List] Interlingual Machine Translation Systems (fwd)

From: Yorick Wilks (yorick@dcs.shef.ac.uk)
Date: Sun Nov 21 2004 - 15:50:44 MET

    I did evaluations of SYSTRAN for the US Air Force back in 1980 or so, and
    it was then getting about 70-80% of Russian sentences acceptably
    correct into English, and about 60% for the other languages it handled.
    SYSTRAN has now been working for the European Commission in Luxembourg,
    translating between English/French/German, for about 25 years and does
    millions of words a week; they could not possibly function without it. I
    am afraid your 1% guess has no relation to the facts at all; look at
    the Commission's website.
    YW

    On Sunday, November 21, 2004, at 09:53 AM, Sergey Protasov wrote:

    > Ok, Yorick,
    >
    > We should define the term "good" for MT systems.
    >
    > If we take arbitrary sentences from some very big, non-specialized
    > English corpus and have an expert human translator translate them, we
    > get about 80-90% correctly translated sentences.
    > Let's define this as the best quality of translation.
    >
    > So a "good" translation is about 45-50% correct sentences.
    >
    > And satisfactory is about 20% correct sentences (logarithmic scale).
    >
    > I think Systran and any other MT system can correctly translate no
    > more than one percent of sentences arbitrarily selected from a big
    > corpus.
    >
    > This is not "good" in any case, IMHO.
    >
    > If I am wrong about the 1%, please let me know.
    >
    > I'm sorry for my bad English.
    >
    >
    > Yorick Wilks wrote:
    >> This reply from Russia is total nonsense, unless "good" means
    >> something utterly impractical. There are many evaluated MT systems
    >> that do a reasonable job (i.e. giving a good indication of what a
    >> document says) and some are available free on search sites, as we
    >> all know. The world's oldest and strongest system, SYSTRAN, sometimes
    >> does a very good job. Recommending a 20-word MT system shows utter
    >> ignorance of the last forty years.
    >> Yorick Wilks
    >> On Friday, November 19, 2004, at 12:18 PM, Sergey Protasov wrote:
    >>>
    >>> Eric,
    >>>
    >>> There are no good MT systems today at all.
    >>> So there are no good opensource MT systems today.
    >>>
    >>> ThoughtTreasure is a very big system for teaching, and it has a bad
    >>> syntax parser. (It fails if a sentence has more than 7-10 words.)
    >>>
    >>> I recommend you look at the link grammar translator for teaching.
    >>> http://www.link.cs.cmu.edu/link/submit-to-translator.html
    >>> It shows very good translations, but it has only 20 words in its
    >>> vocabulary.
    >>>
    >>> You can add more words... But it is not trivial...
    >>>
    >>> If you are interested in statistical machine translation, forget
    >>> what I said before and go here:
    >>>
    >>> http://www.isi.edu/licensed-sw/rewrite-decoder/
    >>>
    >>>
    >>> It is simple, but you will not know how it works.
    >>>
    >>>
    >>>
    >>> --
    >>> Sergey Protasov
    >>> PhD student in Computational Linguistics,
    >>> Moscow Institute of Physics and Technology
    >>>
    >>>
    >>>
    >>>
    >>> Eric Atwell wrote:
    >>>
    >>>> Sergey,
    >>>> do you have any evaluation report or other evidence of how good
    >>>> this OpenSource MT system is? Bogdan Babych, a researcher here at
    >>>> Leeds, is thinking of developing a demo MT system for research and
    >>>> teaching, but it may be worth considering adapting an existing
    >>>> open-source system
    >>>> regards
    >>>> Eric Atwell
    >>>
    >>>
    >>>
    >>>
    >
    >


