Re: [Corpora-List] Interlingual Machine Translation Systems (fwd)

From: Sergey Protasov (svp@zuzino.net.ru)
Date: Sun Nov 21 2004 - 10:53:18 MET

    Ok, Yorick,

    We should define what "good" means for MT systems.

    If we take arbitrary sentences from some very large, non-specialized
    English corpus and have an expert human translator translate them,
    about 80-90% of the sentences are translated correctly.
    Let's define this as the best quality of translation.

    So a "good" translation is about 45-50% correct sentences.

    And a satisfactory one is about 20% correct sentences (on a
    logarithmic scale).
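
    To make this scale concrete, here is a minimal sketch in Python. It
    assumes each sentence has already been judged correct or not by a
    human evaluator. The function names, and the choice of the lower end
    of each range as a cutoff, are assumptions for illustration only and
    do not come from any existing evaluation tool.

```python
# Lower bounds for each quality band, taken from the figures above.
# Using the low end of each range as the cutoff is an assumption.
BANDS = [
    (0.80, "best"),
    (0.45, "good"),
    (0.20, "satisfactory"),
]


def correct_rate(judgements):
    """Return the fraction of sentences judged correct.

    `judgements` is a list of booleans, one per translated sentence,
    where True means a human evaluator accepted the translation.
    """
    if not judgements:
        raise ValueError("need at least one judged sentence")
    return sum(1 for ok in judgements if ok) / len(judgements)


def quality_label(rate):
    """Map a fraction of correct sentences to a quality band."""
    for lower_bound, label in BANDS:
        if rate >= lower_bound:
            return label
    return "below satisfactory"


if __name__ == "__main__":
    # Hypothetical sample: 1 correct sentence out of 100.
    sample = [True] + [False] * 99
    rate = correct_rate(sample)
    print(rate, quality_label(rate))
```

    Under these cutoffs, a system at the 1% level falls below every band.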

    I think Systran and any other MT system can correctly translate no
    more than one percent of sentences selected arbitrarily from a big
    corpus.

    This is not "good" in any case, IMHO.

    If I am wrong about the 1%, please let me know.

    I'm sorry for my bad English.

    Yorick Wilks wrote:
    > This reply from Russia is total nonsense, unless "good" means something
    > utterly impractical. There are many evaluated MT systems that do a
    > reasonable job (i.e. giving a good indication of what a document says)
    > and some are available free on search sites as we all know. The
    > world's oldest and strongest system SYSTRAN sometimes does a very good
    > job. Recommending a 20 word MT system shows utter ignorance of the last
    > forty years.
    > Yorick Wilks
    >
    >
    > On Friday, November 19, 2004, at 12:18 PM, Sergey Protasov wrote:
    >
    >>
    >> Eric,
    >>
    >> There are no good MT systems today at all.
    >> So there are no good opensource MT systems today.
    >>
    >> ThoughtTreasure is a very big system for teaching, and it has a bad
    >> syntax parser. (It fails if a sentence has more than 7-10 words.)
    >>
    >> I recommend you look at the link grammar translator for teaching.
    >> http://www.link.cs.cmu.edu/link/submit-to-translator.html
    >> It shows very good translations, but it has only 20 words in its
    >> vocabulary.
    >>
    >> You can add more words... But it is not trivial...
    >>
    >> If you are interested in statistical machine translation, forget
    >> what I said before and go here:
    >>
    >> http://www.isi.edu/licensed-sw/rewrite-decoder/
    >>
    >>
    >> It is simple, but you will not know how it works.
    >>
    >>
    >>
    >> --
    >> Sergey Protasov
    >> PhD student in Computational Linguistics,
    >> Moscow Institute of Physics and Technology
    >>
    >>
    >>
    >>
    >> Eric Atwell wrote:
    >>
    >>> Sergey,
    >>> do you have any evaluation report or other evidence of how good this
    >>> OpenSource MT system is? Bogdan Babych, researcher here at Leeds,
    >>> is thinking of developing a demo MT system for research and teaching,
    >>> but it may be worth considering adapting an existing open-source system
    >>> regards
    >>> Eric Atwell
    >>
    >>
    >>
    >>
    >
    >


