Re: [Corpora-List] Interlingual Machine Translation Systems (fwd)

From: Sergey Protasov (svp@zuzino.net.ru)
Date: Sun Nov 21 2004 - 17:32:32 MET

    I evaluated the PROMT MT system, which describes itself as one of the
    best MT systems in the world. http://translate.ru

    I selected arbitrary sentences from a big Russian corpus (Moshkov library,
    lib.ru), translated them from Russian to English, and then translated
    the result from English back to Russian.
    Then I evaluated the difference in meaning between the two Russian
    sentences. If the difference in meaning is big enough, the translation
    of the sentence is incorrect.
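
A minimal sketch of this round-trip test, under stated assumptions. The translator callables `to_english` and `to_russian` are hypothetical placeholders for whatever MT system is under test; no real PROMT or SYSTRAN API is implied. In the test described here the difference in meaning was judged by a person; `token_distance` is only a crude stand-in (token overlap), and the threshold of 0.5 is an arbitrary illustrative value.

```python
import random
from typing import Callable, Sequence


def token_distance(a: str, b: str) -> float:
    """Stand-in for a human judgment: 1 minus the Jaccard overlap of tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def round_trip_accuracy(
    corpus: Sequence[str],
    to_english: Callable[[str], str],   # hypothetical ru -> en translator
    to_russian: Callable[[str], str],   # hypothetical en -> ru translator
    difference: Callable[[str, str], float] = token_distance,
    threshold: float = 0.5,             # assumed cut-off, not from the post
    sample_size: int = 100,
    seed: int = 0,
) -> float:
    """Fraction of sampled sentences whose round trip stays close in meaning."""
    rng = random.Random(seed)
    sample = rng.sample(list(corpus), min(sample_size, len(corpus)))
    if not sample:
        return 0.0
    correct = 0
    for sentence in sample:
        # Translate ru -> en -> ru and compare with the original sentence.
        back = to_russian(to_english(sentence))
        if difference(sentence, back) < threshold:
            correct += 1
    return correct / len(sample)
```

Note that a score computed this way measures the combined error of both translation directions, so it can understate the quality of a single direction.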

    And according to this test, no more than 1% of the translations are correct.

    By the way, on specialized texts the PROMT MT system can give 80-90%
    acceptable sentences...

    So we have:
    For any MT system there is a very small special field in which this MT
    system is very cool...

    And in this case, all MT systems are excellent MT systems (for a very
    small subject).
    So there is no "bad" MT system at all...

    So, if somebody has another definition of the quality of an MT system,
    please give it to me...

    -- 
    Sergey Protasov
    PhD student in Computational Linguistics,
    Moscow Institute of Physics and Technology
    

    Yorick WIlks wrote:
    > I did evaluations of SYSTRAN for the US AirForce back in 1980 or so and
    > it was getting then about 70-80 % of Russian sentences acceptably
    > correct into English and about 60% of other languages it did. SYSTRAN
    > has now been working for the European Commission at Luxemburg
    > translating between English/French/German for about 25 years and does
    > millions of words a week-they could not possibly function without it. I
    > am afraid your 1% guess has no relation to the facts at all---look at
    > the Commission's website.
    > YW
    >
    > On Sunday, November 21, 2004, at 09:53 AM, Sergey Protasov wrote:
    >
    >> Ok, Yorick,
    >>
    >> We should define the term "good" of MT systems.
    >>
    >> If we take arbitraty sentences from some very big not specialized
    >> english corpus and translate it, using expert-man-translator, we have
    >> about 80-90% correctly translated sentences.
    >> Let's define this as the best quality of translation.
    >>
    >> So "good" translation is about 45-50% of correct sentences.
    >>
    >> And satisfactory - about 20% of correct sentences. (logarithm scale)
    >>
    >> I think, Systran and any other MT system can translate correctly not
    >> more than one percent of sentences, arbitrary selected from big corpus.
    >>
    >> This is not "good" in any case, IMHO.
    >>
    >> If I wrong about 1% - let me know please.
    >>
    >> Im sorry for my bad english.
    >>
    >> Yorick WIlks wrote:
    >>
    >>> This reply from Russia is total nonsense, unless "good" means
    >>> something utterly impractical. There are many evaluated MT systems
    >>> that do a reasonable job (i.e. giving a good indication of what a
    >>> document says) and some are available free on search sites as well
    >>> all know. The world's oldest and strongest system SYSTRAN sometimes
    >>> does a very good job. recommending a 20 word MT system shows utter
    >>> ignorance of the last forty years.
    >>> Yorick Wilks
    >>> On Friday, November 19, 2004, at 12:18 PM, Sergey Protasov wrote:
    >>>
    >>>> Eric,
    >>>>
    >>>> There are no good MT systems today at all.
    >>>> So there are no good opensource MT systems today.
    >>>>
    >>>> ThoughtTreasure is very big system for teaching and it have bad
    >>>> syntax parser. (It fails, if senstence have more that 7-10 words)
    >>>>
    >>>> I recommend you to see link grammar translator for teaching.
    >>>> http://www.link.cs.cmu.edu/link/submit-to-translator.html
    >>>> It show very good translations, but It have 20 words in vocab only..
    >>>>
    >>>> You can add more words... But it is not trivial...
    >>>>
    >>>> If you intresting in statistical mashine translation, forget I said
    >>>> before and go to here
    >>>>
    >>>> http://www.isi.edu/licensed-sw/rewrite-decoder/
    >>>>
    >>>> It simple, but you will do not know how it works..
    >>>>
    >>>> --
    >>>> Sergey Protasov
    >>>> PhD student in Computational Linguistics,
    >>>> Moscow Institute of Physics and Technology
    >>>>
    >>>> Eric Atwell wrote:
    >>>>
    >>>>> Sergey,
    >>>>> do you have any evaluation report or other evidence of how good
    >>>>> this OpenSource MT system is? Bogdan Babych, researcher here at Leeds,
    >>>>> is thinking of developing a demo MT system for research and teaching,
    >>>>> but it may be worth considering adapting an existing oepn-source system
    >>>>> regards
    >>>>> Eric Atwell



    This archive was generated by hypermail 2b29 : Sun Nov 21 2004 - 17:42:02 MET