Re: [Corpora-List] Interlingual Machine Translation Systems (fwd)

From: Gilles Serasset (Gilles.Serasset@imag.fr)
Date: Sun Nov 21 2004 - 18:13:49 MET

    Sorry Serguei, but your mail is based on naive ideas and false
    assumptions.

    On 21 nov. 04, at 10:53, Sergey Protasov wrote:
    > We should define the term "good" of MT systems.
    >
    > If we take arbitrary sentences from some very big non-specialized
    > English corpus and have an expert human translator translate them, we
    > get about 80-90% correctly translated sentences.
    > Let's define this as the best quality of translation.

    Which would mean that current measures (BLEU, ORANGE, ...) should rank
    human translators as the top "systems". Apparently, that is not the case.

    > So "good" translation is about 45-50% of correct sentences.

    THIS is naive: 100% incorrect but "understandable" sentences are
    better than 50% totally unintelligible sentences (especially if it
    is the 50% of sentences that are more than 7 or 8 words long...).
    Moreover, this does not take into account the purpose of the system.
    For example, SYSTRAN will be considered a very bad system for the
    translation of meteorological bulletins, whereas METEO will be
    considered VERY GOOD (by your definition...). However, METEO will never
    be considered a good system for wide-coverage applications, where
    SYSTRAN will be considered good.

    Also, we should distinguish usage, coverage, quality and potential (the
    amount of effort that is needed to raise one of the criteria).

    > I think Systran and any other MT system can correctly translate no
    > more than one percent of sentences arbitrarily selected from a big
    > corpus.

    Well, even if that were the case (which I doubt, if such an evaluation
    is done on a fair basis), SYSTRAN would still be useful. The proof
    being that, well, it IS used by many.

    > This is not "good" in any case, IMHO.
    >
    Well, 2 months ago, I was going to Japan and wanted to know the
    directions to Okayama University. The "how to get there" page was only
    available in Japanese... So I asked Systran to translate it into
    English. I'm sure that the English was bad, and I don't read Japanese,
    and English is not my mother tongue, but still, I managed to get where
    I wanted to go.

    This is not "bad" in any case, IMHO.

    If you want to have a look at Russian

    Finally, if you are speaking about statistical MT, forget what I said,
    as I don't know ANY statistical MT system that is used daily.

    --
    Gilles Sérasset
    GETA-CLIPS-IMAG (UJF, INPG & CNRS)
    BP 53 - F-38041 Grenoble Cedex 9
    Phone: +33 4 76 51 43 80
    Fax:   +33 4 76 44 66 75
    


