Corpora: MT Evaluation: thanks

Derek Lewis (D.R.Lewis@exeter.ac.uk)
Mon, 24 Nov 1997 09:06:21 +0000 (GMT Standard Time)

Some weeks ago I asked for sources of information on
evaluation of Machine Translation.
The response was overwhelming, and I thank all of those who
helped. Below is a list of most of the items. I have not
been able to follow them all up yet, and some books/reports
are hard to get, but I hope the results are useful (I
have not included inactive web sites).
Thanks again.

-------------
http://www.ltg.ed.ac.uk/helpdesk/faq

http://www.ims.uni-stuttgart.de/info/MTTools.html

http://www.ilc.pi.cnr.it/EAGLES96/browse.html#wg3

Doug Arnold et al on MT (includes a chapter on evaluation)
http://clwww.essex.ac.uk/~doug/book/node74.html

http://issco-www.unige.ch/ewg95/ewg95.html

See relevant section in K. Sparck Jones and J.R. Galliers,
Evaluating natural language processing systems, Springer
Lecture Notes in AI 1083, 1996

Falkedal, K. (1994a). Evaluation methods for machine
translation systems: An historical overview and critical
account, Issco draft report, University of Geneva, Geneva.

Falkedal, K. (ed.) (1994b). Proceedings of the evaluators'
forum, Les Rasses, ISSCO, University of Geneva, Geneva.

JEIDA (1992). JEIDA methodology and criteria on machine
translation evaluation, JEIDA, Tokyo.

King, M. and Falkedal, K. (1990). Using test suites in the
evaluation of machine translation systems, Proceedings of
COLING-90, ACL, Helsinki, pp. 211-219.

MT (1994). Special issue on evaluation, Machine Translation 8(1-2).
Edited by D. Arnold, R.L. Humphreys and L. Sadler.

O'Connell, T., O'Mara, F. and White, J. (1994). The ARPA
MT evaluation methodologies: Evolution, lessons and further
approaches, Proceedings of the First Conference of the
Association for Machine Translation in the Americas,
Columbia, U.S.A.

Volk M.: Probing the Lexicon in Evaluating Commercial MT
Systems, in Cohen P.R. & Wahlster W. (eds.), Proc. 35th
Annual Meeting of the Association for Computational
Linguistics and of the 8th Conference of the European
Chapter of the Association for Computational Linguistics,
July 7-12, Madrid, Morgan Kaufmann, San Francisco, CA,
pp.112-119, 1997.

Nyberg E.H., Mitamura T., Carbonell J.G.: Evaluation
Metrics for Knowledge-Based Machine Translation, in
Proceedings of the 15th International Conference on
Computational Linguistics, Kyoto, Japan, pp.95-99, 1994.

Visser E.M., Fuji M.: Missing Sentence Connectors for
Evaluating MT output, in Proceedings of the 16th
International Conference on Computational
Linguistics, August 5-9, Copenhagen, Denmark, Center for
Sprogteknologi, Copenhagen, Denmark, pp.1066-1069, 1996.

Slocum J., Bennett W.S., Whiffin L., Norcross E.: An
Evaluation of METAL: the LRC Machine Translation System,
in 2nd Conference of the European
Chapter of the Association for Computational Linguistics,
Association for Computational Linguistics, 1985.

Dauphin E., Lux V.: Corpus-Based annotated Test Set for
Machine Translation Evaluation by an Industrial User, in
Proceedings of the 16th International
Conference on Computational Linguistics, August 5-9,
Copenhagen, Denmark, Center for Sprogteknologi,
Copenhagen, Denmark, pp.1061-1065, 1996.

King M., Falkedal K.: Using Test Suites in Evaluation of
Machine Translation Systems, in Karlgren H.(ed.),
Proceedings of the 13th International
Conference on Computational Linguistics, University of
Helsinki, Finland, 211-216, 1990.

Carter D., Becket R., Rayner M., Eklund R., MacDermid C.,
Wiren M., Kirchmeier-Andersen S., Philp C.: Translation
Methodology in the Spoken Language Translator: An
Evaluation, in Krauwer S., et al.(eds.), Spoken Language
Translation: Proceedings of a Workshop Sponsored by the
Association for Computational Linguistics and by the
European Network in Language and Speech (ELSNET),
Association for Computational Linguistics, Somerset, NJ,
pp.73-82, 1997.

Levine J., Mellish C.: The IDAS User Trials: Quantitative
Evaluation of an Applied Natural Language Generation
System, in Proceedings of the Fifth European Workshop on
Natural Language Generation, Leiden, the Netherlands,
pp.75-94, 1995.

Dale R.: Evaluating Natural Language Generation Systems,
in Thompson H.S.(ed.), The Strategic Role of Evaluation in
Natural Language Processing and Speech Technology,
University of Edinburgh, UK, Human Communication Research
Centre, 1992.

Pattabhiraman T., Cercone N.: Evaluating Natural Language
Generation Systems for Theoretical Merit: A Position
Statement, in Meteer M.(ed.), AAAI Workshop on Evaluating
Natural Language Generation Systems - Workshop Notes,
Cambridge, MA, 1990.

Coch J.: Evaluating and comparing three Text-production
Techniques, in Proceedings of the 16th International
Conference on Computational Linguistics, August 5-9,
Copenhagen, Denmark, Center for Sprogteknologi,
Copenhagen, Denmark, pp.249-254, 1996.

Meteer M.(ed.): AAAI Workshop on Evaluating Natural
Language Generation Systems - Workshop Notes, Cambridge,
MA, 1990.

Yorick Wilks, ‘SYSTRAN: it obviously works but how much can
it be improved?’ in Newton (ed.) Computers in Translation,
Routledge

The European Association for Machine Translation (EAMT)
offers a bibliographic service to all its members. This is
perhaps the most comprehensive and handy source.

-------------

Derek Lewis
German Department/Foreign Language Centre
Queens Building
University of Exeter
United Kingdom
EX4 4QH
Tel. 01392 264330
Fax. 01392 264339