[Corpora-List] Summary: Parallel texts for MT evaluation

From: D Elliott (debe@comp.leeds.ac.uk)
Date: Fri Jun 13 2003 - 14:00:20 MET DST

    Dear all,

    Thanks to everyone who responded to my request for parallel texts with
    good quality human translations, suitable for my MT evaluation research.

    Here is a summary of resources available from the web:

    INTERSECT corpus
    FRENCH-ENGLISH:
    Le Monde, instructions for domestic appliances, technical and academic
    texts and others
    GERMAN-ENGLISH:
    Company home pages, news items, EU documents and more
    http://www.brighton.ac.uk/edusport/languages/html/intersect.html
    Thanks to Professor Raphael Salkie, University of Brighton, UK

    Proceedings of the European Parliament
    MANY EUROPEAN LANGUAGES INTO ENGLISH
    http://www.isi.edu/~koehn/publications/europarl/
    Thanks to Susana Sotelo Docío, Universidade de Santiago de Compostela

    OPUS corpus
    ENGLISH SOURCE TEXTS translated into French, Spanish, Swedish, German, and
    Japanese.
    Jörg Tiedemann and Lars Nygaard compiled the documentation of the office
    package OpenOffice[1] and the PHP[2] manual. The resulting corpus is OPUS
    - an open source parallel corpus.
    http://logos.uio.no/opus/
    [1] http://www.openoffice.org
    [2] http://www.php.net
    Thanks to Susana Sotelo Docío, Universidade de Santiago de Compostela

    UN Universal Declaration of Human Rights
    Many languages
    http://www.unhchr.ch/udhr/index.htm
    Thanks to Paul McNamee, Johns Hopkins University and Ella Earp-Lynch,
    SpeechWorks International

    Centers for Disease Control and Prevention (USA)
    Chinese, French, Japanese, Spanish info on SARS and many other medical
    topics
    http://www.cdc.gov/
    http://www.cdc.gov/ncidod/sars/languages.htm
    Thanks to Paul McNamee, Johns Hopkins University

    Debian free software community:
    Technical translations
    http://www.debian.org/international/
    Thanks to Paul McNamee, Johns Hopkins University

    Official Journal of the EU
    Freely downloadable European legislation in many languages
    http://europa.eu.int
    Thanks to Paul McNamee, Johns Hopkins University, Terence Lewis (Language
    Engineer) and Koen Kerremans

    Public registry of the Council of the EU
    PDF files in various languages. Translations indicate the source
    language.
    http://register.consilium.eu.int/
    Thanks to John Beaven

    COMPARA corpus
    English-Portuguese/Portuguese-English
    http://www.linguateca.pt/COMPARA/
    Thanks to Dr Ana Frankenberg-Garcia, Instituto Superior de Línguas e
    Administração, Lisboa, Portugal

    The Universal Declaration of Human Rights
    UNESCO's website also has most documents available in Spanish and
    French, and frequently in Russian, Chinese and Arabic

    French Foreign Ministry's magazine - Label France:
    French into various languages
    http://www.france.diplomatie.fr/label_france/index.html
    Thanks to Jeremy Whistle, University College Northampton

    ELRA newsletter
    In French and English
    www.elda.fr
    Thanks to Jeff Allen

    Multilingual articles:
    English version:
    http://www.multilingual.com/allen51.htm
    French translation:
    http://www.editionscle.com/bol/presse/article1/allen-mltc51-fr.htm
    English version: http://www.multilingual.com/allen53.htm
    French translation:
    http://www.editionscle.com/bol/presse/article2/allen-mltc53-fr.htm
    Thanks to Jeff Allen

    Haitian Creole version:
    http://hometown.aol.com/mit2haiti/JA-HC-kr.htm
    English version:
    http://hometown.aol.com/mit2haiti/JA-HC-eng.htm
    Thanks to Jeff Allen

    MIT2 website
    Marilyn Mason Bio & Publication List:
    http://hometown.aol.com/marilinc/Index3.html
    Creole Links Page:
    http://hometown.aol.com/mit2haiti/Index4.html
    The Creole Clearinghouse:
    http://hometown.aol.com/CreoleCH/Index6.html
    Thanks to Jeff Allen

    -- 
    ***************************************************
    Debbie Elliott
    Computer Vision and Language Research Group,
    School of Computing,
    University of Leeds,
    Leeds LS2 9JT
    United Kingdom.
    Tel: 0113 3436818
    Email: debe@comp.leeds.ac.uk
    Website (to be expanded):
    http://www.comp.leeds.ac.uk/cgi-bin/sis/ext/rs_pub.cgi/debe.html?cmd=displayrs
    ***************************************************
    


