[Corpora-List] corpus alignment competition

From: Piao, Songlin (s.piao@lancaster.ac.uk)
Date: Mon Jun 07 2004 - 12:11:41 MET DST

    Hi,

    We are planning a project at Lancaster, UK, that aims to expand and align the EMILLE multilingual parallel corpora. If funded, the project will result in a parallel corpus of eleven languages (Arabic, Bangla, Chinese, English, Gujarati, Hindi, Panjabi, Polish, Somali, Urdu and Vietnamese). Once we have expanded the corpus, we want to hold an alignment competition on this data. Because the corpus covers a wide range of typologically distant languages, it should present a tough challenge to current alignment algorithms and tools, and hence provide an excellent opportunity to test them.

    At this stage we are asking for expressions of interest in taking part in the competition. We need them now because we intend to include a small amount of money in the project budget for each competing team, so that they can hire native speakers etc. to help tune their algorithms in advance of the competition proper. If you are interested in taking part, please let us know by contacting Dr. Scott Piao (s.piao@lancaster.ac.uk) in the first instance.

    Thank you,

    Paul Baker, Tony McEnery & Scott Piao
    -------------------------------------
    Dept. of Linguistics and MEL
    Lancaster University
    Lancaster LA1 4YT
    United Kingdom


