Corpora: Need for texts to evaluate named entity recognition software in En, Fr, De and Es

From: Ralf Steinberger
Date: Mon Mar 18 2002 - 18:14:46 MET

    We are looking for texts containing many named entities such as people's
    names, company names, names of organisations/authorities and geographical
    places in the languages English, French, German and Spanish.

    The texts will be used for the evaluation of named entity recognition
    software. Parallel texts (texts and their translations) would be preferred
    as they would make the evaluation easier. It is not strictly necessary that
    the named entities be marked up in the text.

    The evaluation will be carried out by a student, who is writing her Master's
    thesis on this subject, in collaboration with the EC's Joint Research
    Centre. The thesis will be made publicly available.

    Any hints are welcome. Thanks in advance.

    Ralf Steinberger
    European Commission, Joint Research Centre
    Institute for the Protection and Security of the Citizen (IPSC)
