[Corpora-List] Call for Participation: Time Expression Recognition and Normalization (TERN) Evaluation

From: Lisa Ferro (lferro@mitre.org)
Date: Thu May 20 2004 - 16:08:46 MET DST

    (Apologies to those who receive more than one copy)

    **********

    CALL FOR PARTICIPATION: Time Expression Recognition and Normalization
    (TERN) Evaluation, April-September 2004

    Sponsored by the Automatic Content Extraction (ACE) program

    **********

    INTRODUCTION

    The objective of the Automatic Content Extraction (ACE) program is to
    develop natural language processing technology to support automatic
    understanding of textual data. This includes classification, filtering,
    and selection based on the meaning conveyed by the data. Thus, the ACE
    program requires the development of technologies that automatically
    detect and characterize this meaning.

    The Time Expression Recognition and Normalization (TERN) evaluation is
    based on work that began in 1999 to establish a set of useful guidelines
    for text annotation and data interchange. The guidelines define a tag
    called TIMEX2, including attributes for expressing the normalized,
    intended meaning or value of a broad range of temporal expressions. The
    work extends the Message Understanding Conferences' definition of the
    TIMEX category of named entity to include a broader variety of
    expressions and to offer a normalization scheme.

    TIMEX2 is influencing the definition of ACE tasks, in which temporal
    expressions covered by TIMEX2 will contribute to filling temporal
    attributes for extracted relations and events. Thus, the production of
    TIMEX2 annotations is viewed as an ACE component technology. The TERN
    evaluation is open to sites that want to develop this type of component
    technology. The evaluation will be offered in both English and Chinese.

    TASK DEFINITION

    The TIMEX2 task requires that temporal expressions mentioned in the
    source data be detected and normalized according to the "2003 Standard
    for the Annotation of Temporal Expressions" by Ferro et al., as updated
    and posted on the project website (http://timex2.mitre.org). Guidelines
    that are particular to Chinese are documented (with extensive examples)
    in a separate supplement.

    Temporal expressions to be marked include both absolute expressions
    ("July 17, 1999", "12:00", "the summer of '69") and relative expressions
    ("yesterday", "last week", "the next millennium"). Also markable are
    durations ("one-hour", "two weeks"), event-anchored expressions ("two
    days before departure"), and sets of times ("every week"). The degree
    to which these expressions can be normalized given the current TIMEX2
    guidelines varies according to the type and specificity of the
    expression.
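
    As an illustration only, the following minimal sketch tags one narrow
    class of absolute expression ("Month D, YYYY") and assigns an ISO-style
    VAL attribute. The function name tag_absolute_dates and the regular
    expression are assumptions made for this example; they are not part of
    the TIMEX2 guidelines, which cover far more expression types than this
    sketch handles.

```python
import re

# Month names mapped to two-digit month numbers for ISO-style values.
MONTHS = {name: f"{i:02d}" for i, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

# Matches only the "Month D, YYYY" pattern; relative expressions, durations,
# and sets are deliberately out of scope for this sketch.
DATE_RE = re.compile(r"\b(" + "|".join(MONTHS) + r") (\d{1,2}), (\d{4})\b")


def tag_absolute_dates(text):
    """Wrap each 'Month D, YYYY' date in a TIMEX2 tag with a YYYY-MM-DD VAL."""
    def wrap(match):
        month, day, year = match.groups()
        val = f"{year}-{MONTHS[month]}-{int(day):02d}"
        return f'<TIMEX2 VAL="{val}">{match.group(0)}</TIMEX2>'
    return DATE_RE.sub(wrap, text)


print(tag_absolute_dates("The launch took place on July 17, 1999."))
```

    Relative expressions such as "yesterday" are left untouched here, since
    resolving them requires a reference date that this sketch does not model.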

    DATA

    Annotated training and test data are being prepared by the MITRE
    Corporation, under the supervision of the SPAWAR Systems Center. The
    text corpora to be used for evaluation are drawn from those selected for
    the basic ACE 2004 evaluation tasks (Entity Detection and Tracking,
    Relation Detection and Characterization). Both training and test sets
    are drawn from broadcast news and newswire sources. The Linguistic Data
    Consortium (LDC) is managing the distribution of the training materials
    to TERN participants.

    EVALUATION

    Scores will be reported in terms of precision, recall, and F-measure, as
    well as in error-based terms of undergeneration, overgeneration,
    substitution and overall error. The National Institute of Standards and
    Technology (NIST) is responsible for administering the TERN evaluation,
    using scoring software prepared by the MITRE Corporation.
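
    The announcement does not give the formulas used by the scoring
    software. The sketch below assumes the conventional MUC-style
    definitions of these measures, computed from counts of correct,
    incorrect, missing, and spurious items; the function name score and
    its parameters are hypothetical and are not taken from the official
    scorer.

```python
def score(correct, incorrect, missing, spurious):
    """Compute MUC-style measures from raw counts (assumed definitions)."""
    possible = correct + incorrect + missing   # items in the reference
    actual = correct + incorrect + spurious    # items in the system output
    total = correct + incorrect + missing + spurious

    def ratio(numerator, denominator):
        # Guard against empty denominators instead of raising an error.
        return numerator / denominator if denominator else 0.0

    precision = ratio(correct, actual)
    recall = ratio(correct, possible)
    return {
        "precision": precision,
        "recall": recall,
        # Balanced F-measure: harmonic mean of precision and recall.
        "f_measure": ratio(2 * precision * recall, precision + recall),
        "undergeneration": ratio(missing, possible),
        "overgeneration": ratio(spurious, actual),
        "substitution": ratio(incorrect, correct + incorrect),
        "error": ratio(incorrect + missing + spurious, total),
    }


for name, value in score(correct=80, incorrect=10, missing=10, spurious=20).items():
    print(f"{name}: {value:.3f}")
```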

    Three aspects of TIMEX2 performance will be measured:
         * Detection (correct/missing/spurious): whether a markable
    expression is detected and given a TIMEX2 tag
         * Text (correct/incorrect): the byte offsets of the markable
    expression (extent)
         * Attributes (correct/incorrect/missing/spurious): the values
    assigned to each of the attributes (VAL, MOD, ANCHOR_DIR, ANCHOR_VAL,
    SET) within the TIMEX2 tag

    Sites that are not prepared to undertake the Attributes (normalization)
    portion of the task may elect to be evaluated only on the Detection and
    Text aspects.
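
    The three aspects above can be sketched as a comparison between
    reference and system annotations. This is an assumption-laden
    illustration, not the official scorer: the dictionary layout, the
    end-exclusive offsets, and the rule of pairing a reference expression
    with the first overlapping system expression are all choices made for
    this example.

```python
from collections import Counter

ATTRS = ("VAL", "MOD", "ANCHOR_DIR", "ANCHOR_VAL", "SET")


def overlaps(a, b):
    """True if two spans share at least one byte (end offsets exclusive)."""
    return a["start"] < b["end"] and b["start"] < a["end"]


def compare(reference, system):
    """Tally Detection, Text, and Attributes outcomes for one document."""
    detection, text, attributes = Counter(), Counter(), Counter()
    unmatched = list(system)
    for ref in reference:
        # Assumed alignment rule: pair with the first overlapping system span.
        match = next((s for s in unmatched if overlaps(ref, s)), None)
        if match is None:
            detection["missing"] += 1
            continue
        unmatched.remove(match)
        detection["correct"] += 1
        same_extent = (ref["start"], ref["end"]) == (match["start"], match["end"])
        text["correct" if same_extent else "incorrect"] += 1
        for name in ATTRS:
            r, s = ref["attrs"].get(name), match["attrs"].get(name)
            if r is None and s is None:
                continue  # attribute absent on both sides: nothing to score
            if r == s:
                attributes["correct"] += 1
            elif r is None:
                attributes["spurious"] += 1
            elif s is None:
                attributes["missing"] += 1
            else:
                attributes["incorrect"] += 1
    # System spans that matched no reference expression are spurious.
    detection["spurious"] += len(unmatched)
    return detection, text, attributes


reference = [
    {"start": 0, "end": 13, "attrs": {"VAL": "1999-07-17"}},
    {"start": 30, "end": 39, "attrs": {"VAL": "2004-05-19"}},
]
system = [
    {"start": 0, "end": 13, "attrs": {"VAL": "1999-07-17", "MOD": "APPROX"}},
    {"start": 50, "end": 55, "attrs": {}},
]
detection, text, attributes = compare(reference, system)
print(dict(detection), dict(text), dict(attributes))
```

    A site evaluated only on Detection and Text would simply ignore the
    third tally.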

    SCHEDULE

    Now: Interested sites may obtain evaluation information from
          http://timex2.mitre.org and
          http://www.nist.gov/speech/tests/ace/ace04/index.htm
    April 12-July 1: Increments of training data released by LDC to sites
    that respond to this CFP (see below)
    June 30: Last day to register as evaluation participant (see below)
    August 2-13: Evaluation text corpus (English/Chinese) available to
    participants. Participants must return results (system output) to NIST
    within 24 hours of receipt of the evaluation corpus.
    August 13: Last day for participants to submit official results to NIST
    September (date TBD): Evaluation scores released by NIST to
    participants
    September 23: One-day meeting in conjunction with ACE workshop

    HOW TO RESPOND TO THIS CFP

    Organizations that are considering participation in the TERN evaluation
    should do the following at this time:
         1. Send the information requested below by email to
    Mark.Przybocki@Nist.Gov:
            Organization name:
            Contact name:
            Contact email address:
            Contact telephone number:
            Shipping Address (including street address and main telephone
    number):
    By providing this information, your organization will become eligible to
    receive the training and evaluation corpora from the LDC.

         2. Subscribe to the ACE_TERN mailing list by following the
    instructions at NIST's ACE website at
    http://www.nist.gov/speech/tests/ace/ace04/index.htm.

    In order to register your organization as an official participant in the
    TERN evaluation and to receive an invitation to attend the TERN
    workshop, please visit the above NIST website for specific registration
    information. You will need to submit the completed TERN evaluation
    registration form, which is available for download from that site, prior
    to June 30, as stated in the registration form.


