Corpora: New LDC Corpora

From: LDC Office (ldc@ldc.upenn.edu)
Date: Fri Feb 22 2002 - 17:54:36 MET


                     * RST Discourse Treebank *

            * Multiple-Translation Chinese Corpus *

    The Linguistic Data Consortium (LDC) is pleased to announce the
    availability of the RST Discourse Treebank. This ftp publication
    has been authored by Lynn Carlson, Daniel Marcu, and Mary Ellen
    Okurowski. It contains a selection of 385 Wall Street Journal articles
    from the Penn Treebank which have been annotated with discourse
    structure in the framework of Rhetorical Structure Theory (RST).
    Additionally, the corpus includes a number of human-generated extracts
    and abstracts associated with the original documents.
     

    For further information, including a link to the discourse annotation
    tool used for this database, please visit:

    http://www.ldc.upenn.edu/Catalog/LDC2002T07.html

    Institutions that have membership in the LDC during the 2002
    Membership Year will be able to receive this corpus free of charge.
    Nonmembers may purchase this publication for $100.

                                *

    The Linguistic Data Consortium (LDC) would like to announce the
    availability of the Multiple-Translation Chinese Corpus. This ftp
    publication was designed to support the development of automatic means
    for evaluating translation quality. The corpus consists of 105 stories
    drawn from Mandarin Chinese journalistic text. These stories were
    translated several times into English by both human translators and MT
    systems.

    For further information, including a Chinese text with a sample English
    translation, please visit:

    http://www.ldc.upenn.edu/Catalog/LDC2002T01.html

    Institutions that have membership in the LDC during the 2002
    Membership Year will be able to receive this corpus free of charge.
    Nonmembers may purchase this publication for $400.

                               *

    If you need additional information before placing your order, or
    would like to inquire about membership in the LDC, please send email to
    <ldc@ldc.upenn.edu> or call (215) 573-1275.

    --------------------------------------------------------------------
    Linguistic Data Consortium        Phone: (215) 573-1275
    3615 Market Street                Fax:   (215) 573-2175
    Suite 200                         email: ldc@unagi.cis.upenn.edu
    Philadelphia, PA 19104-2608       www:   http://www.ldc.upenn.edu


