Corpora: New Corpora from LDC

From: LDC Office (ldc@unagi.cis.upenn.edu)
Date: Mon Jun 19 2000 - 23:18:30 MET DST


    The Linguistic Data Consortium is pleased to announce two new
    corpora.

    HONG KONG LAWS PARALLEL TEXT
    http://morph.ldc.upenn.edu/Catalog/LDC2000T47.html

    This corpus was collected during January 1999 from
    http://www.justice.gov.hk, the bilingual website of the Department
    of Justice of the Hong Kong Special Administrative Region (HKSAR)
    of the People's Republic of China. The corpus, available from the
    LDC via FTP, consists of 313,659 parallel sentences in Chinese and
    English, which have been processed and sentence-aligned.

    HONG KONG NEWS PARALLEL TEXT
    http://morph.ldc.upenn.edu/Catalog/LDC2000T46.html

    This corpus, distributed via FTP, consists of parallel
    Chinese-English news articles that the LDC collected from the
    Information Services Department of the Hong Kong Special
    Administrative Region (HKSAR) of the People's Republic of China.
    The collection contains 18,147 aligned article pairs released by
    HKSAR from 1 July 1997 through 30 April 2000. The articles were
    aligned automatically at the LDC.

    Because of restrictions imposed by the copyright holders, these
    corpora are available only to LDC members for the 2000 membership
    year. If you would like
    to order a copy of these corpora, please email your request to
    <ldc@unagi.cis.upenn.edu>. If you need additional information
    before placing your order, or would like to inquire about
    membership in the LDC, please send email or call (215) 573-1275.

    Further information about the LDC and its available corpora can be
    accessed on the Linguistic Data Consortium WWW Home Page at URL:
    http://www.ldc.upenn.edu/


