Corpora: ELRA News

From: Valerie Mapelli (mapelli@elda.fr)
Date: Tue Feb 06 2001 - 11:35:24 MET

    [ We apologise for the duplicate posting of this announcement ]
    ___________________________________________________________
                                    ELRA
                    European Language Resources Association
                                   ELRA News
    ___________________________________________________________

                         *** ELRA NEW RESOURCES ***

    We are happy to announce new resources available via ELRA:

    - Telephone Speech Resources
    ELRA-S0094 Czech SpeechDat(E) Database
    ELRA-S0095 Slovak SpeechDat(E) Database
    ELRA-S0096 German SpeechDat(II) MDB-1000 Database

    - Written Corpus
    ELRA-W0026 PAROLE Irish Corpus

    A short description of each database is given below.
    Detailed information can be found on the following Website:
    http://www.elda.fr/catalog.html

    _______________________________________
    TELEPHONE SPEECH RESOURCES
    _______________________________________
    - ELRA-S0094 Czech SpeechDat(E) Database
    This database comprises 1052 Czech speakers (526 males,
    526 females) recorded over the Czech fixed telephone network.

    - ELRA-S0095 Slovak SpeechDat(E) Database
    This database comprises 1000 Slovak speakers (498 males,
    502 females) recorded over the Slovak fixed telephone network.

    - ELRA-S0096 German SpeechDat(II) MDB-1000 Database
    This database comprises 1295 German speakers (663 males,
    610 females, 22 speakers with gender not specified) recorded
    over the German mobile telephone network.
    _______________________________________
    WRITTEN CORPUS
    _______________________________________
    ELRA-W0026 PAROLE Irish Corpus
    This corpus consists of over 8 million words. The text is
    marked up in accordance with the PAROLE encoding standard.
    All the files are in SGML format with a detailed header and the
    body of the text tagged to paragraph level. A subset of the corpus
    is morpho-syntactically tagged. Approximately 3,000 manually
    checked words are included in this distribution.

    =====================================
    For further information, please contact:

          ELRA/ELDA                     Tel    +33 01 43 13 33 33
          55-57 rue Brillat-Savarin     Fax    +33 01 43 13 33 30
          F-75013 Paris, France         E-mail mapelli@elda.fr

    or visit our Web site:

          http://www.icp.grenet.fr/ELRA/home.html
          or http://www.elda.fr
    =====================================


