RE: Corpora: German treebank sampler and Portuguese treebank

From: Santos Diana (Diana.Santos@informatics.sintef.no)
Date: Thu Sep 13 2001 - 13:32:08 MET DST

    Dear TIGER team,

    Thank you for the wealth of information you put on-line in connection with
    the TIGER project.

    In this connection, we would also like to inform you that a project to
    create a treebank for Portuguese is under way: the Floresta
    Sintá(c)tica project,
    http://cgi.portugues.mct.pt/treebank/PaginaFloresta.html (a joint project
    between VISL http://visl.sdu.dk/ and the Computational Processing
    of Portuguese project http://www.portugues.mct.pt/).

    Our pages also offer a sampler, together with documentation of the
    general and specific linguistic choices we made. We have also developed a
    special querying tool for the syntactically annotated trees, built on top
    of the IMS Corpus Workbench developed by the IMS at the University of
    Stuttgart. This is work in progress, but it can be tested at
    http://cgi.portugues.mct.pt/treebank/ProcuraArvores.html

    I should also mention that the Computational Processing of Portuguese
    project gives access to several million (automatically) syntactically
    annotated words through the AC/DC project (again joint work with VISL), at
    http://cgi.portugues.mct.pt/acesso.
     
    We would therefore be grateful if you updated your "related links" page
    accordingly.

    I take this opportunity to inform the corpora community at large as well,
    although most of the Web pages referred to are (so far) only in Portuguese.

    Best greetings,
    Diana (for the Floresta and AC/DC teams)
    ************************************************************************
    Diana Santos                    Computational processing of Portuguese
    SINTEF Telecom & Informatics    Tel. (direct line) +47 22 06 73 12
    Forskningsveien 1               Tel. +47 22 06 73 00
    Box 124 Blindern                Fax. +47 22 06 73 50
    N-0314 Oslo                     Email: Diana.Santos@informatics.sintef.no
    Norway                          http://www.portugues.mct.pt/
    ************************************************************************

    > -----Original Message-----
    > From: TIGER corpus team [mailto:tigercorpus@ims.uni-stuttgart.de]
    > Sent: 12. september 2001 10:46
    > To: corpora@hd.uib.no
    > Subject: Corpora: German treebank sampler
    >
    >
    >
    > The TIGER German treebank sampler has been released!
    > ----------------------------------------------------
    >
    > A large syntactically annotated corpus of German newspaper text
    > is under construction in the TIGER project - with project partners
    > in Saarbruecken, Potsdam, and Stuttgart.
    >
    > In order to get feedback from the research community, the TIGER
    > project team has released a sampler of the TIGER corpus:
    >
    > http://www.ims.uni-stuttgart.de/projekte/TIGER/
    >
    > The TIGER corpus is annotated with 'syntax graphs', a generalization
    > of syntax trees, in order to be able to account for phenomena
    > involving discontinuous constituents. E.g.
    > - long distance dependencies are encoded by crossing edges
    > - coreference in coordination is represented by 'secondary edges'
    > More details of the annotation scheme are available online, where
    > you can also explore the TIGER corpus sampler interactively.
    >
    > ---
    > The TIGER project team.
    > Department of Computational Linguistics, Saarland University
    > Institut fuer Germanistik, University of Potsdam
    > Department of Natural Language Processing (IMS), University of Stuttgart
    > email: tigercorpus@ims.uni-stuttgart.de
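
The 'syntax graphs' described in the quoted announcement can be pictured with a small sketch. What follows is only an illustration under stated assumptions, not the TIGER project's data format or tooling: the `Node` class, the `terminal_yield` and `is_discontinuous` helpers, the two example sentences and the labels attached to them are all hypothetical. Primary edges form the tree, a constituent whose tokens are not adjacent in the sentence is the kind that needs crossing edges, and a secondary edge links a shared element to a second conjunct without making it part of that conjunct's yield.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One node of a toy syntax graph (hypothetical structure)."""
    label: str
    token_index: Optional[int] = None  # set only for terminals (words)
    children: List["Node"] = field(default_factory=list)  # primary edges
    # Secondary edges as (edge label, target node); they add no tokens
    # to this node's yield.
    secondary: List[Tuple[str, "Node"]] = field(default_factory=list)


def terminal_yield(node: Node) -> List[int]:
    """Sorted token positions reached from node via primary edges only."""
    if node.token_index is not None:
        return [node.token_index]
    positions: List[int] = []
    for child in node.children:
        positions.extend(terminal_yield(child))
    return sorted(positions)


def is_discontinuous(node: Node) -> bool:
    """True if the node's tokens do not form one contiguous span."""
    positions = terminal_yield(node)
    return any(b - a > 1 for a, b in zip(positions, positions[1:]))


# Illustrative sentence (not from the corpus): "Das Buch habe ich gelesen"
das, buch = Node("ART", 0), Node("NN", 1)
habe, ich, gelesen = Node("VAFIN", 2), Node("PPER", 3), Node("VVPP", 4)
np_ = Node("NP", children=[das, buch])
vp = Node("VP", children=[np_, gelesen])  # tokens 0, 1 and 4: a gap
s = Node("S", children=[habe, ich, vp])

# Illustrative coordination: "Hans kocht und isst"
hans, kocht = Node("NE", 0), Node("VVFIN", 1)
und, isst = Node("KON", 2), Node("VVFIN", 3)
s1 = Node("S", children=[hans, kocht])
s2 = Node("S", children=[isst])
s2.secondary.append(("SB", hans))  # shared subject via a secondary edge
cs = Node("CS", children=[s1, und, s2])

print(terminal_yield(vp), is_discontinuous(vp))
print(terminal_yield(s2), [(lab, n.label) for lab, n in s2.secondary])
```

In this sketch the VP covers "Das Buch" and "gelesen" but not the two words in between, so drawing it over the sentence would produce crossing edges; the second conjunct in the coordination keeps a yield of one token while still recording its subject.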