Corpora: German treebank sampler

From: TIGER corpus team (tigercorpus@ims.uni-stuttgart.de)
Date: Wed Sep 12 2001 - 10:45:43 MET DST

    The TIGER German treebank sampler has been released!
    ----------------------------------------------------

    A large, syntactically annotated corpus of German newspaper text
    is under construction in the TIGER project, with project partners
    in Saarbruecken, Potsdam, and Stuttgart.

    In order to get feedback from the research community, the TIGER project team
    has released a sampler of the TIGER corpus:

    http://www.ims.uni-stuttgart.de/projekte/TIGER/

    The TIGER corpus is annotated with 'syntax graphs', a generalization of
    syntax trees that can account for phenomena involving discontinuous
    constituents. For example:
    - long-distance dependencies are encoded by crossing edges
    - coreference in coordination is represented by 'secondary edges'
    More details of the annotation scheme are available online, where you can
    also explore the TIGER corpus sampler interactively.
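
To make the idea of a syntax graph concrete, here is a minimal sketch in Python. It is not the TIGER file format or tooling: the class `SyntaxGraph`, its fields, the node names, the edge labels (SB, HD, OA, OC, NK, CJ, CD) and both example sentences are assumptions chosen for illustration. A constituent whose tokens do not form a contiguous span is the kind of node that gives rise to crossing edges, and a secondary edge is stored separately from the primary tree.

```python
from dataclasses import dataclass, field


@dataclass
class SyntaxGraph:
    """Toy syntax graph: a tree of primary edges plus secondary edges."""
    tokens: list
    # parent node id -> list of (edge label, child); a child is either a
    # token index (int) or the id of another nonterminal node (str)
    children: dict = field(default_factory=dict)
    # (source node, edge label, target) triples outside the primary tree
    secondary: list = field(default_factory=list)

    def positions(self, node):
        """Return the set of token indices dominated by `node`."""
        if isinstance(node, int):
            return {node}
        covered = set()
        for _label, child in self.children[node]:
            covered |= self.positions(child)
        return covered

    def is_discontinuous(self, node):
        """True if the tokens under `node` do not form one contiguous span."""
        covered = self.positions(node)
        return max(covered) - min(covered) + 1 != len(covered)


# Hypothetical analysis of "Das Buch hat er gelesen": the VP groups the
# fronted object "Das Buch" with the participle "gelesen", skipping "hat er".
g = SyntaxGraph(
    tokens=["Das", "Buch", "hat", "er", "gelesen"],
    children={
        "NP": [("NK", 0), ("NK", 1)],
        "VP": [("OA", "NP"), ("HD", 4)],
        "S": [("OC", "VP"), ("HD", 2), ("SB", 3)],
    },
)

# Hypothetical analysis of "Er kommt und geht": the second conjunct has no
# overt subject, so a secondary edge links it to the shared subject "Er".
coord = SyntaxGraph(
    tokens=["Er", "kommt", "und", "geht"],
    children={
        "S1": [("SB", 0), ("HD", 1)],
        "S2": [("HD", 3)],
        "CS": [("CJ", "S1"), ("CD", 2), ("CJ", "S2")],
    },
    secondary=[("S2", "SB", 0)],
)

for node in ("NP", "VP", "S"):
    print(node, sorted(g.positions(node)), g.is_discontinuous(node))
print(coord.secondary)
```

In this sketch only the VP of the first sentence is reported as discontinuous, since its tokens (positions 0, 1 and 4) are interrupted by the finite verb and the subject. Keeping secondary edges in their own list leaves the primary structure a tree, so the recursive traversal stays simple.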

    ---
    The TIGER project team.
    Department of Computational Linguistics, Saarland University
    Institut fuer Germanistik, University of Potsdam
    Department of Natural Language Processing (IMS), University of Stuttgart
    email: tigercorpus@ims.uni-stuttgart.de
    


