Corpora: German treebank sampler

TIGER corpus team tigercorpus at ims.uni-stuttgart.de
Wed Sep 12 08:45:43 UTC 2001


The TIGER German treebank sampler has been released!
----------------------------------------------------

A large, syntactically annotated corpus of German newspaper text
is under construction in the TIGER project, with project partners
in Saarbruecken, Potsdam, and Stuttgart.

In order to get feedback from the research community, the TIGER project team
has released a sampler of the TIGER corpus:

http://www.ims.uni-stuttgart.de/projekte/TIGER/

The TIGER corpus is annotated with 'syntax graphs', a generalization of
syntax trees, so that it can account for phenomena involving
discontinuous constituents. For example:
- long-distance dependencies are encoded by crossing edges
- coreference in coordination is represented by 'secondary edges'

More details of the annotation scheme are available online, where you can
also explore the TIGER corpus sampler interactively.
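
To make the two phenomena above concrete, here is a minimal Python sketch of
a syntax graph. It is an illustration only and not the TIGER file format or
tool API: the class SyntaxGraph, its methods, the node names, the edge
labels, and both example sentences are assumptions made up for this sketch.
A constituent whose terminals are not adjacent in the sentence is the case
that shows up as crossing edges when the graph is drawn, and a secondary
edge is stored as an extra labelled link outside the tree.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SyntaxGraph:
    """Toy syntax graph: a tree of labelled edges plus secondary edges.

    Hypothetical structure for illustration; not the TIGER data model.
    """
    tokens: list[str]
    # node id -> list of (edge label, child); a child is either an int
    # (token position) or a str (id of another node)
    children: dict[str, list[tuple[str, object]]] = field(default_factory=dict)
    # extra links outside the tree: (edge label, source, target node)
    secondary: list[tuple[str, object, str]] = field(default_factory=list)

    def terminal_yield(self, node):
        """Return the sorted token positions dominated by a node."""
        if isinstance(node, int):
            return [node]
        positions = []
        for _label, child in self.children[node]:
            positions.extend(self.terminal_yield(child))
        return sorted(positions)

    def is_discontinuous(self, node):
        """True if the node's terminals do not form one contiguous span."""
        positions = self.terminal_yield(node)
        return positions != list(range(positions[0], positions[-1] + 1))

    def parents_of(self, child):
        """List (label, parent, is_secondary) for every edge into child."""
        found = [(label, parent, False)
                 for parent, edges in self.children.items()
                 for label, c in edges if c == child]
        found += [(label, target, True)
                  for label, source, target in self.secondary
                  if source == child]
        return found


# Fronted object: the VP covers "Das Buch" and "gelesen" but not "hat er".
topicalized = SyntaxGraph(
    tokens=["Das", "Buch", "hat", "er", "gelesen"],
    children={
        "NP": [("NK", 0), ("NK", 1)],
        "VP": [("OA", "NP"), ("HD", 4)],
        "S": [("OC", "VP"), ("HD", 2), ("SB", 3)],
    },
)

# Coordination: "Er" is the subject of S1 in the tree, and a secondary
# edge also marks it as the subject of S2.
coordinated = SyntaxGraph(
    tokens=["Er", "kommt", "und", "geht"],
    children={
        "S1": [("SB", 0), ("HD", 1)],
        "S2": [("HD", 3)],
        "CS": [("CJ", "S1"), ("CD", 2), ("CJ", "S2")],
    },
    secondary=[("SB", 0, "S2")],
)

print(topicalized.is_discontinuous("VP"))
print(coordinated.parents_of(0))
```

A plain tree could not express either case without extra machinery: the VP
would have to be split to keep its terminals adjacent, and "Er" could have
only one parent.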

---
The TIGER project team.
Department of Computational Linguistics, Saarland University
Institut fuer Germanistik, University of Potsdam
Department of Natural Language Processing (IMS), University of Stuttgart
email: tigercorpus at ims.uni-stuttgart.de


