Corpora: German treebank sampler
TIGER corpus team
tigercorpus at ims.uni-stuttgart.de
Wed Sep 12 08:45:43 UTC 2001
The TIGER German treebank sampler has been released!
----------------------------------------------------
A large, syntactically annotated corpus of German newspaper text is
under construction in the TIGER project, with project partners in
Saarbruecken, Potsdam, and Stuttgart.
In order to get feedback from the research community, the TIGER project team
has released a sampler of the TIGER corpus:
http://www.ims.uni-stuttgart.de/projekte/TIGER/
The TIGER corpus is annotated with 'syntax graphs', a generalization of
syntax trees that can account for phenomena involving discontinuous
constituents. For example:
- long distance dependencies are encoded by crossing edges
- coreference in coordination is represented by 'secondary edges'
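
To make the two points above concrete, here is a minimal sketch in Python
of how such a graph might be held in memory. It is not the TIGER data
format or tooling: the class SyntaxGraph, its methods, the example
sentences, and the edge labels (SB, HD, OA, OC, NK, CJ, CD) are all
assumptions chosen for illustration. Terminals are identified by token
position, nonterminals by a string name. A constituent whose token
positions have a gap is discontinuous, which is what shows up as crossing
edges when the graph is drawn.

```python
from dataclasses import dataclass, field


@dataclass
class SyntaxGraph:
    """Hypothetical in-memory syntax graph (illustration only)."""
    tokens: list                                   # terminals, in sentence order
    children: dict = field(default_factory=dict)   # primary edges: parent -> [(label, child)]
    secondary: list = field(default_factory=list)  # secondary edges: (label, parent, child)

    def add_edge(self, parent, label, child):
        """Add a labeled primary edge from a nonterminal to a child node."""
        self.children.setdefault(parent, []).append((label, child))

    def span(self, node):
        """Sorted token positions dominated by node via primary edges."""
        if isinstance(node, int):  # a terminal is its own span
            return [node]
        positions = []
        for _, child in self.children.get(node, []):
            positions.extend(self.span(child))
        return sorted(positions)

    def is_discontinuous(self, node):
        """True if the node's token positions contain a gap."""
        s = self.span(node)
        return s[-1] - s[0] + 1 != len(s)


# "Das Buch hat er gelesen": the object "Das Buch" is fronted, so the VP
# covers positions 0, 1 and 4, while "hat" and "er" attach to S.
g = SyntaxGraph(tokens=["Das", "Buch", "hat", "er", "gelesen"])
g.add_edge("NP", "NK", 0)
g.add_edge("NP", "NK", 1)
g.add_edge("VP", "OA", "NP")
g.add_edge("VP", "HD", 4)
g.add_edge("S", "HD", 2)
g.add_edge("S", "SB", 3)
g.add_edge("S", "OC", "VP")

# "Er kauft und liest": "Er" is the subject of both conjuncts. It is
# attached once by a primary edge; the second attachment is recorded as
# a secondary edge and does not change any span.
coord = SyntaxGraph(tokens=["Er", "kauft", "und", "liest"])
coord.add_edge("S1", "SB", 0)
coord.add_edge("S1", "HD", 1)
coord.add_edge("S2", "HD", 3)
coord.add_edge("CS", "CJ", "S1")
coord.add_edge("CS", "CD", 2)
coord.add_edge("CS", "CJ", "S2")
coord.secondary.append(("SB", "S2", 0))

print(g.span("VP"), g.is_discontinuous("VP"))
print(g.span("S"), g.is_discontinuous("S"))
```

Keeping secondary edges in a separate list means the primary edges still
form an ordinary tree-like structure for span computation, while the
shared subject remains recoverable.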
More details of the annotation scheme are available online, where you can
also explore the TIGER corpus sampler interactively.
---
The TIGER project team.
Department of Computational Linguistics, Saarland University
Institut fuer Germanistik, University of Potsdam
Department of Natural Language Processing (IMS), University of Stuttgart
email: tigercorpus at ims.uni-stuttgart.de