25.5201, FYI: New Release of the TüBa-D/Z German Treebank

The LINGUIST List via LINGUIST linguist at listserv.linguistlist.org
Fri Dec 19 18:07:31 UTC 2014


LINGUIST List: Vol-25-5201. Fri Dec 19 2014. ISSN: 1069 - 4875.

Subject: 25.5201, FYI: New Release of the TüBa-D/Z German Treebank

Moderators: Damir Cavar, Indiana U <damir at linguistlist.org>
            Malgorzata E. Cavar, Indiana U <gosia at linguistlist.org>

Reviews: reviews at linguistlist.org
Anthony Aristar <aristar at linguistlist.org>
Helen Aristar-Dry <hdry at linguistlist.org>
Sara Couture, Indiana U <sara at linguistlist.org>

Homepage: http://linguistlist.org

Editor for this issue: Uliana Kazagasheva <uliana at linguistlist.org>
================================================================


Date: Fri, 19 Dec 2014 13:06:37
From: Marie Hinrichs [marie.hinrichs at uni-tuebingen.de]
Subject: New Release of the TüBa-D/Z German Treebank

The Department of Linguistics of the University of Tübingen (Germany) is
pleased to announce a new minor release of its referentially and syntactically
annotated German corpus: The Tübingen Treebank of Written German (TüBa-D/Z) -
Release 9.1.

The TüBa-D/Z treebank is a manually annotated German newspaper corpus based on
data taken from daily issues of the newspaper 'die tageszeitung'. It currently
comprises 85,358 sentences (1,569,916 words; 3,444 newspaper articles).

This minor release includes 17,910 manual annotations of a selected set of
lemmas (30 nouns, 79 verbs) with their corresponding senses in the German
wordnet GermaNet with the goal of providing a gold standard for word sense
disambiguation. Please note that no new sentences have been added between
release 9.0 and release 9.1. Only those formats that support word sense
annotation are part of this minor release (Negra Export 3 and 4, CoNLL
2011/2012, Export XML). Other formats remain unchanged and can be obtained
from release 9.0. 

The syntactic annotation scheme of the TüBa-D/Z distinguishes four levels of
syntactic constituency (lexical, phrasal, clausal, topological fields) and
contains the following annotation layers:
- inflectional morphology 
- lemmas 
- syntactic constituency 
- grammatical functions 
- (complex) named entities including semantic classification 
- anaphora and coreference relations 
- discourse connectives (explicit and implicit, partial coverage) 
- GermaNet word senses 
- dependency relations (automatically created) 
- chunk annotation (automatically created)

The license for TüBa-D/Z is granted free of charge for scientific use. For
more information, please visit the website at:
http://www.sfs.uni-tuebingen.de/en/ascl/resources/corpora/tueba-dz.html

Best Regards,

Erhard W. Hinrichs
Heike Telljohann
Marie Hinrichs
------------
Dept. of Computational Linguistics
University of Tübingen
Wilhelmstr. 19
72074 Tübingen
Germany

Linguistic Field(s): Computational Linguistics
                     Discourse Analysis
                     Morphology
                     Syntax
                     Text/Corpus Linguistics

Subject Language(s): German (deu)

Language Family(ies): Germanic