[Corpora-List] New German Corpora Released in the Mannheimer Corpus Collection

Cyril Belica belica at ids-mannheim.de
Fri Nov 8 08:44:19 UTC 2002


We are pleased to announce the availability of new German corpora
in the Mannheimer Corpus Collection. For a list of currently available
corpora see http://www.ids-mannheim.de/kt/textorg.html (in German).

The Mannheimer Corpus Collection is the world's largest, and still
growing, collection of German online corpora for linguistic research (see
http://www.ids-mannheim.de/kt/corpora.shtml). Launched in the mid-1960s,
it reached 1.85 billion words in November 2002. Since 1993, the
copyright-free part (currently 1.1 billion words) of the Collection has
been publicly available for searching via the COSMAS I online toolbox.
Invited guests have access to the whole Mannheimer Corpus Collection.
The corpora on offer cover a wide variety of sources, e.g. classic literary 
texts, national and regional newspapers, spoken language in transcribed
form, morphosyntactically annotated texts and several unique corpora.
Commercial use is not permitted, and the corpora cannot be downloaded.

COSMAS I (see http://www.ids-mannheim.de/kt/cosmas.shtml) is a
powerful online corpus search and analysis toolbox that gives online
access to the Mannheimer Corpus Collection. Its first network release
dates from 1993, and a Web interface has been available since 1997.
Features include a fast indexer, a complex query language,
concordancing, online collocation analysis and clustering (since 1995),
a German lemmatizer and compound analyzer, and virtual corpus
composition. COSMAS I had more than 3000 registered users in
November 2002. Commercial use is not permitted, and downloading is
not possible.

Cyril Belica
Head of the Corpus Technology Research Group (http://www.ids-mannheim.de/kt) 
Institut für Deutsche Sprache (http://www.ids-mannheim.de)
Mannheim
Germany


