29.2267, Software: Database for Spoken German (DGD) v2.10

The LINGUIST List linguist at listserv.linguistlist.org
Fri May 25 17:09:19 UTC 2018


LINGUIST List: Vol-29-2267. Fri May 25 2018. ISSN: 1069 - 4875.

Moderators: linguist at linguistlist.org (Damir Cavar, Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Helen Aristar-Dry, Robert Coté,
                                   Michael Czerniakowski)
Homepage: http://linguistlist.org

Please support the LL editors and operation with a donation at:
           http://funddrive.linguistlist.org/donate/

Editor for this issue: Kenneth Steimel <ken at linguistlist.org>
================================================================


Date: Fri, 25 May 2018 13:08:57
From: Thomas Schmidt [thomas.schmidt at ids-mannheim.de]
Subject: Database for Spoken German (DGD) v2.10

 
This week, we released version 2.10 of the Database for Spoken German
(Datenbank für Gesprochenes Deutsch, DGD) at:

https://dgd.ids-mannheim.de

After a one-time registration, the DGD is free to use for research and
teaching. The DGD provides access to oral corpora from the Archive for Spoken
German (http://agd.ids-mannheim.de). Among the resources available in the DGD
are:

- FOLK, the Research and Teaching Corpus of Spoken German - a 230-hour
(2.2 million tokens) collection of audio and video recordings of authentic
interaction in private, institutional and public settings. All data have been
transcribed according to the GAT convention, aligned with the recordings and
annotated with orthographic normalisation, lemmatisation and POS tags
according to STTS. For more information on FOLK, please see
http://agd.ids-mannheim.de/folk.shtml

- GeWiss ("Gesprochene Wissenschaftssprache Kontrastiv") - a 1-million-token
corpus of spoken academic language (exam talks, student and expert
presentations) collected by the GeWiss project in Leipzig, Wroclaw and
Birmingham

- Deutsche Mundarten (German dialects, "Zwirner-Korpus") - the largest
corpus documenting German dialects, and a series of other dialect corpora
following a similar design

- Emigrantendeutsch in Israel (Emigrant German in Israel) - three collections
of biographical interviews with German-speaking emigrants in Israel, collected
in various projects by Anne Betten

- Monash Corpus of Australian German - a corpus collected by Michael Clyne
documenting language use of the German-speaking community in South Australia

The new version includes an extension of FOLK and two new corpora: RUDI
("Russlanddeutsche Dialekte"), with recordings of German speakers from the
former Soviet Union, and BETV ("Belgische TV-Debatten"), with videos of TV
debates from the German-speaking part of Belgium.

The DGD can be used to browse these data, to run systematic queries on
metadata and transcripts, and to download excerpts from the corpora.


Linguistic Field(s): Applied Linguistics
                     General Linguistics
                     Language Documentation
                     Pragmatics
                     Sociolinguistics
                     Text/Corpus Linguistics

Subject Language(s): German (deu)




