[Corpora-List] Introducing the German Political Speeches Corpus

Adrien Barbaresi adrien.barbaresi at ens-lyon.fr
Tue Jul 26 14:31:42 UTC 2011


I am currently working on a resource I would like to introduce: the
German Political Speeches Corpus. It consists of speeches by the most
recent German Presidents, Chancellors and a few ministers, all gathered
from official sources.

As far as I know, no such corpus was publicly available for German. Most
speeches could not be found on Google until today.
The corpus can be freely republished.

The two sub-corpora (Presidency and Chancellery) are released in XML
format, based on raw text and metadata.
I plan a series of improvements, including better tokenization and
POS tagging.

I am also working on a basic visualization tool that gives users a
first glimpse of the resource, using simple text statistics in the form
of XHTML pages (a sort of Zeitgeist).

Here is the permanent URL of the resource:
http://purl.org/corpus/german-speeches
Additional information and the download are available there.

Regards,

Adrien Barbaresi



