[Corpora-List] German corpora
    Ralf Steinberger 
    ralf.steinberger at jrc.it
       
    Fri Jan 25 13:31:40 UTC 2008
    
    
  
Hello Jaime,
 
You may want to have a look at the German documents of the freely available
JRC-Acquis corpus, downloadable from
 
            http://langtech.jrc.it/JRC-Acquis.html
 
The document collection covers the last 50 years or so, but texts are
organised chronologically so that you can pick those of interest to you. The
corpus covers written language only, though.
 
I hope this helps. Kind regards,
 
Ralf
 
 
 
Ralf Steinberger (Ralf.Steinberger at jrc.it)
European Commission - Joint Research Centre (JRC)
IPSC - SeS - Language Technology 
URL: Applications: http://emm.jrc.it/overview.html
URL: The science behind them: http://langtech.jrc.it/
JRC-Acquis Multilingual Parallel Corpus (Version 3)
*       Freely available for research purposes.
*       22 languages: Bulgarian, Czech, Danish, German, Greek, English,
Spanish, Estonian, Finnish, French, Hungarian, Italian, Lithuanian, Latvian,
Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovene and Swedish.
*       Altogether over 1 billion words.
*       Sentence alignment for 231 language pairs.
*       For more information and download, see
http://langtech.jrc.it/JRC-Acquis.html.
 
DGT-Translation Memory
*       Freely available for research purposes.
*       Aligned translation units for 231 language pairs.
*       Alignment manually verified.
*       For more information and download, see
http://langtech.jrc.it/DGT-TM.html.
 
The JRC's Language Technology group specialises in the development of highly
multilingual text analysis tools and in cross-lingual applications. Many
applications are accessible online, e.g.:
*       NewsExplorer <http://press.jrc.it/NewsExplorer/>: multilingual news
aggregation and analysis (19 languages); lets users navigate the news over
time and across languages; trend analysis; collects information about people
from the news; social network detection.
*       NewsBrief <http://press.jrc.it/>: breaking news detection and
display of the very latest thematic news from around the world; email
alerting (22+ languages).
*       MedISys Medical Information System <http://medusa.jrc.it/>: latest
health-related news from around the world according to themes and diseases
(22+ languages).
*       EMM-Labs <http://emm-labs.jrc.it:8080/>: latest developments;
social networks; live people-in-the-news; country and theme fact sheets;
maps showing violent events world-wide.
 
 
 
-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
jaime.hunt at studentmail.newcastle.edu.au
Sent: 25 January 2008 10:10
To: Corpora at uib.no
Subject: [Corpora-List] German corpora
 
Hello 
 
I'm a PhD student interested in researching German corpora for Anglicisms. I
am searching for all types of recent corpora, especially spoken corpora,
from around 2005 onwards.
 
I would really appreciate it if anybody is able to make any suggestions as
to what might be available for students to research free of charge.
 
Best regards,
Jaime
 
Mr Jaime Hunt MAppLing (TESOL), BA (Hons)
PhD (Linguistics) Candidate
School of Humanities and Social Science
McMullin Building
University of Newcastle 
Callaghan
NSW 2308
Australia
 
Ph. +61 (0)2 4921 5175
Email: jaime.hunt at studentmail.newcastle.edu.au
 
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora