[Corpora-List] Looking for Corpora in: English, Swedish, Polish, Italian, Finnish, Estonian, Hungarian

Ralf Steinberger ralf.steinberger at jrc.ec.europa.eu
Sun Mar 23 17:12:13 UTC 2014


Dear Marina,

 

On the JRC's Language Technology page
http://ipsc.jrc.ec.europa.eu/index.php?id=61, you will find parallel corpora for
all the languages you are searching for, and more.

 

All the best,

 

Ralf

 

Ralf Steinberger 

European Commission - Joint Research Centre (JRC)

 

From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
Marina Santini
Sent: 23 March 2014 15:26
To: corpora at uib.no; Marina Santini
Subject: [Corpora-List] Looking for Corpora in: English, Swedish, Polish,
Italian, Finnish, Estonian, Hungarian

 

Hi, 


I am looking for corpora of any genre in the following languages: English,
Swedish, Polish, Italian, Finnish, Estonian, and Hungarian. 
I am already aware of a number of corpora (several posts in the WebGenre
blog are dedicated to the dissemination of corpora-related information).
These corpora, though, are mostly in English. I would now like to focus on:
1) additional languages and 2) additional genres, such as search query logs,
TV scripts, emails, tweets, WhatsApp messages, etc.
All genres are welcome! The only requirement is that the corpora must be free
and publicly available, so that everybody can replicate or extend
experiments using the same corpora/datasets.

The purpose of the experiments is to explore cross-linguality in different
settings. Please read the use cases in the blog post to get an idea of the
type of communicative situations under investigation
(http://www.forum.santini.se/2014/03/looking-for-corpora-to-explore-cross-linguality/).


Thanks in advance for your suggestions and pointers.

-- 

Marina Santini

http://www.forum.santini.se 
http://www.linkedin.com/groups/WebGenre-R-D-Group-4301498
