[Corpora-List] Looking for Corpora in: English, Swedish, Polish, Italian, Finnish, Estonian, Hungarian

Kristian Kankainen kristian at eki.ee
Tue May 6 08:00:15 UTC 2014


Dear all,

There is also quite a comprehensive list of all sorts of resources for
Estonian (wordlists, biographical data collections, dialect data,
phonetic resources, spoken language, internet language, learner
language corpora, etc.) here:

http://viki.keeleleek.ee/wiki/Eesti_keele_ressursside_loend

All descriptions are currently in Estonian only, and the page is rather
crudely organized as a plain textual list of links. It is open for
everyone to edit.

All the best
Kristian Kankainen

On one fine day, Sun, 23.03.2014 at 08:11, Anne Tamm wrote:
> Dear Marina,
> 
> The following pages lead to further corpora in Hungarian and Estonian.
> 
> Hungarian: http://www.nytud.hu/dbases/index.html
> Estonian: http://www.keeletehnoloogia.ee/projektid/koondkorpus
> 
> Best,
> Anne Tamm
> 
> 
> 
> 
> On Sunday, March 23, 2014 4:01 PM, Marina Santini
> <marinamailinglists at gmail.com> wrote:
> 
> Hi, 
> 
> I am looking for corpora of any genre in the following languages:
> English, Swedish, Polish, Italian, Finnish, Estonian, and Hungarian.
> I am already aware of a number of corpora (several posts in the
> WebGenre blog are dedicated to the dissemination of corpora-related
> information). These corpora, though, are mostly in English. I would
> now like to focus on: 1) additional languages and 2) additional
> genres, such as search query logs, TV scripts, emails, tweets, WhatsApp
> messages, etc.
> All genres are welcome! The only requirement is that corpora must be
> free and publicly available. Everybody must be able to replicate or
> extend experiments using the same corpora/datasets.
> 
> The purpose of the experiments is to explore cross-linguality in
> different settings. Please read the use cases in the blog post to
> get an idea of the type of communicative situations under
> investigation
> (http://www.forum.santini.se/2014/03/looking-for-corpora-to-explore-cross-linguality/).
> 
> Thanks in advance for your suggestions and pointers.
> -- 
> 
> Marina Santini
> http://www.forum.santini.se 
> http://www.linkedin.com/groups/WebGenre-R-D-Group-4301498
> 
> 
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora





