[Corpora-List] Looking for Corpora in: English, Swedish, Polish, Italian, Finnish, Estonian, Hungarian
Kristian Kankainen
kristian at eki.ee
Tue May 6 08:00:15 UTC 2014
Dear all,
There is also a fairly comprehensive list of all sorts of resources for
Estonian (wordlists, biographical data collections, dialect data,
phonetic resources, spoken language, internet language, learner
language corpora, etc.) here:
http://viki.keeleleek.ee/wiki/Eesti_keele_ressursside_loend
All descriptions are currently in Estonian only, and the page is rather
crudely organized as a plain textual list of links. It is open for
everyone to edit.
All the best
Kristian Kankainen
One fine day, Sun, 23.03.2014 at 08:11, anne tamm wrote:
> Dear Marina,
>
> The following pages lead to further corpora in Hungarian and Estonian.
>
> Hungarian: http://www.nytud.hu/dbases/index.html
> Estonian: http://www.keeletehnoloogia.ee/projektid/koondkorpus
>
> Best,
> Anne Tamm
>
> On Sunday, March 23, 2014 4:01 PM, Marina Santini
> <marinamailinglists at gmail.com> wrote:
>
> Hi,
>
> I am looking for corpora of any genre in the following languages:
> English, Swedish, Polish, Italian, Finnish, Estonian, and Hungarian.
> I am already aware of a number of corpora (several posts in the
> WebGenre blog are dedicated to the dissemination of corpora-related
> information). These corpora, though, are mostly in English. I would
> like now to focus on: 1) additional languages and 2) additional
> genres, such as search query logs, TV scripts, emails, tweets, WhatsApp
> messages, etc.
> All genres are welcome! The only requirement is: corpora must be
> free and publicly available. Everybody must be able to replicate or
> extend experiments using the same corpora/datasets.
>
> The purpose of the experiments is to explore cross-linguality in
> different settings. Please read the use cases in the blog post to
> get an idea of the type of communicative situations under
> investigation
> (http://www.forum.santini.se/2014/03/looking-for-corpora-to-explore-cross-linguality/)
>
> Thanks in advance for your suggestions and pointers.
> --
>
> Marina Santini
> http://www.forum.santini.se
> http://www.linkedin.com/groups/WebGenre-R-D-Group-4301498
>
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora