[Corpora-List] free corpus
Cyrus Shaoul
cyrus.shaoul at ualberta.ca
Tue Nov 20 18:43:14 UTC 2007
Hi Peter,
I have developed a fairly large USENET corpus (currently 13 billion
words and growing), and it is available to all under a Creative Commons
license (i.e., it is free for non-commercial use; just cite us if you
use it for research).
It is untagged. It contains text written in a broad range of genres and
registers.
Downloads from non-academic networks are currently limited to 1 GB a
day, but that is not too shabby.
It is available here:
http://www.psych.ualberta.ca/~westburylab/downloads/usenetcorpus.download.html
Enjoy,
Cyrus
Peter Isaev wrote:
> Hello.
>
> I'm looking for a free, large corpus of plain text, something
> like the BNC (which is not free).
>
> Where can I download it?
>
> Thank you.
> ------------------------------------------------------------------------
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
--
=[=]={=}=[=]={=}=[=]={=}=[=]={=}=[=]={=}
Cyrus Shaoul
http://www.psych.ualberta.ca/~westburylab/
University of Alberta
=[=]={=}=[=]={=}=[=]={=}=[=]={=}=[=]={=}