[Corpora-List] free corpus

Cyrus Shaoul cyrus.shaoul at ualberta.ca
Tue Nov 20 18:43:14 UTC 2007


Hi Peter,

I have developed a fairly large USENET corpus (currently 13 billion
words and growing). It is available to all under a Creative Commons
license, i.e., it is free for non-commercial use. Just cite us if you
use it for research.
 
It is untagged. It contains text written in a broad range of genres and 
registers.

Downloads from non-academic networks are currently limited to 1 GB a
day, but that is not too shabby.

It is available here:

    
http://www.psych.ualberta.ca/~westburylab/downloads/usenetcorpus.download.html

Enjoy,

Cyrus

Peter Isaev wrote:
> Hello.
>
> I'm looking for a free, large corpus consisting of plain text, something
> like the BNC (which is not free).
>
> Where can I download it?
>
> Thank you.
> ------------------------------------------------------------------------
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>   

-- 
=[=]={=}=[=]={=}=[=]={=}=[=]={=}=[=]={=}
Cyrus Shaoul
http://www.psych.ualberta.ca/~westburylab/
University of Alberta
=[=]={=}=[=]={=}=[=]={=}=[=]={=}=[=]={=}


