[Corpora-List] Free text corpora?

Martin Wynne martin.wynne at oucs.ox.ac.uk
Tue Mar 2 22:55:21 UTC 2010


Francis Tyers wrote:
> On Tue, 02 Mar 2010 at 12:38 +0100, Xin Yan wrote:
>   
>> Hello,
>>
>> can anyone tell me, if there are some free text corpora 
>> for commercial purpose?
>> Thank you in advance!
>>     
>
> You can download dumps of Wikipedia from http://download.wikimedia.org
> -- they are licensed under the CC-BY-SA or GFDL -- both of which allow
> commercial use, providing changes made are redistributed under the same
> licence.
>
> Best regards,
>
> Fran
>
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>   

Dumps of Wikipedia may be an interesting electronic text collection that 
can be used to help address various linguistic research questions, but I 
think that the request was for a corpus, and a "dump" such as this 
couldn't be further from qualifying as one, if a corpus is defined as "a 
collection of pieces of language, selected and ordered according to 
explicit linguistic criteria in order to be used as a sample of the 
language."

The good news is that corpora are available. If you let us know what 
sort of corpus you are looking for and what commercial uses you intend 
to put it to, I am sure that there are plenty of people here on the 
mailing list who can help you.

Martin
Oxford Text Archive
