[Corpora-List] Free text corpora?
Francis Tyers
ftyers at prompsit.com
Tue Mar 2 22:03:17 UTC 2010
On Tue, 02 Mar 2010 at 23:55 +0100, Martin Wynne wrote:
> Francis Tyers wrote:
> > On Tue, 02 Mar 2010 at 12:38 +0100, Xin Yan wrote:
> >
> >> Hello,
> >>
> >> can anyone tell me whether there are any free text corpora
> >> available for commercial use?
> >> Thank you in advance!
> >>
> >
> > You can download dumps of Wikipedia from http://download.wikimedia.org
> > -- they are licensed under CC-BY-SA or the GFDL, both of which allow
> > commercial use, provided that any changes are redistributed under the
> > same licence.
> >
> > Best regards,
> >
> > Fran
> >
> >
> > _______________________________________________
> > Corpora mailing list
> > Corpora at uib.no
> > http://mailman.uib.no/listinfo/corpora
> >
>
> Dumps of wikipedia may be an interesting electronic text collection that
> can be used to help address various linguistic research questions, but I
> think that the request was for a corpus...and a "dump" such as this
> couldn't be further from qualifying as a corpus, if defined as "a
> collection of pieces of language, selected and ordered according to
> explicit linguistic criteria in order to be used as a sample of the
> language."
There are a good many people who are comfortable with the definition of
a corpus as a "crapload of text" ;)
And the request was for a corpus "free for commercial use", and the bad
news is that the majority of text collections which qualify as:
"a collection of pieces of language, selected and ordered according
to explicit linguistic criteria in order to be used as a sample of
the language."
are not free for commercial use -- be that "free as in speech" or "free
as in beer" -- although I'd be delighted to hear otherwise.
Best,
Fran