[Corpora-List] Qualitative / Quantitative survey of Wikipedia dumps as Corpora

liling tan alvations at gmail.com
Wed Mar 13 01:26:31 UTC 2013


Dear all,

Wikipedia dumps have been a popular source of text for NLP due to their
availability and sheer size.

I would like to ask whether anyone has conducted a quantitative or
qualitative survey on

   - how useful these dumps are for NLP, and
   - what issues surface when using Wikipedia dumps as
   corpora.


Regards,
liling