[Corpora-List] Web Content Extractor / Screen Scraper

Resty Cena restycena at gmail.com
Mon Jun 18 19:35:34 UTC 2007


Hello,
I am looking for a free or open-source Windows utility or application that
extracts the rendered text (not the raw HTML) of web pages, such as one
would use to scrape news feeds automatically. Does anyone use such an
application?

Basically, the application will be used to harvest texts from the internet
to build a corpus.
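
To illustrate the kind of extraction meant here, the following is a minimal
sketch only, assuming Python and its standard html.parser module rather than
any particular Windows tool. The names VisibleTextExtractor and extract_text
are hypothetical. It does not run JavaScript or apply CSS, so it only
approximates what a browser would render.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Hypothetical helper: collect text nodes, skipping script/style content."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []      # text fragments in document order
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def extract_text(html):
    """Return the text of an HTML string with markup, scripts and styles removed."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return " ".join(parser.parts)
```

A page fetched with, say, urllib.request could be decoded to a string, passed
to extract_text, and the result written to one file per page to form the
corpus.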

All the best,
Resty