[Corpora-List] Web pages corpus

ismi.touati ismi.touati at laposte.net
Mon Mar 6 11:29:42 UTC 2006


Dear all,

I'm working on automatic summarization of web pages, and I'm looking for a corpus of web
pages (HTML documents) with their abstracts to evaluate my system.

Does anyone know if such a corpus exists?

Thanks in advance for the help.
Imen.

***********************************
Imen Touati
Master's student at the Faculty of Economic Sciences and Management of Sfax,
Tunisia.
LARIS laboratory
Address: LARIS, FSEGS, BP 1088, 3018 Sfax, Tunisia


