[Corpora-List] Web pages corpus
Jakob Halskov
jh.id at cbs.dk
Mon Mar 6 12:21:01 UTC 2006
Dear Imen,
It is quite easy to compile a web corpus yourself: query one of the freely available web search APIs for page URLs, then download the pages it returns. See for example:
http://developer.yahoo.net/search/index.html
or
http://www.google.com/apis/
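
As a rough illustration of the download step, here is a minimal Python sketch. It assumes you already have a list of URLs (for instance, the hits returned by one of the search APIs above; the search call itself is left out because it depends on the API you choose). The names html_to_text, fetch_url and build_corpus are my own, not part of either API, and the text extraction is deliberately crude.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, Iterable, List
from urllib.request import urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Reduce an HTML document to its text, joined by single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def fetch_url(url: str) -> str:
    """Download a page; assumes UTF-8 and replaces undecodable bytes."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def build_corpus(
    urls: Iterable[str],
    fetch: Callable[[str], str] = fetch_url,
) -> Dict[str, str]:
    """Map each distinct URL to the plain text of its page.

    Pages that cannot be fetched (OSError, which includes urllib's
    URLError) are skipped rather than aborting the whole run.
    """
    corpus: Dict[str, str] = {}
    for url in urls:
        if url in corpus:
            continue  # the same hit often comes back for several queries
        try:
            corpus[url] = html_to_text(fetch(url))
        except OSError:
            continue
    return corpus
```

Passing the fetch function as a parameter lets you try the pipeline on a few saved pages before running it against live sites. For summarization you would probably want to keep the raw HTML alongside the extracted text.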
Best regards,
Jakob Halskov
--
PhD student
Dept. of Computational Linguistics
Copenhagen Business School
www.id.cbs.dk
----- Original Message -----
From: "ismi.touati" <ismi.touati at laposte.net>
Date: Monday, March 6, 2006 12:29 pm
Subject: [Corpora-List] Web pages corpus
> Dear all,
>
> I'm working on automatic summarization of web pages, and I'm looking
> for a corpus of web pages (HTML documents) with their abstracts to
> evaluate my system.
>
> Does anyone know if such a corpus exists?
>
> Thanks in advance for the help.
> Imen.
>
> ***********************************
> Imen Touati
> Master's student at the Faculty of Economic Science and Management of
> Sfax, Tunisia.
> LARIS laboratory
> Address: LARIS, FSEGS, BP 1088, 3018 Sfax, Tunisia
>