[Corpora-List] web page corpus?

Elisabeth Burr Elisabeth.Burr at uni-duisburg.de
Mon May 12 09:19:03 UTC 2003


The only thing I know of is the website of the French government, where
former sites have been archived:

http://www.premier-ministre.gouv.fr/fr/

Elisabeth Burr

At 15:21 12.05.03 +0900, you wrote:
>Dear all,
>
>Does anyone know of a corpus of web pages that reflects historical
>data on how web pages change over time?
>The Internet Archive (archive.org) contains such data, but it was collected
>at different time intervals for different pages, so many previous page
>versions are missing. I am doing research on text changes in WWW communities.
>
>Thank you.
>
>Adam

HD Dr. Elisabeth Burr
Fakultät 2 / Romanistik
Universität Duisburg-Essen
Standort Duisburg
Geibelstr. 41
D-47058 Duisburg

http://www.uni-duisburg.de/Fak2/FremdPhil/Romanistik/Personal/Burr/


