[Corpora-List] web page corpus?

Adam Jatowt jatowt at miv.t.u-tokyo.ac.jp
Mon May 12 06:21:50 UTC 2003


Dear all,

Does anyone know of a corpus of web pages that reflects how the pages have changed over time, i.e. one that keeps historical versions of the same pages?
The Internet Archive (archive.org) contains such data, but its snapshots were collected at different time intervals for different pages, so many earlier page versions are missing. I am doing research on text changes in WWW communities.

Thank you.

Adam


