[Corpora-List] amount of text on the web?

Constantin Orasan c.orasan at wlv.ac.uk
Tue Nov 13 17:37:24 UTC 2007


Hi,

The numbers are a bit old, but a very good study of how much data is on
the web is:

Lyman, Peter and Hal R. Varian (2003) How Much Information? 2003.
Technical report, School of Information Management and Systems,
University of California at Berkeley.

http://www2.sims.berkeley.edu/research/projects/how-much-info-2003/

Regards

Constantin

> I am looking for some up-to-date statistics on the amount of textual
> data on the web. I have seen varied estimates ranging up to 1
> Exabyte. I am sure that it is not possible to define precisely what
> "text on the web" means (do you include email, cached text, local
> files, "hidden" web, etc.?).
> 
> Drago

-- 
Constantin Orasan <C.Orasan at wlv.ac.uk>
Lecturer in Computational Linguistics
Research Group in Computational Linguistics
http://www.wlv.ac.uk/~in6093/
University of Wolverhampton

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


