[Corpora-List] amount of text on the web?

radev at umich.edu radev at umich.edu
Tue Nov 13 17:37:26 UTC 2007


This is too old. I have seen this one and quoted it a lot.

> 
> Hi,
> 
> The numbers are a bit old but a very good study which investigates how
> much data is on the web is:
> 
> Lyman, Peter and Hal R. Varian (2003) How much information – 2003.
> Technical report, School of Information Management and Systems,
> University of California at Berkeley.
> 
> http://www2.sims.berkeley.edu/research/projects/how-much-info-2003/
> 
> Regards
> 
> Constantin
> 
> > I am looking for some up-to-date statistics on the amount of textual
> > data on the web. I have seen varied estimates ranging up to 1
> > Exabyte. I am sure that it is not possible to define precisely what
> > "text on the web" means (do you include email, cached text, local
> > files, "hidden" web, etc.?).
> >
> > Drago
> 
> -- 
> Constantin Orasan <C.Orasan at wlv.ac.uk>
> Lecturer in Computational Linguistics
> Research Group in Computational Linguistics
> http://www.wlv.ac.uk/~in6093/
> University of Wolverhampton
> 
> 


-- 
Dragomir R. Radev                    Associate Professor
SI, CSE, Ling                     U. Michigan, Ann Arbor 
http://www.eecs.umich.edu/~radev         radev at umich.edu              

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


