[Corpora-List] amount of text on the web?
radev at umich.edu
Tue Nov 13 17:37:26 UTC 2007
This is too old. I have seen this one and quoted it a lot.
>
> Hi,
>
> The numbers are a bit old but a very good study which investigates how
> much data is on the web is:
>
> Lyman, Peter and Hal R. Varian (2003) How much information – 2003.
> Technical report, School of Information Management and Systems,
> University of California at Berkeley.
>
> http://www2.sims.berkeley.edu/research/projects/how-much-info-2003/
>
> Regards
>
> Constantin
>
> > I am looking for some up to date statistics on the amount of textual
> > data on the web. I have seen varied estimates ranging up to 1
> > Exabyte. I am sure that it is not possible to define precisely what
> > "text on the web" means (do you include email, cached text, local
> > files, "hidden" web, etc).
> >
> > Drago
>
> --
> Constantin Orasan <C.Orasan at wlv.ac.uk>
> Lecturer in Computational Linguistics
> Research Group in Computational Linguistics
> http://www.wlv.ac.uk/~in6093/
> University of Wolverhampton
>
>
--
Dragomir R. Radev Associate Professor
SI, CSE, Ling U. Michigan, Ann Arbor
http://www.eecs.umich.edu/~radev radev at umich.edu
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora