[Corpora-List] word frequencies on the web

William Fletcher fletcher at usna.edu
Fri Dec 8 18:54:00 UTC 2006


Dear Tony,

I have two lists of words, one of those occurring 100 or more times and one
of those occurring 10 or more times, in the preliminary version of a dynamic
Web Corpus I am compiling for "Phrases in English".  Since you cannot reach
PIE directly, I have put them on my KWiCFinder site:

http://www.kwicfinder.com/WebCorpus2006_min100.html 

Tab-separated text files:
http://www.kwicfinder.com/WebCorpus2006_min100.txt 
http://www.kwicfinder.com/WebCorpus2006_min10.txt

The corpus currently has 97,198,272 tokens and 525,509 types, of which 30,524
occur 100 or more times and 104,675 occur 10 or more times.
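
For anyone who wants to turn the raw counts into relative frequencies, here
is a minimal Python sketch. It assumes each row of the tab-separated files
holds a word in the first column and its count in the second; I have not
spelled out the column layout here, so check the files and adjust word_col
and count_col as needed. The function names are illustrative only.

```python
import csv

# Corpus size quoted in this message (tokens in the preliminary Web Corpus).
CORPUS_TOKENS = 97_198_272


def read_frequencies(lines, word_col=0, count_col=1):
    """Parse tab-separated rows into a {word: count} dict.

    ASSUMPTION: the word is in column `word_col` and its count in
    column `count_col`; the real files may be laid out differently.
    """
    freqs = {}
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) <= max(word_col, count_col):
            continue  # skip blank or short rows
        try:
            count = int(row[count_col].replace(",", ""))
        except ValueError:
            continue  # skip header rows or non-numeric counts
        word = row[word_col]
        freqs[word] = freqs.get(word, 0) + count
    return freqs


def per_million(count, corpus_tokens=CORPUS_TOKENS):
    """Normalise a raw count to occurrences per million tokens."""
    return count * 1_000_000 / corpus_tokens
```

To use it on a downloaded copy, pass an open file, for example
read_frequencies(open("WebCorpus2006_min100.txt", encoding="utf-8")); the
encoding is a guess as well. Counts per million make the figures comparable
with other corpora of different sizes.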

Regards,
Bill Fletcher

-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
Behalf Of Tony Berber Sardinha
Sent: Friday, December 08, 2006 11:44 AM
To: CORPORA
Subject: [Corpora-List] word frequencies on the web

Dear all, does anyone know of ways to estimate the frequency of words on the
web, or whether there are search engines that supply this information (as
Altavista used to do)?

thank you!
tony
www2.lael.pucsp.br/~tony


