[Corpora-List] Word frequencies in English, French, German, Spanish, Dutch, Italian and Portuguese

Uwe Quasthoff quasthoff at informatik.uni-leipzig.de
Mon Feb 12 16:48:55 UTC 2007


Hi,

Please have a look at http://corpora.informatik.uni-leipzig.de/download.html
You will find frequency lists as plain text (words.txt) and MySQL data
files (words) (sorry, not for Portuguese at the moment) calculated from
corpora of 100,000 to 3,000,000 sentences, depending on the language.
In addition, you can get the corpora and pre-calculated co-occurrences.
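
For anyone who wants to load one of the plain-text lists, here is a
minimal Python sketch. It assumes each line of words.txt holds an ID,
the word and its frequency, separated by tabs. That layout is an
assumption, so please check the downloaded file first. The function
name parse_frequency_lines and the sample lines are made up for
illustration.

```python
from typing import Iterable, List, Tuple


def parse_frequency_lines(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Parse lines in an ASSUMED 'id<TAB>word<TAB>frequency' layout."""
    entries = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        word, count = fields[1], fields[2]
        if not count.isdecimal():
            continue  # skip header rows or non-numeric counts
        entries.append((word, int(count)))
    # Most frequent words first
    entries.sort(key=lambda entry: entry[1], reverse=True)
    return entries


# Hypothetical sample data standing in for a real words.txt
sample = ["1\tthe\t5000\n", "2\tof\t3000\n", "\n", "3\tcorpus\t12\n"]
print(parse_frequency_lines(sample)[:2])
```

With a real file, the same function could be fed an open file handle,
for example open("words.txt", encoding="utf-8"); the encoding is also
an assumption.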

Regards,

Uwe Quasthoff




Yorick Wilks wrote:
> Does anyone know easily accessible sources of these?
> Yorick Wilks
> Sheffield
>


