There is a very good, up-to-date discussion of this in:

WORD FREQUENCY DISTRIBUTIONS
R. Harald Baayen
Text, Speech and Language Technology Series
Kluwer Academic Publishers, Dordrecht
Hardbound, ISBN 0-7923-7017-1
June 2001, 356 pp.

See: http://www.wkap.nl/book.htm/0-7923-7017-1

Jean Véronis
http://www.up.univ-mrs.fr/~veronis/