[Corpora-List] Word frequencies from a corpus of movie and TV show subtitles.

Mark Davies Mark_Davies at byu.edu
Tue Apr 14 19:26:57 UTC 2009


> where you can download the manuscript and the new frequency norms for
> (American) English based on subtitles that are much better than the
> Kucera & Francis norms and, certainly for short words, than
> ** any other norm currently available. **

? ? ?

Are they based on a balanced corpus, or just subtitles from TV shows?

For frequency information from a large, balanced corpus (spoken, fiction, popular magazines, newspapers, academic), you might try:

http://www.americancorpus.org

============================================
Mark Davies
Professor of (Corpus) Linguistics
Brigham Young University
(phone) 801-422-9168 / (fax) 801-422-0906

http://davies-linguistics.byu.edu

** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
============================================ 

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


