[Corpora-List] Word frequencies from a corpus of movie and TV show subtitles.
Mark Davies
Mark_Davies at byu.edu
Tue Apr 14 19:26:57 UTC 2009
> where you can download the manuscript and the new frequency norms for
> (American) English based on subtitles that
> are much better than the Kucera & Francis norms and certainly for short words, than
> ** any other norm currently available. **
? ? ?
Are they based on a balanced corpus, or just subtitles from TV shows?
For frequency information from a large, balanced corpus (spoken, fiction, popular magazines, newspapers, academic), you might try:
http://www.americancorpus.org
============================================
Mark Davies
Professor of (Corpus) Linguistics
Brigham Young University
(phone) 801-422-9168 / (fax) 801-422-0906
http://davies-linguistics.byu.edu
** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
============================================
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora