[Corpora-List] Word frequencies from a corpus of movie and TV show subtitles.

Cyrus Shaoul cyrus.shaoul at ualberta.ca
Wed Apr 15 18:36:37 UTC 2009


I did not make the corpus, but it looks like they used OCR technology to get
the subtitles off the video frames.

See:

http://expsy.ugent.be/subtlexus

for more info.

-Cyrus



M.E.Sciubba wrote:
> Do the subtitles refer to the 'whole' script, or are they the 35-character
> subtitles shown in movies?
> e.

-- 
=[=]={=}=[=]={=}=[=]={=}=[=]={=}=[=]={=}
Cyrus Shaoul
http://www.psych.ualberta.ca/~westburylab/
University of Alberta
780-492-5843
=[=]={=}=[=]={=}=[=]={=}=[=]={=}=[=]={=}



_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


