[Corpora-List] Movie-Dialogs Corpus

Wed Jul 4 18:29:45 UTC 2012

Announcing the availability of the Cornell Movie-Dialogs Corpus, a large, metadata-rich collection of conversations extracted from movie scripts. The data includes over 220,000 conversational exchanges involving a total of 9,000+ characters from 617 movies. Prior uses of this corpus include:

* Cristian Danescu-Niculescu-Mizil, Justin Cheng, Jon Kleinberg and Lillian Lee.
"You had me at hello: How phrasing affects memorability". ACL 2012.

* Tyler Schnoebelen, Feb 2012: how "like" and "I mean" vary across movie genre, 
gender, and cast position. 
http://corplinguistics.wordpress.com/2012/02/23/like-lets-go-to-the-movies-i-mean/

* Cristian Danescu-Niculescu-Mizil and Lillian Lee, "Chameleons in imagined 
conversations: A new approach to understanding coordination of linguistic style 
in dialogs", ACL 2011 workshop on Cognitive Modeling and Computational Linguistics.

The download site is: 
http://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html

Cristian Danescu-Niculescu-Mizil and Lillian Lee