[Corpora-List] intuitions about co-occurrence: SUMMARY

Marco Baroni marco.baroni at unitn.it
Tue Apr 30 07:55:26 UTC 2013


Dear Corpora Listers,

About one week ago I asked this list for references concerning the 
following question:

"Is anybody aware of experimental studies where researchers have looked 
at whether subjects' explicit intuitions about the probability of 
co-occurrence of two terms correlate with (functions of) the frequency 
of co-occurrence of the two terms in a corpus?"

Several people have kindly replied to my inquiry: Alessandro Lenci, 
Michal Ptaszynski, Adam Kilgarriff, Stefania Spina, Sylviane Granger, 
Luca Toldo, Paul Nulty, Ute Römer, David Oakey and Adriano Ferraresi. 
Thanks to you all!

In the end, none of the papers that were suggested addressed the 
question above, although they did address a number of related (and 
interesting) questions, such as correlation between corpus frequencies 
and plausibility judgments, intuitions about collocativity, etc. Here is 
a list of the references that were mentioned in the thread:

Pawel Dybala, Michal Ptaszynski, Kohichi Sayama:
“Reducing Excessive Amounts of Data: Multiple Web Queries for Generation 
of Pun Candidates”
Advances in Artificial Intelligence, vol. 2011, Article ID 107310, 12
pages, 2011. doi:10.1155/2011/107310
http://downloads.hindawi.com/journals/aai/2011/107310.pdf

Ellis, Nick C., Matthew B. O'Donnell & Ute Römer. 2013. Usage-based 
language: Investigating the latent structures that underpin acquisition. 
Language Learning 63(Supp. 1): 25-51.

Fox, G. (1987). The case for examples. In J. McH. Sinclair (Ed.), 
Looking Up: an account of the COBUILD project in lexical computing (pp. 
137-149). London: Collins ELT.

Frank Keller and Maria Lapata. 2003. Using the Web to Obtain Frequencies 
for Unseen Bigrams. Computational Linguistics, 29:459--484.

Maria Lapata, Frank Keller, and Scott McDonald. 2001. Evaluating 
Smoothing Algorithms against Plausibility Judgments. In Proceedings of 
the 39th Annual Meeting of the Association for Computational Linguistics 
and the 10th Conference of the European Chapter of the Association for 
Computational Linguistics, 346--353. Toulouse.

Maria Lapata, Scott McDonald, and Frank Keller. 1999. Determinants of 
Adjective-Noun Plausibility. In Proceedings of the 9th Conference of the 
European Chapter of the Association for Computational Linguistics, 
30--36. Bergen.

Iain McGee (2009). Adjective-noun collocations in elicited and corpus 
data: Similarities, differences, and the whys and wherefores. Corpus 
Linguistics and Linguistic Theory 5(1), 79–103.

McRae, K., Spivey-Knowlton, M. J., & Tanenhaus, M. K. (1998). Modeling 
the influence of thematic fit (and other constraints) in on-line 
sentence comprehension. Journal of Memory and Language, 38, 283-312.

R. Simpson-Vlach and N. C. Ellis (2010). An Academic Formulas List: New 
Methods in Phraseology Research. Applied Linguistics 31(4): 487–512.

Siyanova and Schmitt (2008). L2 Learner Production and Processing of 
Collocation: A Multi-study Perspective. The Canadian Modern Language 
Review/La Revue canadienne des langues vivantes 64(3), 429–458.

Stefanie Wulff. Rethinking Idiomaticity.
http://books.google.com/books/about/Rethinking_Idiomaticity.html?id=uD5oet7okb0C



I hope this list will be as useful to other Corpora subscribers as it 
will be to me!


Regards,


Marco




-- 
Marco Baroni
Center for Mind/Brain Sciences (CIMeC)
University of Trento
http://clic.cimec.unitn.it/marco
