I worked on this topic some time back (wow, it's *quite* some time back now, sheesh...) developing a computational model of verb-argument selectional preferences and validating it using comparisons with human ratings of just the kind you describe:

<br><br><div style="margin-left: 40px;"><span> Philip Resnik, "Selectional Constraints: An Information-Theoretic Model and its Computational Realization", Cognition, 61:127-159, November 1996.</span><br><span><a href="http://scholar.google.com/scholar?q=author:%22Resnik%22%20intitle:%22Selectional%20constraints:%20an%20information-theoretic%20model%20and%20its%20computational%20realization%22">

http://scholar.google.com/scholar?q=author:%22Resnik%22%20intitle:%22Selectional%20constraints:%20an%20information-theoretic%20model%20and%20its%20computational%20realization%22</a></span></div>The model should be very straightforward to implement, and, in fact, I can probably dig up some old tgrep query expressions that will allow you to pull out verb-object and verb-subject triples from Penn Treebank constituency trees in order to estimate the model parameters.
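<br><br>To give a rough idea of the computation involved, here is a minimal Python sketch (an illustration under stated assumptions, not the code from the paper). It follows a common reading of the cited model: the selectional preference strength of a verb is S(v) = sum over classes c of P(c|v) log(P(c|v) / P(c)), the association between a verb and a class is that class's term divided by S(v), and a noun is scored by its best-scoring class. The names <code>PAIRS</code>, <code>CLASSES</code>, <code>fit</code>, <code>preference_strength</code> and <code>association</code> are hypothetical, the verb-object pairs are made-up toy data, and the small noun-to-class table stands in for a real taxonomy such as WordNet.
<pre>
```python
import math
from collections import defaultdict

# Hypothetical toy data: (verb, object head noun) pairs standing in for
# pairs pulled from a parsed corpus.
PAIRS = [
    ("drink", "water"), ("drink", "wine"), ("drink", "water"),
    ("drink", "coffee"), ("eat", "bread"), ("eat", "apple"),
    ("eat", "bread"), ("see", "water"), ("see", "dog"),
    ("see", "bread"), ("see", "idea"), ("see", "apple"),
]

# Hypothetical noun-to-class table standing in for a real taxonomy.
CLASSES = {
    "water": ["beverage", "substance"],
    "wine": ["beverage"],
    "coffee": ["beverage"],
    "bread": ["food"],
    "apple": ["food"],
    "dog": ["animal"],
    "idea": ["abstraction"],
}


def fit(pairs, classes):
    """Estimate P(c) and P(c|v) from (verb, noun) pairs.

    Each observation is split equally among the noun's classes.
    """
    joint = defaultdict(lambda: defaultdict(float))
    for verb, noun in pairs:
        noun_classes = classes.get(noun, [])
        for c in noun_classes:
            joint[verb][c] += 1.0 / len(noun_classes)
    total = sum(sum(row.values()) for row in joint.values())
    prior = defaultdict(float)   # P(c) in this argument position
    posterior = {}               # P(c|v) for each verb
    for verb, row in joint.items():
        verb_total = sum(row.values())
        posterior[verb] = {c: n / verb_total for c, n in row.items()}
        for c, n in row.items():
            prior[c] += n / total
    return dict(prior), posterior


def preference_strength(verb, prior, posterior):
    """S(v): relative entropy between P(c|v) and P(c), in nats."""
    return sum(p * math.log(p / prior[c]) for c, p in posterior[verb].items())


def association(verb, noun, prior, posterior, classes):
    """Score a noun for a verb by its best class; unseen classes score 0."""
    if verb not in posterior:
        return 0.0
    strength = preference_strength(verb, prior, posterior)
    if strength == 0.0:
        return 0.0
    scores = []
    for c in classes.get(noun, []):
        p = posterior[verb].get(c, 0.0)
        term = p * math.log(p / prior[c]) if p > 0.0 else 0.0
        scores.append(term / strength)
    return max(scores, default=0.0)


prior, posterior = fit(PAIRS, CLASSES)
for verb in ("drink", "eat", "see"):
    print(verb, round(preference_strength(verb, prior, posterior), 3))
for noun in ("wine", "water", "bread"):
    score = association("drink", noun, prior, posterior, CLASSES)
    print("drink", noun, round(score, 3))
```
</pre>
On real data the pairs would come from the treebank extraction step, and the per-noun scores could then be compared against human ratings for the same verb-noun combinations.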

<br><br>This thread has been followed in various directions in the subsequent computational linguistics literature.  Two of the most useful references (esp. for pointers to related work) might be <br><br><div style="margin-left: 40px;">

Marc Light and Warren Greiff, Statistical models for the induction and use of selectional preferences, Cognitive Science, 2002, Vol. 26, No. 3, Pages 269-281<br></div><div style="margin-left: 40px;"><a href="http://www.mitre.org/work/tech_papers/tech_papers_02/greiff_statistical/greiff_statistical.pdf">

http://www.mitre.org/work/tech_papers/tech_papers_02/greiff_statistical/greiff_statistical.pdf</a><br></div><br>and <br><br><div style="margin-left: 40px;">

Brockmann, C. and Lapata, M. 2003. Evaluating and combining approaches to selectional preference acquisition. In <i>Proceedings of the Tenth Conference on European Chapter of the Association For Computational Linguistics - Volume 1</i> (Budapest, Hungary, April 12 - 17, 2003). European Chapter Meeting of the ACL. Association for Computational Linguistics, Morristown, NJ, 27-34.<br><a href="http://portal.acm.org/citation.cfm?id=1067813&dl=GUIDE">http://portal.acm.org/citation.cfm?id=1067813&dl=GUIDE</a><br></div><br>Hope this is helpful,<br><br>  Philip<br><br><br><div><span class="gmail_quote">

On 1/22/07, <b class="gmail_sendername">Jim Magnuson</b> &lt;<a href="mailto:james.magnuson@uconn.edu">james.magnuson@uconn.edu</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

I'm a psycholinguist rather than a computational linguist, with a "newbie" question. For some experiments, we need agent-verb-patient triples where the "goodness" of the agents and patients for the verb varies in strength.

<br>A typical way to develop materials for such studies is to have human<br>subjects rate how "good" various items are as agents and patients for<br>particular verbs (e.g., "how likely is a dog to walk?", "how likely

<br>is a dog to be walked?"). While this works well, it's of course very<br>labor (and subject) intensive. So I'm hoping to automate this.<br><br>I'm looking for recommendations for parsed corpora and tools to use

<br>(with the goal of getting this going ASAP).<br><br>I know about the Penn Treebank; are there better and/or less<br>expensive options for US English, or is this just the way to go?<br><br>I'm an okay perl programmer, and computer savvy; are there tools that

<br>would be helpful?<br><br>Thanks  very much,<br><br>jim<br><br><br></blockquote></div><br>