[Corpora-List] agent and patient probabilities

Tue Jan 23 23:00:40 UTC 2007

I worked on this topic some time back (wow, it's *quite* some time back now,
sheesh...) developing a computational model of verb-argument selectional
preferences and validating it using comparisons with human ratings of just
the kind you describe:

Philip Resnik, "Selectional Constraints: An Information-Theoretic Model and
its Computational Realization", Cognition, 61:127-159, November 1996.
http://scholar.google.com/scholar?q=author:%22Resnik%22%20intitle:%22Selectional%20constraints:%20an%20information-theoretic%20model%20and%20its%20computational%20realization%22

The model should be very straightforward to implement, and, in fact, I can
probably dig up some old tgrep query expressions that will allow you to pull
out verb-object and verb-subject triples from Penn Treebank constituency
trees in order to estimate the model parameters.
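
As a rough illustration of the estimation step, here is a minimal Python
sketch of the two quantities the model is built on: the selectional
preference strength of a verb (the relative entropy between P(class | verb)
and P(class)) and the selectional association between a verb and a class
(that class's share of the strength). It is not the original implementation.
The function name, the input format, and the noun-to-class dictionary are
all assumptions made for the example; in practice the classes would come
from a taxonomy such as WordNet, and the (verb, noun) pairs from the
extracted triples. Splitting each observation evenly among a noun's classes
is a simple way to handle nouns with several classes and should be treated
as an assumption too.

```python
import math
from collections import Counter, defaultdict


def selectional_association(pairs, word_classes):
    """Estimate preference strength and association scores.

    pairs        -- iterable of (verb, noun) observations (hypothetical format)
    word_classes -- dict mapping a noun to a list of class labels (hypothetical)

    Returns {verb: (strength, {class: association})}.
    """
    class_count = Counter()                 # weighted count of each class overall
    verb_class_count = defaultdict(Counter)  # weighted count of each class per verb

    for verb, noun in pairs:
        classes = word_classes.get(noun, [])
        for c in classes:
            # Assumption: split each observation evenly among the noun's classes.
            weight = 1.0 / len(classes)
            class_count[c] += weight
            verb_class_count[verb][c] += weight

    total = sum(class_count.values())
    result = {}
    for verb, counts in verb_class_count.items():
        verb_total = sum(counts.values())
        terms = {}
        for c, n in counts.items():
            p_c_given_v = n / verb_total
            p_c = class_count[c] / total
            # One term of the relative entropy D(P(c|v) || P(c)), in bits.
            terms[c] = p_c_given_v * math.log2(p_c_given_v / p_c)
        strength = sum(terms.values())
        # Association is each class's share of the verb's total strength.
        assoc = {c: (t / strength if strength > 0 else 0.0)
                 for c, t in terms.items()}
        result[verb] = (strength, assoc)
    return result
```

Called on a list of verb-object pairs, the function returns one entry per
verb; a higher association score for a class means the verb selects more
strongly for nouns in that class. The same code can be run separately on
verb-subject pairs to get agent-side scores.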

This thread has been followed in various directions in the subsequent
computational linguistics literature.  Two of the most useful references
(esp. for pointers to related work) might be

Marc Light and Warren Greiff, Statistical models for the induction and use
of selectional preferences, Cognitive Science, 2002, Vol. 26, No. 3, Pages
269-281
http://www.mitre.org/work/tech_papers/tech_papers_02/greiff_statistical/greiff_statistical.pdf

and

Brockmann, C. and Lapata, M. 2003. Evaluating and combining approaches to
selectional preference acquisition. In *Proceedings of the Tenth Conference
on European Chapter of the Association For Computational Linguistics -
Volume 1* (Budapest, Hungary, April 12 - 17, 2003). European Chapter Meeting
of the ACL. Association for Computational Linguistics, Morristown, NJ,
27-34.
http://portal.acm.org/citation.cfm?id=1067813&dl=GUIDE

Hope this is helpful,

  Philip

On 1/22/07, Jim Magnuson <james.magnuson at uconn.edu> wrote:
>
> I'm a psycholinguist rather than a computational linguist, with a
> "newbie" question.
>
> For some experiments, we need agent-verb-patient triples where the
> "goodness" of the agents and patients to the verb vary in strength.
> The typical way to develop materials for such studies is to have human
> subjects rate how "good" various items are as agents and patients for
> particular verbs (e.g., "how likely is a dog to walk?", "how likely
> is a dog to be walked?"). While this works well, it's of course very
> labor (and subject) intensive. So I'm hoping to automate this.
>
> I'm looking for recommendations for parsed corpora and tools to use
> (with the goal of getting this going ASAP).
>
> I know about the Penn Treebank; are there better and/or less
> expensive options for US English, or is this just the way to go?
>
> I'm an okay perl programmer, and computer savvy; are there tools that
> would be helpful?
>
> Thanks very much,
>
> jim
>
>
>