<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<tt>In my efforts to understand preposition behavior, I have
assembled two new corpora: (1) 7,500 sentences exemplifying each
preposition sense in The Preposition Project (TPP) (from Oxford, up
to 20 each, for 300 preps) and (2) 48,000 sentences constituting a
representative sample for 272 preps drawn from the BNC, with >=250
for 140 preps (these are not currently sense-tagged). These corpora add to the
one of 25,000+ created for the SemEval 2007 prep WSD task for the
34 most common preps. The BNC corpus was developed with the aid of
Patrick Hanks, with the intent of extending his corpus pattern
analysis from verbs to preps (in particular, to develop ontological
characterizations of prep complements and governors).<br>
<br>
Since analysis of these corpora clearly involves a great deal of
work, I want to make them available to the wider community in the
hopes of making more rapid progress in characterizing prep
behavior. I am trying to build on the considerable lexicographic
work that went into TPP, taking into account how these data
might be linked to FrameNet's frame elements (e.g., the FE
taxonomy) and to other substantial lexical resources (WordNet,
VerbNet, and PropBank). I expect this to require appropriate ML
technologies, dependency parsing, and linguistic insights. I
hope that this work will contribute substantially to research
in such NLP areas as QA, summarization, and RTE.<br>
<br>
More details are available at my web site on <a
href="http://www.clres.com/prepositions.html">TPP</a>, the <a
href="http://www.clres.com/cgi-bin/onlineTPP/find_prep.cgi">Online
TPP</a>, <a
href="http://www.clres.com/online-papers/NextTPPSteps.pdf">next
steps for TPP</a>, and <a
href="http://www.clres.com/online-papers/CPAPreps.pdf">corpus
pattern analysis for preps</a>. I am working to bring this
scattered material, along with the corpora, together in an easily
accessible repository. In the meantime, please direct your
comments and inquiries to me.<br>
<br>
Ken Litkowski<br>
</tt>
<pre class="moz-signature" cols="72">--
Ken Litkowski TEL.: 301-482-0237
CL Research EMAIL: <a class="moz-txt-link-abbreviated" href="mailto:ken@clres.com">ken@clres.com</a>
9208 Gue Road Home Page: <a class="moz-txt-link-freetext" href="http://www.clres.com">http://www.clres.com</a>
Damascus, MD 20872-1025 USA Blog: <a class="moz-txt-link-freetext" href="http://www.clres.com/blog">http://www.clres.com/blog</a>
</pre>
</body>
</html>