[Corpora-List] Variant verbal government extraction

Adam Kilgarriff adam at lexmasterclass.com
Fri Feb 23 04:23:51 UTC 2007


Mikhail,

 

The algorithm you want is:

In a large corpus
     For each verb
            Find how often it occurs in the pattern <VERB PRONOUN>
            Find how often it occurs in the pattern <VERB to PRONOUN>
            Compute a statistic to see how high both these numbers are,
            relative to the overall frequency of the verb
Sort verbs according to the statistic
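
A minimal Python sketch of the steps above, under several assumptions that are not part of the original message: the corpus is already tokenised, lemmatised and POS-tagged as (word, lemma, tag) triples; lexical verbs carry tags starting with "VV" and personal pronouns the tag "PNP" (as in the CLAWS5 tagset used for the BNC); and the statistic is simply the smaller of the two pattern counts divided by the verb's overall frequency. The function names are hypothetical, and any other statistic that is high only when both patterns are frequent would serve equally well.

```python
from collections import Counter

def count_patterns(tagged_sentences, verb_prefix="VV", pronoun_tag="PNP"):
    """Count verb frequency and the two patterns per verb lemma.

    tagged_sentences: iterable of sentences, each a list of
    (word, lemma, tag) triples. The tag names are assumptions.
    """
    verb_freq, direct, with_to = Counter(), Counter(), Counter()
    for sent in tagged_sentences:
        for i, (_word, lemma, tag) in enumerate(sent):
            if not tag.startswith(verb_prefix):
                continue
            verb_freq[lemma] += 1
            nxt = sent[i + 1] if i + 1 < len(sent) else None
            nxt2 = sent[i + 2] if i + 2 < len(sent) else None
            if nxt is not None and nxt[2] == pronoun_tag:
                # <VERB PRONOUN>
                direct[lemma] += 1
            elif (nxt is not None and nxt[0].lower() == "to"
                  and nxt2 is not None and nxt2[2] == pronoun_tag):
                # <VERB to PRONOUN>
                with_to[lemma] += 1
    return verb_freq, direct, with_to

def variation_score(total, n_direct, n_to):
    """Assumed statistic: high only if BOTH patterns are frequent
    relative to the verb's overall frequency."""
    if total == 0:
        return 0.0
    return min(n_direct, n_to) / total

def rank_verbs(tagged_sentences, min_freq=1):
    """Return (lemma, score) pairs sorted by descending score."""
    verb_freq, direct, with_to = count_patterns(tagged_sentences)
    scores = {
        verb: variation_score(freq, direct[verb], with_to[verb])
        for verb, freq in verb_freq.items()
        if freq >= min_freq
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data for illustration only; not taken from any real corpus.
sample = [
    [("I", "I", "PNP"), ("wrote", "write", "VVD"), ("you", "you", "PNP")],
    [("I", "I", "PNP"), ("wrote", "write", "VVD"), ("to", "to", "PRP"),
     ("you", "you", "PNP")],
    [("I", "I", "PNP"), ("saw", "see", "VVD"), ("you", "you", "PNP")],
    [("I", "I", "PNP"), ("saw", "see", "VVD"), ("him", "he", "PNP")],
]
ranked = rank_verbs(sample)
```

On real data a min_freq threshold well above 1 would probably be needed, since rare verbs give unstable ratios.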

 

Now you have a starter set for examining which verbs show the behaviour you
want to investigate.

 

All the relevant frequencies are available for, e.g., the BNC in the Sketch
Engine <http://www.sketchengine.co.uk/>, where you can define the patterns in
CQL (Corpus Query Language, from Stuttgart University).  We don't currently
have a nice web interface for robots, but will have one shortly.  In the
meantime, ask us and we can set things up to help you (e.g. by allowing you
robot access, after which you would need to scrape the web pages).
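
As a rough illustration of what the two patterns might look like in CQL, here is a small Python helper that builds the query strings for one verb lemma. The attribute names (lemma, tag, word) and the tag values ("VV." for lexical verbs, "PNP" for personal pronouns, as in CLAWS5) are assumptions; the actual names depend on how the corpus in question is annotated, and the helper itself is hypothetical.

```python
def cql_patterns(lemma, verb_tag="VV.", pronoun_tag="PNP"):
    """Build CQL query strings for the two patterns.

    Attribute values in CQL are regular expressions, so "VV."
    matches any three-character tag starting with VV.
    All tag and attribute names here are assumptions.
    """
    verb = f'[lemma="{lemma}" & tag="{verb_tag}"]'
    pron = f'[tag="{pronoun_tag}"]'
    return {
        "direct": f"{verb} {pron}",                 # <VERB PRONOUN>
        "with_to": f'{verb} [word="to"] {pron}',    # <VERB to PRONOUN>
    }

queries = cql_patterns("write")
```

Each string can then be pasted into the CQL query box of a concordancer and the hit counts compared with the overall frequency of the verb.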

 

Regards,

 

            Adam

 

 

-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
Behalf Of Mikhail Kopotev
Sent: 22 February 2007 13:15
Cc: CORPORA at UIB.NO
Subject: [Corpora-List] Variant verbal government extraction

 

Dear all,

Does anyone know how to recognize and extract variations of verbal
government, such as "to write you / to you", from a corpus?

Since I am interested in Russian morphosyntactic changes, I would like
you to point me to any tools or methods, rather than obtained results,
concerning English or any other irrelevant languages.

Many thanks,

Mikhail Kopotev
Researcher
Department of Slavonic
and Baltic Languages and Literatures
University of Helsinki