[Corpora-List] Corpora Digest, Vol 72, Issue 15

yuyuan97 yuyuan97 at 163.com
Mon Jun 17 11:16:19 UTC 2013


Indeed, if none of the above-mentioned approaches works for your purpose, you can have your corpora POS-tagged (if they are not already) and then quite conveniently retrieve verb + particle sequences with regular expressions in text editors such as PowerGREP or EditPad Pro, or in regex-supporting concordancers such as AntConc.
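
To illustrate the regular-expression route, here is a minimal Python sketch. It rests on assumptions not stated in the thread: that the tagger writes tokens as word_TAG pairs separated by whitespace, and that it uses Penn Treebank tags (VB* for verbs, RP for particles, IN for prepositions). The function name extract_verb_particles is only illustrative; adapt the pattern to whatever your tagger actually produces.

```python
import re
from collections import Counter

# Assumed input format: whitespace-separated "word_TAG" tokens with
# Penn Treebank tags (VB, VBD, VBG, VBN, VBP, VBZ = verbs; RP = particle;
# IN = preposition). Change the tag names if your tagset differs.
VERB_PARTICLE = re.compile(
    r"\b([A-Za-z'-]+)_(VB[DGNPZ]?)\s+([A-Za-z]+)_(RP|IN)\b"
)

def extract_verb_particles(tagged_text):
    """Count lower-cased (verb, particle) pairs that are directly adjacent."""
    pairs = Counter()
    for match in VERB_PARTICLE.finditer(tagged_text):
        verb, _verb_tag, particle, _particle_tag = match.groups()
        pairs[(verb.lower(), particle.lower())] += 1
    return pairs

sample = ("She_PRP gave_VBD up_RP smoking_NN and_CC "
          "looked_VBD after_IN the_DT dog_NN ._.")
for (verb, particle), count in extract_verb_particles(sample).most_common():
    print(verb, particle, count)
```

Note that this pattern only catches a particle immediately following the verb, so split constructions such as "gave it up" are missed; allowing a few intervening tokens in the expression would pick those up at the cost of more false hits.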
From: corpora-request
Date: 2013-06-17 18:00
To: corpora
Subject: Corpora Digest, Vol 72, Issue 15
Today's Topics:

   1. Re:  Looking for corpus tool/software to extract phrasal
      verbs (Phil Gooch)


----------------------------------------------------------------------

Message: 1
Date: Mon, 17 Jun 2013 10:57:54 +0100
From: Phil Gooch <philgooch at gmail.com>
Subject: Re: [Corpora-List] Looking for corpus tool/software to
extract phrasal verbs
To: "corpora at uib.no" <corpora at uib.no>

If you use GATE, you could try the predicate phrase chunker at
https://github.com/philgooch/BioPred--Biomedical-Predicate-VerbGroup-Chunker
Although this was developed for biomedical texts, it might work in other
contexts. You can then use the GATE OrthoMatcher to extract co-occurrences.

Otherwise, if you are using Python, you could write some rules using the
search module of the CLiPS Pattern library,
http://www.clips.ua.ac.be/pages/pattern-search

Phil


On Mon, Jun 17, 2013 at 9:48 AM, LEUNG, Maggie SN [12901991r] <
maggie.sn.leung at connect.polyu.hk> wrote:

>  Dear Corpora-List users,
>
>  I would like to have your recommendation on any corpus linguistics tool
> or software which can be used to automatically extract all co-occurrences
> of any verb + particle combinations (irrespective of whether a particle is
> adverbial or prepositional) from a corpus. The corpus is about 9 million
> words.
>
>
>  Thanks and best regards,
> Maggie
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>
End of Corpora Digest, Vol 72, Issue 15
***************************************