[Corpora-List] Sentence Initial Constituents
Adam Kilgarriff
adam at lexmasterclass.com
Fri Nov 23 09:53:10 UTC 2007
Tom,
1. Get a Sketch Engine account (http://www.sketchengine.co.uk )
2. Load your corpora using the CorpusBuilder facility
Choose the template to enable TreeTagger
(which does sentence-breaking)
3. Search (in the CQL box under Concordance/Keyword) for
<s> ".*"
(i.e. new sentence followed by any word)
4. Use the "Frequency" button to get a frequency list of the words
matched
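For anyone who would rather do the same count offline, here is a rough Python sketch of steps 3 and 4. It assumes the tagged corpus is in one-token-per-line "vertical" format (word, tag and lemma separated by tabs) with sentences delimited by <s> and </s> lines; the function name, the sample data and that layout are illustrative assumptions, not something Sketch Engine or TreeTagger guarantees for your files. It keeps the POS tag alongside each sentence-initial word, which may give a first rough cue as to whether the sentence opens with a likely subject or with something else.

```python
from collections import Counter


def sentence_initial_counts(vertical_lines):
    """Count (lowercased word, tag) pairs for the first token after each <s>."""
    counts = Counter()
    at_sentence_start = False
    for raw in vertical_lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue
        # Structural lines look like <s>, </s>, <doc ...> and hold no tab.
        is_structure = (
            line.startswith("<") and line.endswith(">") and "\t" not in line
        )
        if is_structure:
            if line == "<s>" or line.startswith("<s "):
                at_sentence_start = True
            continue
        if at_sentence_start:
            fields = line.split("\t")
            word = fields[0]
            tag = fields[1] if len(fields) > 1 else ""
            counts[(word.lower(), tag)] += 1
            at_sentence_start = False
    return counts


# Hypothetical sample in the assumed word<TAB>tag<TAB>lemma layout.
sample = [
    "<s>", "The\tDT\tthe", "cat\tNN\tcat", "sleeps\tVVZ\tsleep", ".\tSENT\t.", "</s>",
    "<s>", "Yesterday\tRB\tyesterday", "it\tPP\tit", "rained\tVVD\train", ".\tSENT\t.", "</s>",
    "<s>", "The\tDT\tthe", "dog\tNN\tdog", "barks\tVVZ\tbark", ".\tSENT\t.", "</s>",
]

counts = sentence_initial_counts(sample)
for (word, tag), n in counts.most_common():
    print(f"{n}\t{word}\t{tag}")
```

To run it on a real file, pass an open file object instead of the sample list. Grouping the output by tag alone, rather than by word and tag, would show how often sentences open with a determiner or pronoun as against an adverb or preposition, though that is only a proxy for constituent type and still needs manual checking.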
Adam
-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
Tom Rankin
Sent: 22 November 2007 14:36
To: CORPORA at uib.no
Subject: [Corpora-List] Sentence Initial Constituents
Dear all,
I need to find all the sentence initial constituents in a number of
smallish corpora (each c. 200k words) - problem is i want to know if
they are subjects or some other constituent, so i can't just use a
list of words. corpora are tagged but not parsed (and aren't likely
to be). do i just have to do lots of manual sorting of concordances
with full stops or can someone suggest a better labour saving idea??
cheers
tom
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora