[Corpora-List] Sentence Initial Constituents
    Adam Kilgarriff 
    adam at lexmasterclass.com
       
    Fri Nov 23 09:53:10 UTC 2007
    
    
  
Tom,
1.	Get a Sketch Engine account (http://www.sketchengine.co.uk ) 
2.	Load your corpora using the CorpusBuilder facility
	  Choose the template to enable TreeTagger
	   (which does sentence-breaking)
3.	Search (in the CQL box under Concordance/Keyword) for 
		<s> ".*"
	(i.e. the start of a new sentence followed by any word)
4.	Use the "Frequency" button to get a frequency list of the words
matched
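
If you would rather do the same count offline, steps 3 and 4 can be roughly sketched in Python. This is only an illustration under assumptions, not anything Sketch Engine provides: it assumes your tagged corpus is in a "vertical" file with one token per line (word, tab, POS tag, optionally more columns) and each sentence opened by a line containing <s>. The function name and the sample data are made up for the example. Counting the tag of the first token as well as the word gives a rough first cut at "subject or not", but it is no substitute for a parse.

```python
from collections import Counter


def is_markup(line):
    # Structural lines such as <s>, </s> or <doc id="1"> carry no tab-separated columns.
    return line.startswith("<") and line.endswith(">") and "\t" not in line


def sentence_initial_counts(lines):
    """Return (word_counts, tag_counts) for the first token after each <s>."""
    word_counts = Counter()
    tag_counts = Counter()
    at_sentence_start = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if is_markup(line):
            # Only an opening <s> (with or without attributes) starts a sentence.
            if line == "<s>" or line.startswith("<s "):
                at_sentence_start = True
            continue
        if at_sentence_start:
            fields = line.split("\t")
            word = fields[0]
            tag = fields[1] if len(fields) > 1 else ""
            word_counts[word] += 1
            tag_counts[tag] += 1
            at_sentence_start = False
    return word_counts, tag_counts


# Hypothetical sample in the assumed vertical format (word<TAB>tag<TAB>lemma).
sample = [
    "<s>", "The\tDT\tthe", "cat\tNN\tcat", "sat\tVVD\tsit", ".\tSENT\t.", "</s>",
    "<s>", "Yesterday\tRB\tyesterday", "it\tPP\tit", "rained\tVVD\train", ".\tSENT\t.", "</s>",
    "<s>", "The\tDT\tthe", "dog\tNN\tdog", "barked\tVVD\tbark", ".\tSENT\t.", "</s>",
]

word_counts, tag_counts = sentence_initial_counts(sample)
for word, n in word_counts.most_common():
    print(n, word)
for tag, n in tag_counts.most_common():
    print(n, tag)
```

To run it on a real file, pass the open file object instead of the sample list, e.g. sentence_initial_counts(open("corpus.vert", encoding="utf-8")), where the file name is again just a placeholder.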
Adam
-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
Tom Rankin
Sent: 22 November 2007 14:36
To: CORPORA at uib.no
Subject: [Corpora-List] Sentence Initial Constituents
Dear all,
I need to find all the sentence-initial constituents in a number of 
smallish corpora (each c. 200k words). The problem is that I want to know 
whether they are subjects or some other constituent, so I can't just use a 
list of words. The corpora are tagged but not parsed (and aren't likely 
to be). Do I just have to do lots of manual sorting of concordances 
with full stops, or can someone suggest a better labour-saving idea?
cheers
tom
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora