[Corpora-List] Considering Distributions Across Texts

Don Tuggener tuggener at cl.uzh.ch
Mon Mar 3 11:02:26 UTC 2014


Hi Brian,

I'm guessing you're looking for tests that help you assess the statistical significance of your query results?
A good starting point may be:
Gries, Stefan Th. 2010. Useful statistics for corpus linguistics. In Aquilino Sánchez & Moisés Almela (eds.), A mosaic of corpus linguistics: selected approaches, 269-291. Frankfurt am Main: Peter Lang.
(http://www.linguistics.ucsb.edu/faculty/stgries/research/overview-research.html)

Best,
Don

On Mon, 03 Mar 2014 11:28:35 +0100
corpora-request at uib.no wrote:

> Message: 3
> Date: Fri, 28 Feb 2014 11:16:11 -0500
> From: Brian Schanding <bschanding at gmail.com>
> Subject: [Corpora-List] Considering Distributions Across Texts
> To: corpora at uib.no
> 
> Hello,
> 
> I'm working on research with learner corpora. My corpora aren't that big
> (approx. 250,000 words across about 300-400 text files). I wonder what
> research/textbook sources anyone can point me to that discuss the
> importance of considering how many texts in the corpus a language feature
> occurs in (as opposed to merely considering overall frequency of a language
> feature within a corpus).
> 
> Many Thanks!
> Brian
