[Corpora-List] Help in Applying Appropriate Statistical Test and Its Interpretation

Stefan Th. Gries stgries at gmail.com
Tue Jun 29 16:34:21 UTC 2010


> the null-hypothesis testing you discuss here doesn't work in corpus linguistics - for the argument, see "Language is never ever ever random" <http://kilgarriff.co.uk/Publications/2005-K-lineer.pdf>, 2005, *Corpus Linguistics and Linguistic Theory* 1 (2): 263-276.
and cf. the following article in that journal for a rebuttal (which
shows that, once p-values are corrected for multiple testing, as they
should be anyway, they sometimes do *exactly* what they're supposed to
do).
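
To make "corrected for multiple testing" concrete, here is a minimal sketch of one common correction, the Holm step-down procedure. The post does not say which correction the rebuttal uses, so the choice of Holm is an assumption, and the function name and the p-values are purely hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order.

    Illustrative only: Holm is one of several possible corrections
    (an assumption, not necessarily the one used in the rebuttal).
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease along the order.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from four word-frequency comparisons.
raw = [0.001, 0.04, 0.03, 0.20]
adj = holm_adjust(raw)
significant = [p < 0.05 for p in adj]
```

With these made-up inputs, three of the four raw p-values are below 0.05, but only the smallest one stays below 0.05 after adjustment.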

> My rule of thumb is: it only counts if the ratio (of normalised frequencies) is greater than/less than a factor of two between two text types
and how is that rule of thumb better than this other little rule of
thumb that's been around in the sciences, the one that I think was
something like "p<0.05"? ;-))) And what is the basis for the proposed
rule of thumb?

(NB: I am not saying effect sizes are unimportant, just that bashing
p-values for a shortcoming they exhibit when they are applied
incorrectly is not exactly useful/wise ...)
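
As a rough illustration of how the two rules of thumb can disagree, the sketch below computes both the ratio of normalised frequencies and an uncorrected p-value for the same counts. Everything in it is an assumption made for the example: the counts are invented, the function name is hypothetical, and a Pearson chi-square test on a 2x2 table (without continuity correction) is just one of several tests one might pick.

```python
import math


def compare(freq_a, size_a, freq_b, size_b):
    """Return (ratio of normalised frequencies, chi-square, p-value).

    Assumes freq_b > 0 and that both corpus sizes exceed the word counts.
    """
    # Normalised frequencies per million words.
    norm_a = freq_a / size_a * 1e6
    norm_b = freq_b / size_b * 1e6
    ratio = norm_a / norm_b

    # 2x2 table: word vs. all other tokens, corpus A vs. corpus B.
    table = [[freq_a, size_a - freq_a], [freq_b, size_b - freq_b]]
    n = size_a + size_b
    rows = [size_a, size_b]
    cols = [freq_a + freq_b, n - freq_a - freq_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2))
    return ratio, chi2, p


# Invented counts: a rare word (3 vs. 1 hits) and a frequent word
# (12,000 vs. 10,000 hits), each in two corpora of one million words.
rare_ratio, rare_chi2, rare_p = compare(3, 1_000_000, 1, 1_000_000)
freq_ratio, freq_chi2, freq_p = compare(12_000, 1_000_000, 10_000, 1_000_000)
```

With these invented numbers, the rare word passes the factor-of-two rule (ratio 3) but is nowhere near p<0.05, while the frequent word fails the factor-of-two rule (ratio 1.2) yet has a very small p-value. Which of the two criteria is more informative in such cases is precisely what is being argued about here.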

STG
--
Stefan Th. Gries
-----------------------------------------------
University of California, Santa Barbara
http://www.linguistics.ucsb.edu/faculty/stgries
-----------------------------------------------
