[Corpora-List] Statistical tests for corpus studies
Adam Kilgarriff
adam.kilgarriff at itri.brighton.ac.uk
Wed May 7 09:45:19 UTC 2003
Josephine,
chi-square will probably not give you what you want, nor will
log-likelihood - my paper on "Comparing Corpora" (Int Jnl Corpus
Linguistics 2001) explains why. Non-parametric tests are more suitable;
I found the Mann-Whitney test did the job well. It involves chopping
each corpus up into same-size slices and comparing the word's
per-slice counts across the two corpora.
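
For illustration, here is a minimal sketch of the slicing idea in Python,
assuming each corpus is already a list of tokens. The function names
(slice_counts, average_ranks, mann_whitney_u) and the choice of slice size
are hypothetical and not taken from the paper. The p-value uses the plain
normal approximation with no tie or continuity correction, so it is only
rough, especially with few slices.

```python
import math


def slice_counts(tokens, target, slice_size):
    """Count occurrences of `target` in consecutive equal-size slices.

    A trailing partial slice is dropped so every slice has the same size.
    """
    n_slices = len(tokens) // slice_size
    return [
        tokens[i * slice_size:(i + 1) * slice_size].count(target)
        for i in range(n_slices)
    ]


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U, p): U = min(U_a, U_b), p = rough two-sided p-value.

    Assumption: normal approximation without tie or continuity correction.
    """
    n_a, n_b = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    u = min(u_a, u_b)
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, p
```

With two token lists, one would call slice_counts on each with the same
word and slice size, then pass the two lists of counts to mann_whitney_u.
A small U relative to (number of slices in A) x (number of slices in B)
suggests the word's per-slice frequency differs between the corpora.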
Regards,
Adam
Josephine Lo wrote:
> Dear all,
>
> As a layman in statistics, I would like some advice on which tests are
> suitable for comparing the frequency of a specific type of word across
> corpora of different genres. I have Chi-square and ANOVA in mind,
> but I'm not sure they are the appropriate ones.
>
> Thanks in advance
>
>
> Josephine Lo
> Research Assistant
> Dept. of English and Communication
> City University of Hong Kong
>
--
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Adam Kilgarriff
ITRI, University of Brighton tel: (44) 1273 642919
Lewes Road, Brighton BN2 4GJ, UK fax: (44) 1273 642908
adam at itri.bton.ac.uk http://www.itri.bton.ac.uk/~Adam.Kilgarriff
and
Lexicography MasterClass Ltd.
71 Freshfield Road, Brighton BN2 0BL, UK tel: (44) 1273 705773
adam at lexmasterclass.com http://www.lexmasterclass.com
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%