[Corpora-List] Chi-Square
Marco Baroni
baroni at sslmit.unibo.it
Sun Sep 17 12:17:05 UTC 2006
You can see a comparison of chi-square and the log-likelihood ratio in this
famous paper, which I think was very influential in giving the chi-square
test a bad name:
T. Dunning, "Accurate Methods for the Statistics of Surprise and
Coincidence," Computational Linguistics 19(1), 1993.
http://citeseer.ist.psu.edu/dunning93accurate.html
The paper is quite mathematical, but the basic idea and the empirical
comparison part should be quite clear... (although the alternative to
chi-square should be something like the log-likelihood ratio test, not
mutual information (MI), which has the same problem as the chi-square test:
it overestimates the significance of the co-occurrence of rare words...)
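To make the rare-word problem concrete, here is a minimal sketch that computes the three measures on a 2x2 contingency table for a word pair. The function names and the counts are my own illustrative assumptions (they are not taken from Dunning's paper), and the formulas are the standard textbook ones: Pearson chi-square, the log-likelihood ratio statistic G2, and pointwise MI.

```python
import math

# Table layout (all counts are hypothetical, for illustration only):
#   k11 = pairs where w1 and w2 co-occur
#   k12 = pairs with w1 but not w2
#   k21 = pairs with w2 but not w1
#   k22 = pairs with neither
# All four marginal totals are assumed to be non-zero.


def chi_square(k11, k12, k21, k22):
    """Pearson chi-square statistic for a 2x2 table."""
    n = k11 + k12 + k21 + k22
    r1, r2 = k11 + k12, k21 + k22
    c1, c2 = k11 + k21, k12 + k22
    return n * (k11 * k22 - k12 * k21) ** 2 / (r1 * r2 * c1 * c2)


def log_likelihood_ratio(k11, k12, k21, k22):
    """G2 = 2 * sum(observed * ln(observed / expected)); empty cells add 0."""
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:
                expected = rows[i] * cols[j] / n
                total += o * math.log(o / expected)
    return 2.0 * total


def pointwise_mi(k11, k12, k21, k22):
    """Pointwise mutual information (in bits) of the co-occurrence cell."""
    n = k11 + k12 + k21 + k22
    return math.log2(k11 * n / ((k11 + k12) * (k11 + k21)))


if __name__ == "__main__":
    # Two words that each occur once, together, in 100000 pairs.
    rare = (1, 0, 0, 99999)
    print("chi-square:", chi_square(*rare))
    print("G2:        ", log_likelihood_ratio(*rare))
    print("PMI (bits):", pointwise_mi(*rare))
```

With these made-up counts, a single co-occurrence gives a chi-square equal to the whole sample size (100000), while G2 stays at roughly 25. Pointwise MI likewise reaches its maximum for this sample size, which is the overestimation for rare words mentioned above.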
Regards,
Marco