[Corpora-List] Chi-Square

Marco Baroni baroni at sslmit.unibo.it
Sun Sep 17 12:17:05 UTC 2006


You can find a comparison of chi-square and the log-likelihood ratio in
this famous paper, which I think was very influential in giving the
chi-square test a bad name:

T. Dunning, "Accurate Methods for the Statistics of Surprise and 
Coincidence," Computational Linguistics 19(1), 1993.
http://citeseer.ist.psu.edu/dunning93accurate.html

The paper is quite mathematical, but the basic idea and the empirical
comparison part should be quite clear. (Note, though, that the alternative
to chi-square should be something like the log-likelihood ratio test, not
MI, which has the same problem as the chi-square test: it overestimates
the significance of the co-occurrence of rare words.)
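
To make the rare-word problem concrete, here is a minimal sketch in Python
that computes the three measures from a 2x2 contingency table of
co-occurrence counts. The function names and the toy counts (two words that
each occur once, together, in a 100,000-token corpus) are my own
assumptions for illustration; they are not taken from Dunning's paper.

```python
import math


def expected_counts(o11, o12, o21, o22):
    """Expected cell counts under independence, from the table margins."""
    n = o11 + o12 + o21 + o22
    r1, r2 = o11 + o12, o21 + o22  # row totals
    c1, c2 = o11 + o21, o12 + o22  # column totals
    return [r1 * c1 / n, r1 * c2 / n, r2 * c1 / n, r2 * c2 / n]


def chi_square(o11, o12, o21, o22):
    """Pearson chi-square statistic for a 2x2 table (no continuity correction)."""
    obs = [o11, o12, o21, o22]
    exp = expected_counts(o11, o12, o21, o22)
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


def log_likelihood_ratio(o11, o12, o21, o22):
    """Log-likelihood ratio statistic G2; empty cells contribute zero."""
    obs = [o11, o12, o21, o22]
    exp = expected_counts(o11, o12, o21, o22)
    return 2 * sum(o * math.log(o / e) for o, e in zip(obs, exp) if o > 0)


def pointwise_mi(o11, o12, o21, o22):
    """Pointwise mutual information (in bits) of the co-occurrence cell."""
    e11 = expected_counts(o11, o12, o21, o22)[0]
    return math.log2(o11 / e11)


if __name__ == "__main__":
    # Hypothetical toy table: o11 = the pair occurs once, o12 = o21 = 0
    # (neither word occurs without the other), o22 = all remaining pairs.
    table = (1, 0, 0, 99999)
    print("chi-square:", chi_square(*table))
    print("G2:        ", log_likelihood_ratio(*table))
    print("MI (bits): ", pointwise_mi(*table))
```

With this toy table, chi-square comes out equal to the corpus size
(100,000) on the strength of a single co-occurrence, and MI reaches
log2(100,000), about 16.6 bits, while the log-likelihood ratio statistic is
only about 25.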


Regards,

Marco


