[Corpora] [Corpora-List] Calculating statistical significant

Sérgio Matos aleixomatos at ua.pt
Mon Nov 10 17:14:54 UTC 2014


If you have the outputs of both systems on each test instance, you could try bootstrap resampling, as done here: http://genomebiology.com/2008/9/S2/S2
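
For illustration, here is a minimal sketch of one common variant (a paired bootstrap over test instances). It is not necessarily the exact procedure used in the paper linked above. It assumes the measure is accuracy and that you have, for each instance, a 1/0 flag saying whether each system got it right. The function name paired_bootstrap and its parameters are made up for this example.

```python
import random


def paired_bootstrap(correct_a, correct_b, n_resamples=10000, seed=0):
    """Paired bootstrap comparison of two systems on the same test instances.

    correct_a, correct_b: lists of 1/0 flags (hypothetical input format),
    aligned so that position i refers to the same test instance.
    Returns (observed accuracy difference A - B,
             fraction of resamples in which A does not beat B).
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both systems must be scored on the same instances")
    n = len(correct_a)
    if n == 0:
        raise ValueError("need at least one instance")

    rng = random.Random(seed)
    # Per-instance difference: +1 if only A is right, -1 if only B is right.
    diffs = [a - b for a, b in zip(correct_a, correct_b)]
    observed = sum(diffs) / n

    not_better = 0
    for _ in range(n_resamples):
        # Draw n instances with replacement, keeping the A/B pairing intact.
        resampled_sum = sum(diffs[rng.randrange(n)] for _ in range(n))
        if resampled_sum <= 0:
            not_better += 1

    return observed, not_better / n_resamples
```

The second return value is often read as an approximate one-sided p-value for "A is better than B". For a measure such as F-score, which does not decompose into per-instance terms, you would instead recompute the measure for both systems on each resampled set of instances.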


--  
Sérgio Matos
IEETA
Universidade de Aveiro



On Monday 10 November 2014 at 17:04, Jacob Eisenstein wrote:

> > That depends very much on what your task looks like.  It might be easiest – and is often done in computational linguistics – to carry out a ten-fold cross-validation and apply a paired t-test to the quality measure of your choice (e.g. F-score).  To be precise, sample A would be the F-scores achieved by Sys 1 across the ten folds, and sample B the F-scores achieved by Sys 2 on _exactly the same_ folds (and in _exactly the same order_).
>  
> This seems overly conservative to me. Suppose there is a lot of variance across the folds, but system 1 does exactly 0.5% better than system 2 on every fold. It seems like what you want to do is a t-test on the difference in performance.
>  
> That said, there are definitely machine learning / stats papers that argue against computing variance across cross-validation folds. I can't find the exact reference I'm thinking of, but the related work section of Demsar (JMLR 2006) seems like a useful starting point.
> http://machinelearning.wustl.edu/mlpapers/paper_files/Demsar06.pdf
>  
> > For a tagging task evaluated in terms of accuracy, you can apply McNemar's test to the output of the two systems.  The samples correspond to all tokens in the test set, and the observed values are (i) whether Sys 1 is correct on this token and (ii) whether Sys 2 is correct on this token.
>  
> One could also apply a sign test in this case, which I personally find easier to understand. The trouble is that you may not have access to Sys 2's outputs on each instance (suppose you only know its reported accuracy); in this case, you can't apply the sign test or McNemar's test.
>  
> -Jacob
>  
>   
> >  
> > Hope this helps,
> > Stefan
> >  
> >  
> >  
> > _______________________________________________
> > UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> > Corpora mailing list
> > Corpora at uib.no (mailto:Corpora at uib.no)
> > http://mailman.uib.no/listinfo/corpora
>  

