[Corpora-List] Question about evaluation

Lars Buitinck L.J.Buitinck at uva.nl
Mon Dec 3 11:33:19 UTC 2012


2012/12/3  <corpora-request at uib.no>:
> Date: Sun, 2 Dec 2012 17:13:55 -0500
> From: Emad Mohamed <emohamed at umail.iu.edu>
> Subject: [Corpora-List] Question about evaluation
> To: "corpora at uib.no" <corpora at uib.no>
>
> Hello Corpora members,
> I have a corpus of 80,000 words in which each word is assigned either the
> class S or the class E. Class S occurs 72,000 times while class E occurs
> 8,000 times only.
> I'm wondering what the best way to evaluate the classifier performance
> should be. I have randomly selected a dev set (5%) and a test set (10%).

The most common evaluation metric for classification with skewed class
distributions is F1-score:

    F1 = 2 * P * R / (P + R)

where P and R are precision and recall, as defined in the webpage you
linked to. Note that F1 is computed relative to one class designated
as "positive", and with a distribution as skewed as yours the score
is *not* the same for the two choices; use E, the class you care
about, as the positive class.
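As a concrete illustration, here is a minimal sketch (plain Python, with made-up confusion counts) of computing precision, recall, and F1 from the counts of true positives, false positives, and false negatives:

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from confusion counts for the
    class treated as "positive"."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    f1 = 2 * p * r / (p + r)
    return p, r, f1

# Hypothetical counts with E as the positive class:
# 600 E tokens correctly found, 200 spurious, 200 missed.
p, r, f1 = precision_recall_f1(tp=600, fp=200, fn=200)
print(p, r, f1)  # 0.75 0.75 0.75
```

The counts here are purely illustrative; in practice you would tally them over your held-out test set.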

Accuracy is another single-figure summary of classifier performance,
but for this kind of problem, it's no good. You can get 90% accuracy
by just predicting S all the time.
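To make that concrete, a quick check (plain Python, using the class counts from your corpus) of what a trivial always-S classifier scores:

```python
# Class counts from the corpus: 72,000 S vs. 8,000 E.
n_s, n_e = 72_000, 8_000

# A classifier that always predicts S gets every S token right
# and every E token wrong.
accuracy = n_s / (n_s + n_e)
print(accuracy)  # 0.9

# Its recall on class E is zero, so its F1 for E is zero as well.
recall_e = 0 / n_e
print(recall_e)  # 0.0
```

This is why accuracy looks flattering here while the E-oriented metrics expose the failure.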

> I'm mainly interested in predicting which words are class E.

If you want a more detailed evaluation, you might compute recall and
precision separately in addition to F1 score, with E as the "positive"
class. However, recall and precision each measure the absence of only
one type of error (false positives for precision, false negatives for
recall) while F1 score takes both into account.
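A sketch of that per-class breakdown, computed from raw label sequences (plain Python; the y_true/y_pred lists are hypothetical toy data):

```python
def per_class_metrics(y_true, y_pred, positive):
    """Precision and recall with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Toy gold and predicted labels, with E as the positive class.
y_true = ["S", "S", "E", "E", "S", "E"]
y_pred = ["S", "E", "E", "S", "S", "E"]
p, r = per_class_metrics(y_true, y_pred, positive="E")
print(p, r)  # both 2/3 here: one false positive, one false negative
```

Reporting precision and recall separately like this shows *which* kind of error dominates, which the single F1 number hides.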

HTH!

-- 
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam

_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora
