[Corpora-List] Question about evaluation

Alexander Osherenko osherenko at gmx.de
Mon Dec 3 08:01:38 UTC 2012


Dear Emad,

You would need a measure averaged over classes -- for example, for the
recall value, the number of correctly classified instances of a class
divided by the overall number of instances of that class, averaged over
both classes.
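
As a rough illustration, here is a minimal sketch of recall averaged over
classes (often called macro-averaged recall), set next to plain accuracy.
The function names and the toy data are assumptions made up for this
example; only the 9:1 ratio of S to E mirrors the corpus described in the
question. It shows what happens to each measure for a classifier that
always predicts S.

```python
from collections import Counter


def per_class_recall(gold, predicted):
    """Recall per class: correct predictions for a class / gold instances of that class."""
    gold_counts = Counter(gold)
    correct = Counter(g for g, p in zip(gold, predicted) if g == p)
    return {label: correct[label] / gold_counts[label] for label in gold_counts}


def macro_recall(gold, predicted):
    """Unweighted mean of the per-class recall values."""
    recalls = per_class_recall(gold, predicted)
    return sum(recalls.values()) / len(recalls)


def accuracy(gold, predicted):
    """Correctly classified instances / all instances, regardless of class."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Hypothetical toy data with the same 9:1 S/E ratio as in the question.
gold = ["S"] * 90 + ["E"] * 10
always_s = ["S"] * 100  # a trivial classifier that never predicts E

print(accuracy(gold, always_s))          # 0.9
print(per_class_recall(gold, always_s))  # {'S': 1.0, 'E': 0.0}
print(macro_recall(gold, always_s))      # 0.5
```

With classes this skewed, accuracy rewards the trivial classifier, while
the per-class and averaged recall values make the failure on class E
visible.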

Alexander

2012/12/2 Emad Mohamed <emohamed at umail.iu.edu>

> Hello Corpora members,
> I have a corpus of 80,000 words in which each word is assigned either the
> class S or the class E. Class S occurs 72,000 times while class E occurs
> 8,000 times only.
> I'm wondering what the best way to evaluate the classifier performance
> should be. I have randomly selected a dev set (5%) and a test set (10%).
> I'm mainly interested in predicting which words are class E.
>
> I've read this page:
> webdocs.cs.ualberta.ca/~eisner/measures.html
> but I'm still a little bit confused. Do we use specificity in linguistics
> papers? Should I report these measures for each of the two classes or as
> a general number? Does this make sense / make a difference?
>
> Thank you so much.
>
> --
> Emad Mohamed
> aka Emad Nawfal
> Université du Québec à Montréal
>
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>


-- 
Alexander Osherenko
Dr. rer. nat, CEO and R&D
<http://www.socioware.de/>