[Corpora-List] Decision tree : maximise recall over precision

Eric Atwell eric at comp.leeds.ac.uk
Tue Apr 21 15:20:18 UTC 2009


Emmanuel,

Surely a good decision procedure is "JUST SAY NO!" - "only" 99.9% accurate! 
I wish PoS-taggers and other text annotation tools were as good!
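
To put numbers on the "JUST SAY NO" baseline, here is a minimal sketch with a toy set of 1 "Yes" and 999 "No" items (the 0.1% / 99.9% split from the question). The helper name accuracy_and_recall and the toy data are illustrative assumptions, not anything taken from WEKA.

```python
def accuracy_and_recall(gold, predicted, positive="Yes"):
    """Return (accuracy, recall on the positive class) for two label lists."""
    pairs = list(zip(gold, predicted))
    correct = sum(g == p for g, p in pairs)
    true_pos = sum(g == positive and p == positive for g, p in pairs)
    actual_pos = sum(g == positive for g in gold)
    accuracy = correct / len(gold)
    # Recall is undefined with no positives; report 0.0 in that case.
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall


# Toy data: 0.1% "Yes", 99.9% "No" (hypothetical, for illustration only).
gold = ["Yes"] * 1 + ["No"] * 999
always_no = ["No"] * len(gold)  # the one-leaf "No" tree

accuracy, recall = accuracy_and_recall(gold, always_no)
print(f"accuracy={accuracy:.3f} recall={recall:.3f}")
# prints: accuracy=0.999 recall=0.000
```

The baseline is 99.9% accurate yet finds none of the "Yes" items, which is why accuracy alone is the wrong target here.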

It sounds like you want to find out how to set a WEKA decision-tree
builder to NOT prune any branches ... this question is better put to 
the WEKA mailing list wekalist at list.scms.waikato.ac.nz - see
https://list.scms.waikato.ac.nz/mailman/listinfo/wekalist to join
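
Independently of WEKA, one common way to approach a recall constraint is to rank items by a "Yes" score and lower the cut-off until every known "Yes" is kept. This is a different technique from an unpruned tree. The sketch below assumes each item already has such a score (for instance the proportion of "Yes" training items in its tree leaf); the function names and the toy scores are hypothetical.

```python
def recall_preserving_threshold(scores, labels, positive="Yes"):
    """Largest cut-off that still keeps every positive item in this sample."""
    positive_scores = [s for s, y in zip(scores, labels) if y == positive]
    if not positive_scores:
        raise ValueError("no positive examples to calibrate on")
    return min(positive_scores)


def prune(scores, threshold):
    """Indices of the items that survive pruning (score >= threshold)."""
    return [i for i, s in enumerate(scores) if s >= threshold]


# Hypothetical scores for ten items, two of which are "Yes".
scores = [0.02, 0.40, 0.05, 0.90, 0.01, 0.35, 0.03, 0.10, 0.04, 0.02]
labels = ["No", "Yes", "No", "Yes", "No", "No", "No", "No", "No", "No"]

threshold = recall_preserving_threshold(scores, labels)
kept = prune(scores, threshold)
removed_fraction = 1 - len(kept) / len(scores)
print(threshold, kept, removed_fraction)
```

Note that the 100% recall holds only on the data used to pick the cut-off; on unseen data a "Yes" item may still score below it, so a held-out set is needed to check how much recall is really retained.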

Eric Atwell, Leeds University

PS - please let me know if you find the answer - this looks like an
interesting class coursework exercise!


On Tue, 21 Apr 2009, Emmanuel Prochasson wrote:

> Dear all,
>
> I would like to build a decision tree (or any other relevant supervised
> classifier) on a data set containing 0.1% "Yes" and 99.9% "No", using
> several attributes (12 for now, but I have to tune that). I use Weka,
> which is totally awesome.
>
> My goal is to prune the search space for another application (i.e. remove,
> say, 80% of the data that are very unlikely to be "Yes"), which is why I'm
> trying to use a decision tree. Of course, some algorithms return a one-leaf
> tree tagged "No", with 99.9% precision, which is pretty accurate,
> but that ensures I will always discard all of my search space rather than
> prune it.
>
> My problem is: is there a way (an algorithm? software?) to build a tree
> that will maximise recall (all "Yes" elements tagged "Yes" by the
> algorithm)? I don't really care about precision (it's OK if many "No"
> elements are tagged "Yes" -- I can handle false positives).
>
> In other words, is there a way to build a decision tree under the
> constraint of 100% recall?
>
> I'm not sure I made myself clear, and I'm not sure there is a solution
> to my problem.
>
> Regards,
>
>

-- 
Eric Atwell,
  Senior Lecturer, Language research group, School of Computing,
  Faculty of Engineering, UNIVERSITY OF LEEDS, Leeds LS2 9JT, England
  TEL: 0113-3435430  FAX: 0113-3435468  WWW/email: google Eric Atwell

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


