[Corpora-List] Decision tree : maximise recall over precision
Emmanuel Prochasson
emmanuel.prochasson at univ-nantes.fr
Tue Apr 21 13:38:36 UTC 2009
Dear all,
I would like to build a decision tree (or any other suitable supervised
classifier) on a set of data containing 0.1% "Yes" and 99.9% "No", using
several attributes (12 for now, but I still have to tune that). I use
Weka, which is totally awesome.
My goal is to prune the search space for another application (i.e. remove,
say, the 80% of the data that are very unlikely to be "Yes"), which is why
I'm trying to use a decision tree. Of course, some algorithms return a
one-leaf tree tagged "No", which reaches 99.9% accuracy but ensures I will
always discard my entire search space rather than prune it.
My problem is: is there a way (an algorithm? a piece of software?) to build
a tree that maximises recall, i.e. one where all "Yes" elements are tagged
"Yes" by the classifier? I don't really care about precision (it's fine if
many "No" elements are tagged "Yes"; I can handle false positives).
In other words, is there a way to build a decision tree under the
constraint of 100% recall?
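The closest thing I can think of is cost-sensitive learning, i.e. wrapping
J48 in Weka's CostSensitiveClassifier and making a missed "Yes" much more
expensive than a false positive. Something along these lines, as a rough and
untested sketch (the file name, the cost values and the exact method names
are my own guesses and may need adjusting for your Weka version):

import weka.classifiers.CostMatrix;
import weka.classifiers.meta.CostSensitiveClassifier;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class RecallBiasedTree {
    public static void main(String[] args) throws Exception {
        // "candidates.arff" is a placeholder for the real data file.
        Instances data = new DataSource("candidates.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);

        // 2x2 cost matrix (rows = actual class, columns = predicted class).
        // Class 0 = "No", class 1 = "Yes"; the factor 1000 is arbitrary.
        // A missed "Yes" costs 1000 times more than a false positive, so
        // the tree is pushed towards recall on "Yes".
        CostMatrix costs = new CostMatrix(2);
        costs.setCell(0, 1, 1.0);     // actual "No" predicted "Yes": cheap
        costs.setCell(1, 0, 1000.0);  // actual "Yes" predicted "No": expensive

        CostSensitiveClassifier csc = new CostSensitiveClassifier();
        csc.setClassifier(new J48());   // J48 as the base decision tree
        csc.setCostMatrix(costs);
        csc.buildClassifier(data);

        System.out.println(csc);        // print the induced tree
    }
}

But this only biases the tree towards recall; it still doesn't guarantee the
100% recall I'm really after.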
I'm not sure I made myself clear, and I'm not sure there is a solution
to my problem.
Regards,
--
Emmanuel