[Corpora-List] repetitions

Alexander Osherenko osherenko at gmx.de
Wed Nov 29 10:36:43 UTC 2006


Hello!

I am trying to find out how to build my dataset so that it gives better performance, and which attributes to use. I've created a dataset of randomly chosen attributes in which an attribute can occur more than once. Attributes with the same name have identical values, so there is, e.g., an attribute that is repeated 3 times.

I took the dataset with the highest absolute number of correctly identified instances (classified with SMO), deleted all repeated attributes, and found that the result is not identical to the one before the deletion.
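To make the setup concrete, here is a minimal sketch in plain Python. The toy values and the helper names drop_repeated and dot are made up for illustration; this is not my actual dataset or the SMO code. It removes repeated attributes by name and compares the dot product of two instances before and after the deletion. Since SMO trains a support vector machine, and a linear kernel is computed from such dot products, this may be related to the difference, but that is only an assumption on my part.

```python
def drop_repeated(names, rows):
    """Keep only the first occurrence of each attribute name (hypothetical helper)."""
    seen = set()
    keep = []
    for i, name in enumerate(names):
        if name not in seen:
            seen.add(name)
            keep.append(i)
    kept_names = [names[i] for i in keep]
    kept_rows = [[row[i] for i in keep] for row in rows]
    return kept_names, kept_rows


def dot(u, v):
    """Plain dot product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


# Toy data (made up): attribute "a" occurs 3 times with identical values.
names = ["a", "b", "a", "a"]
rows = [
    [1.0, 2.0, 1.0, 1.0],
    [3.0, 1.0, 3.0, 3.0],
]

dedup_names, dedup_rows = drop_repeated(names, rows)

# With repetitions, "a" contributes three times to the dot product.
with_repeats = dot(rows[0], rows[1])
# After the deletion, "a" contributes only once.
without_repeats = dot(dedup_rows[0], dedup_rows[1])

print(dedup_names, with_repeats, without_repeats)
```

In this toy case the two instances look more similar with the repetitions than without, because the repeated attribute counts three times.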

What could be the reason?

Best

Alexander Osherenko


