[Corpora-List] robust statistics

John F. Sowa sowa at bestweb.net
Sat Mar 27 13:35:37 UTC 2010


RO>> There is a difference between "Robust" and "Nonparametric".
 > In general "Robust" is more apt to handle outliers.

JW> I would say your observation is correct, concerning the
 > nomenclature. However, methods falling into both classes are
 > appropriate (and probably underused) in circumstances when one
 > knows little about the nature/distribution of one's data. That,
 > to me, is the interesting thing. There seem to be plenty of
 > potential outlier-related "fixes" for inherently non-robust
 > (including parametric) methods. These are not so interesting,
 > and probably don't take us very far forward.

The word 'robust' suggests a value judgment that is not always
justified.  It usually means that different observers using a
given method M with a given set of data D will generate the
same or similar results.  That is very nice, if your goal is
to reach agreement among all the observers.

But if your goal is to get better translations of a language
or to answer questions that other systems can't answer,
then you might not want to derive the same results that
all the other systems derive.

When doing statistics, it's important to keep the ultimate
goal in mind.  As Richard Hamming said,

    "The goal of computing is insight, not numbers."

John Sowa




_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora