3,199 research outputs found
A systematic comparison of supervised classifiers
Pattern recognition techniques have been employed in a myriad of industrial,
medical, commercial and academic applications. To tackle such a diversity of
data, many techniques have been devised. However, despite the long tradition of
pattern recognition research, there is no technique that yields the best
classification in all scenarios. Therefore, the consideration of as many as
possible techniques presents itself as an fundamental practice in applications
aiming at high accuracy. Typical works comparing methods either emphasize the
performance of a given algorithm in validation tests or systematically compare
various algorithms, assuming that the practical use of these methods is done by
experts. In many occasions, however, researchers have to deal with their
practical classification tasks without an in-depth knowledge about the
underlying mechanisms behind parameters. Actually, the adequate choice of
classifiers and parameters alike in such practical circumstances constitutes a
long-standing problem and is the subject of the current paper. We carried out a
study on the performance of nine well-known classifiers implemented by the Weka
framework and compared the dependence of the accuracy with their configuration
parameter configurations. The analysis of performance with default parameters
revealed that the k-nearest neighbors method exceeds by a large margin the
other methods when high dimensional datasets are considered. When other
configuration of parameters were allowed, we found that it is possible to
improve the quality of SVM in more than 20% even if parameters are set
randomly. Taken together, the investigation conducted in this paper suggests
that, apart from the SVM implementation, Weka's default configuration of
parameters provides an performance close the one achieved with the optimal
configuration
Robust classification via MOM minimization
We present an extension of Vapnik's classical empirical risk minimizer (ERM)
where the empirical risk is replaced by a median-of-means (MOM) estimator, the
new estimators are called MOM minimizers. While ERM is sensitive to corruption
of the dataset for many classical loss functions used in classification, we
show that MOM minimizers behave well in theory, in the sense that it achieves
Vapnik's (slow) rates of convergence under weak assumptions: data are only
required to have a finite second moment and some outliers may also have
corrupted the dataset.
We propose an algorithm inspired by MOM minimizers. These algorithms can be
analyzed using arguments quite similar to those used for Stochastic Block
Gradient descent. As a proof of concept, we show how to modify a proof of
consistency for a descent algorithm to prove consistency of its MOM version. As
MOM algorithms perform a smart subsampling, our procedure can also help to
reduce substantially time computations and memory ressources when applied to
non linear algorithms.
These empirical performances are illustrated on both simulated and real
datasets
- …