7,355 research outputs found
Confusion Matrix Stability Bounds for Multiclass Classification
In this paper, we provide new theoretical results on the generalization
properties of learning algorithms for multiclass classification problems. The
originality of our work is that we propose to use the confusion matrix of a
classifier as a measure of its quality; our contribution is in the line of work
which attempts to set up and study the statistical properties of new evaluation
measures such as, e.g. ROC curves. In the confusion-based learning framework we
propose, we claim that a targetted objective is to minimize the size of the
confusion matrix C, measured through its operator norm ||C||. We derive
generalization bounds on the (size of the) confusion matrix in an extended
framework of uniform stability, adapted to the case of matrix valued loss.
Pivotal to our study is a very recent matrix concentration inequality that
generalizes McDiarmid's inequality. As an illustration of the relevance of our
theoretical results, we show how two SVM learning procedures can be proved to
be confusion-friendly. To the best of our knowledge, the present paper is the
first that focuses on the confusion matrix from a theoretical point of view
On multi-class learning through the minimization of the confusion matrix norm
In imbalanced multi-class classification problems, the misclassification rate
as an error measure may not be a relevant choice. Several methods have been
developed where the performance measure retained richer information than the
mere misclassification rate: misclassification costs, ROC-based information,
etc. Following this idea of dealing with alternate measures of performance, we
propose to address imbalanced classification problems by using a new measure to
be optimized: the norm of the confusion matrix. Indeed, recent results show
that using the norm of the confusion matrix as an error measure can be quite
interesting due to the fine-grain informations contained in the matrix,
especially in the case of imbalanced classes. Our first contribution then
consists in showing that optimizing criterion based on the confusion matrix
gives rise to a common background for cost-sensitive methods aimed at dealing
with imbalanced classes learning problems. As our second contribution, we
propose an extension of a recent multi-class boosting method --- namely
AdaBoost.MM --- to the imbalanced class problem, by greedily minimizing the
empirical norm of the confusion matrix. A theoretical analysis of the
properties of the proposed method is presented, while experimental results
illustrate the behavior of the algorithm and show the relevancy of the approach
compared to other methods
Multiclass Data Segmentation using Diffuse Interface Methods on Graphs
We present two graph-based algorithms for multiclass segmentation of
high-dimensional data. The algorithms use a diffuse interface model based on
the Ginzburg-Landau functional, related to total variation compressed sensing
and image processing. A multiclass extension is introduced using the Gibbs
simplex, with the functional's double-well potential modified to handle the
multiclass case. The first algorithm minimizes the functional using a convex
splitting numerical scheme. The second algorithm is a uses a graph adaptation
of the classical numerical Merriman-Bence-Osher (MBO) scheme, which alternates
between diffusion and thresholding. We demonstrate the performance of both
algorithms experimentally on synthetic data, grayscale and color images, and
several benchmark data sets such as MNIST, COIL and WebKB. We also make use of
fast numerical solvers for finding the eigenvectors and eigenvalues of the
graph Laplacian, and take advantage of the sparsity of the matrix. Experiments
indicate that the results are competitive with or better than the current
state-of-the-art multiclass segmentation algorithms.Comment: 14 page
Towards minimizing the energy of slack variables for binary classification
This paper presents a binary classification algorithm that is based on the minimization of the energy of slack variables, called the Mean Squared Slack (MSS). A novel kernel extension is proposed which includes the withholding of just a subset of input patterns that are misclassified during training. The later leads to a time and memory efficient system that converges in a few iterations. Two datasets are exploited for performance evaluation, namely the adult and the vertebral column dataset. Experimental results demonstrate the effectiveness of the proposed algorithm with respect to computation time and scalability. Accuracy is also high. In specific, it equals 84.951% for the adult dataset and 91.935%, for the vertebral column dataset, outperforming state-of-the-art methods. © 2012 EURASIP
- …