Confusion Matrix Stability Bounds for Multiclass Classification
In this paper, we provide new theoretical results on the generalization properties of learning algorithms for multiclass classification problems. The originality of our work is that we propose to use the confusion matrix of a classifier as a measure of its quality; our contribution is in the line of work that attempts to set up and study the statistical properties of new evaluation measures such as, e.g., ROC curves. In the confusion-based learning framework we propose, we claim that a targeted objective is to minimize the size of the confusion matrix C, measured through its operator norm ||C||. We derive generalization bounds on the (size of the) confusion matrix in an extended framework of uniform stability, adapted to the case of matrix-valued losses. Pivotal to our study is a very recent matrix concentration inequality that generalizes McDiarmid's inequality. As an illustration of the relevance of our theoretical results, we show how two SVM learning procedures can be proved to be confusion-friendly. To the best of our knowledge, the present paper is the first that focuses on the confusion matrix from a theoretical point of view.
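The quality measure described above can be sketched in a few lines. This is a minimal illustration, not the paper's exact construction: it assumes a row-normalized confusion matrix and takes the operator (spectral) norm of its off-diagonal error part, so that a perfect classifier scores zero.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Row-normalized confusion matrix: C[i, j] is the fraction of
    class-i examples predicted as class j."""
    C = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        C[t, p] += 1
    row_sums = C.sum(axis=1, keepdims=True)
    return C / np.maximum(row_sums, 1)

def misclassification_operator_norm(C):
    """Operator (spectral) norm of the off-diagonal part of C,
    which collects the per-class error rates."""
    E = C - np.diag(np.diag(C))  # keep only the error mass
    return float(np.linalg.norm(E, ord=2))

C = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], 3)
print(misclassification_operator_norm(C))  # 0.5 for this toy example
```

Minimizing this scalar penalizes classifiers whose errors concentrate on a few classes, which is the intuition behind taking a norm of the confusion matrix rather than an average error rate.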
Confusion-Based Online Learning and a Passive-Aggressive Scheme
This paper provides the first analysis, to the best of our knowledge, of online learning algorithms for multiclass problems when the confusion matrix is taken as a performance measure. The work builds upon recent and elegant results on noncommutative concentration inequalities, i.e., concentration inequalities that apply to matrices and, more precisely, to matrix martingales. We establish generalization bounds for online learning algorithms and show how the theoretical study motivates a new confusion-friendly learning procedure. This algorithm, called COPA (for COnfusion Passive-Aggressive), is a passive-aggressive learning algorithm; we show that the update equations for COPA can be computed analytically, hence there is no need to resort to any optimization package to implement it.
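For context on why a closed-form update matters here: the classic binary passive-aggressive algorithm (Crammer et al.) already admits an analytic step, shown below. COPA's confusion-aware multiclass update is different, but it shares this property of needing no numerical solver.

```python
import numpy as np

def pa_update(w, x, y, C=1.0):
    """One passive-aggressive (PA-I) step for binary classification.
    If the hinge loss is zero, the weights are untouched (passive);
    otherwise the smallest loss-correcting step is taken (aggressive)."""
    loss = max(0.0, 1.0 - y * np.dot(w, x))
    tau = min(C, loss / np.dot(x, x))  # closed-form step size
    return w + tau * y * x

w = np.zeros(2)
w = pa_update(w, np.array([1.0, 0.0]), 1)  # margin violated -> update
print(w)  # [1. 0.]
```

The step size tau is the solution of a small constrained optimization problem, but because that problem has an analytic answer, each online round costs only a few vector operations.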
Multi-label classification of a hydraulic system using Machine Learning methods
In this project, condition monitoring of a hydraulic system has been developed. The research consisted of a health categorization of the most relevant physical and non-physical elements of the system. The objective has been to use different ML models to classify the state of the elements in each cycle, so that the features of each cycle reveal whether an element of the system needs to be replaced, and also to determine the working efficiency of each element under study.
This research therefore follows a supervised learning analysis in which two types of classification are carried out. The first is a multiclass classification performed with different ML techniques that classify the categories of each label separately, determining, for each cycle, the state of the analyzed element. The second is a multilabel analysis, in which all labels are taken together and different evaluations are performed. The main objective of this part is to run tests with different ML models in order to identify the optimal one for this system, i.e., the algorithm that should be used to monitor this type of hydraulic system. In addition to these classification analyses, the correlation between the different data is assessed beforehand in order to verify relationships or coincidences that may be relevant.
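The two setups described above can be sketched as follows. The data here is synthetic (the real hydraulic-system features and health labels are placeholders), and the choice of random forests is illustrative, not the project's reported model selection.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))            # per-cycle sensor features (synthetic)
y_single = rng.integers(0, 3, size=200)  # one element's health class (0/1/2)

# Multiclass setup: one classifier per monitored element.
single = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y_single)

# Multilabel setup: all elements' states predicted jointly.
# RandomForestClassifier accepts a 2-D label matrix natively.
Y_multi = rng.integers(0, 2, size=(200, 4))  # 4 binary health labels
multi = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, Y_multi)
print(multi.predict(X[:1]).shape)  # one prediction per label: (1, 4)
```

The multiclass route gives one interpretable model per element, while the multilabel route lets a single model exploit correlations between elements; comparing the two is precisely the kind of test the project describes.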
Learning with a Wasserstein loss
Learning to predict multi-label outputs is challenging, but in many problems there is a natural metric on the outputs that can be used to improve predictions. In this paper we develop a loss function for multi-label learning based on the Wasserstein distance. The Wasserstein distance provides a natural notion of dissimilarity for probability measures. Although optimizing with respect to the exact Wasserstein distance is costly, recent work has described a regularized approximation that can be computed efficiently. We describe an efficient learning algorithm based on this regularization, as well as a novel extension of the Wasserstein distance from probability measures to unnormalized measures. We also describe a statistical learning bound for the loss. The Wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. We demonstrate this property on a real-data tag prediction problem, using the Yahoo Flickr Creative Commons dataset, outperforming a baseline that does not use the metric.
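The "regularized approximation" the abstract alludes to is, in the standard formulation, the entropy-regularized Wasserstein distance computed by Sinkhorn iterations (Cuturi). A minimal sketch, not the authors' exact training loss:

```python
import numpy as np

def sinkhorn_distance(a, b, M, reg=0.1, n_iter=200):
    """Entropy-regularized Wasserstein distance between histograms a
    and b under ground-metric cost matrix M, via Sinkhorn iterations."""
    K = np.exp(-M / reg)          # Gibbs kernel of the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iter):       # alternating marginal projections
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]  # (approximate) transport plan
    return float(np.sum(P * M))

# Ground cost: distance between label positions on a line.
M = np.abs(np.subtract.outer(np.arange(3.0), np.arange(3.0)))
a = np.array([1.0, 0.0, 0.0])  # all mass on label 0
b = np.array([0.0, 0.0, 1.0])  # all mass on label 2
print(sinkhorn_distance(a, b, M))  # close to the true cost, 2.0
```

Because the cost matrix encodes the output metric, predictions that put mass on labels near the true ones are penalized less than equally wrong but metrically distant ones, which is the smoothness property the abstract emphasizes.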
Deriving Matrix Concentration Inequalities from Kernel Couplings
This paper derives exponential tail bounds and polynomial moment inequalities
for the spectral norm deviation of a random matrix from its mean value. The
argument depends on a matrix extension of Stein's method of exchangeable pairs
for concentration of measure, as introduced by Chatterjee. Recent work of
Mackey et al. uses these techniques to analyze random matrices with additive
structure, while the enhancements in this paper cover a wider class of
matrix-valued random elements. In particular, these ideas lead to a bounded
differences inequality that applies to random matrices constructed from weakly
dependent random variables. The proofs require novel trace inequalities that
may be of independent interest.
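For orientation, the bounded differences inequality for random matrices mentioned above has, in one standard form (recalled here schematically; the sharp constant differs across versions in the literature, so it is written as an unspecified universal constant c), the following shape:

```latex
% Matrix bounded-differences inequality (schematic form).
% H(x_1, \dots, x_n) is a d x d Hermitian random matrix; replacing the
% k-th coordinate changes H by at most A_k in the semidefinite sense
% (H(x) - H(x'))^2 \preceq A_k^2.  Then for all t > 0,
\[
  \mathbb{P}\left\{ \lambda_{\max}\bigl(H - \mathbb{E}H\bigr) \ge t \right\}
  \;\le\; d \cdot \exp\!\left( -\frac{t^2}{c\,\sigma^2} \right),
  \qquad
  \sigma^2 = \Bigl\| \textstyle\sum_{k=1}^{n} A_k^2 \Bigr\|.
\]
```

Setting d = 1 recovers the scalar McDiarmid inequality; the contribution described in the abstract is extending such bounds to matrices built from weakly dependent, not just independent, variables.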