
    Confusion Matrix Stability Bounds for Multiclass Classification

    In this paper, we provide new theoretical results on the generalization properties of learning algorithms for multiclass classification problems. The originality of our work is that we propose to use the confusion matrix of a classifier as a measure of its quality; our contribution is in the line of work that attempts to set up and study the statistical properties of new evaluation measures such as, e.g., ROC curves. In the confusion-based learning framework we propose, we argue that a natural objective is to minimize the size of the confusion matrix C, measured through its operator norm ||C||. We derive generalization bounds on the (size of the) confusion matrix in an extended framework of uniform stability, adapted to the case of matrix-valued losses. Pivotal to our study is a very recent matrix concentration inequality that generalizes McDiarmid's inequality. As an illustration of the relevance of our theoretical results, we show how two SVM learning procedures can be proved to be confusion-friendly. To the best of our knowledge, the present paper is the first to study the confusion matrix from a theoretical point of view.
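    The operator norm here is the largest singular value of the confusion matrix. As a rough illustration of the quantity being controlled (not the paper's exact construction; the row-normalization and zeroed-diagonal conventions below are assumptions), this sketch computes the spectral norm of a confusion matrix so that the norm shrinks toward zero as the classifier improves.

```python
# Illustrative sketch only: spectral norm of a normalized confusion matrix.
# The normalization convention is an assumption, not the paper's definition.
import numpy as np
from sklearn.metrics import confusion_matrix

def confusion_operator_norm(y_true, y_pred, n_classes):
    """Row-normalized confusion matrix with the diagonal zeroed out,
    so that ||C|| = 0 for a perfect classifier."""
    C = confusion_matrix(y_true, y_pred, labels=list(range(n_classes))).astype(float)
    C /= np.maximum(C.sum(axis=1, keepdims=True), 1.0)  # per-class error rates
    np.fill_diagonal(C, 0.0)                            # keep only misclassification mass
    return np.linalg.norm(C, ord=2)                     # largest singular value

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(confusion_operator_norm(y_true, y_pred, n_classes=3))
```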

    Confusion-Based Online Learning and a Passive-Aggressive Scheme

    This paper provides, to the best of our knowledge, the first analysis of online learning algorithms for multiclass problems when the confusion matrix is taken as a performance measure. The work builds upon recent and elegant results on noncommutative concentration inequalities, i.e., concentration inequalities that apply to matrices and, more precisely, to matrix martingales. We establish generalization bounds for online learning algorithms and show how the theoretical study motivates a new confusion-friendly learning procedure. This algorithm, called COPA (for COnfusion Passive-Aggressive), is a passive-aggressive learning algorithm; we show that the update equations for COPA can be computed analytically, so there is no need to resort to an optimization package to implement it.
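    COPA's closed-form, confusion-aware update is derived in the paper itself; for orientation only, the sketch below shows the standard passive-aggressive step (PA-I, after Crammer et al.) that such schemes build on. This is the base scheme, not COPA.

```python
# Standard passive-aggressive (PA-I) update for binary classification,
# shown as background for COPA; COPA's own update is in the paper.
import numpy as np

def pa_update(w, x, y, C=1.0):
    """Stay 'passive' if the hinge loss is zero; otherwise move w just
    enough (step size capped by C) to satisfy the margin constraint."""
    loss = max(0.0, 1.0 - y * np.dot(w, x))   # hinge loss on (x, y)
    tau = min(C, loss / np.dot(x, x))         # closed-form step size
    return w + tau * y * x

w = np.zeros(3)
for x, y in [(np.array([1.0, 0.0, 1.0]), +1), (np.array([0.0, 1.0, 1.0]), -1)]:
    w = pa_update(w, x, y)
print(w)
```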

    Multi-label classification of a hydraulic system using Machine Learning methods

    In this project, a condition-monitoring system for a hydraulic system has been developed. The research consisted of a health categorization of the most relevant physical and non-physical elements of the system. The objective has been to use different ML models to classify the state of each element in every cycle, to determine from a cycle's features whether an element of the system needs to be replaced, and to estimate the working efficiency of each element under study. The research therefore follows a supervised learning approach in which two types of classification are carried out. The first is a multiclass classification performed with different ML techniques, which classifies the categories of each label separately, so that the state of the analyzed element is known for each cycle. A multilabel analysis then follows, in which all labels are taken together and several runs are performed. The main objective of this part is to test different ML models in order to determine which is optimal for this system and which algorithm should be used to monitor this type of hydraulic system. In addition to these classification analyses, the correlation between the different data is assessed beforehand in order to verify relationships or coincidences that may be relevant.
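    As a rough sketch of the two settings the abstract contrasts (per-label multiclass models versus one joint multi-label model), the following uses scikit-learn on synthetic stand-in data; the feature and label shapes are illustrative assumptions, not the actual hydraulic dataset.

```python
# Illustrative sketch: per-label multiclass models vs. one multi-output
# model. Synthetic data stands in for the hydraulic cycle features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 20))         # one row of features per cycle (assumed shape)
Y = rng.integers(0, 3, size=(600, 4))  # 4 components, 3 health states each (assumed)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, random_state=0)

# (a) multiclass: one classifier per component label
per_label = [RandomForestClassifier(random_state=0).fit(X_tr, Y_tr[:, j])
             for j in range(Y.shape[1])]
print(per_label[0].score(X_te, Y_te[:, 0]))  # accuracy on the first label

# (b) multi-label: all component labels predicted jointly
joint = MultiOutputClassifier(RandomForestClassifier(random_state=0)).fit(X_tr, Y_tr)
print(joint.score(X_te, Y_te))  # exact-match accuracy across all labels
```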

    Learning with a Wasserstein Loss

    Learning to predict multi-label outputs is challenging, but in many problems there is a natural metric on the outputs that can be used to improve predictions. In this paper we develop a loss function for multi-label learning based on the Wasserstein distance. The Wasserstein distance provides a natural notion of dissimilarity for probability measures. Although optimizing with respect to the exact Wasserstein distance is costly, recent work has described a regularized approximation that can be computed efficiently. We describe an efficient learning algorithm based on this regularization, as well as a novel extension of the Wasserstein distance from probability measures to unnormalized measures. We also describe a statistical learning bound for the loss. The Wasserstein loss can encourage smoothness of the predictions with respect to a chosen metric on the output space. We demonstrate this property on a real-data tag prediction problem, using the Yahoo Flickr Creative Commons dataset, outperforming a baseline that does not use the metric.
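    The regularized approximation referred to above is the entropic regularization computable by Sinkhorn iterations (Cuturi, 2013). The sketch below implements that approximation for two histograms; the ground metric and regularization strength are illustrative choices, and the paper's learning machinery around the loss is not reproduced.

```python
# Entropic-regularized optimal transport via Sinkhorn iterations; the
# ground cost M and epsilon are illustrative choices, not the paper's.
import numpy as np

def sinkhorn(a, b, M, eps=0.1, n_iter=200):
    """Regularized OT cost between histograms a and b with ground cost M."""
    K = np.exp(-M / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):              # alternating scaling updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]      # approximate transport plan
    return np.sum(P * M)

# 1-D example: cost = squared distance between bin centers
x = np.arange(5, dtype=float)
M = (x[:, None] - x[None, :]) ** 2
a = np.array([0.5, 0.5, 0.0, 0.0, 0.0])
b = np.array([0.0, 0.0, 0.0, 0.5, 0.5])
print(sinkhorn(a, b, M))  # close to the exact OT cost of 9.0
```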

    Deriving Matrix Concentration Inequalities from Kernel Couplings

    This paper derives exponential tail bounds and polynomial moment inequalities for the spectral-norm deviation of a random matrix from its mean value. The argument depends on a matrix extension of Stein's method of exchangeable pairs for concentration of measure, as introduced by Chatterjee. Recent work of Mackey et al. uses these techniques to analyze random matrices with additive structure, while the enhancements in this paper cover a wider class of matrix-valued random elements. In particular, these ideas lead to a bounded differences inequality that applies to random matrices constructed from weakly dependent random variables. The proofs require novel trace inequalities that may be of independent interest.
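    A headline result in this line of work is a matrix analogue of McDiarmid's bounded differences inequality. A commonly stated form (after Tropp; constants differ across references, so the version below is indicative rather than this paper's sharpest bound) is:

```latex
% Matrix bounded differences inequality (indicative form, after Tropp 2012;
% constants vary across references). H is a Hermitian d x d matrix function
% of independent X_1, ..., X_n such that changing the k-th coordinate moves
% H by at most A_k, i.e. (H(x) - H(x'))^2 <= A_k^2 in the semidefinite order.
\[
  \mathbb{P}\left\{ \lambda_{\max}\!\left( H - \mathbb{E} H \right) \ge t \right\}
  \;\le\; d \, e^{-t^{2}/(8\sigma^{2})},
  \qquad
  \sigma^{2} := \Bigl\| \sum_{k=1}^{n} A_k^{2} \Bigr\|.
\]
```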