6,664 research outputs found
On robustness properties of convex risk minimization methods for pattern recognition
The paper brings together methods from two disciplines: machine learning theory and robust statistics. Robustness properties of machine learning methods based on convex risk minimization are investigated for the problem of pattern recognition. Assumptions are given for the existence of the influence function of the classifiers and for bounds of the influence function. Kernel logistic regression, support vector machines, least squares and the AdaBoost loss function are treated as special cases. A sensitivity analysis of the support vector machine is given. --AdaBoost loss function,influence function,kernel logistic regression,robustness,sensitivity curve,statistical learning,support vector machine,total variation
On Robustness Properties of Convex Risk Minimization Methods for Pattern Recognition
The paper brings together methods from two disciplines: machine learning theory and robust statistics. Robustness properties of machine learning methods based on convex risk minimization are investigated for the problem of pattern recognition. Assumptions are given for the existence of the influence function of the classifiers and for bounds of the influence function. Kernel logistic regression, support vector machines, least squares and the AdaBoost loss function are treated as special cases. A sensitivity analysis of the support vector machine is given
Making Risk Minimization Tolerant to Label Noise
In many applications, the training data, from which one needs to learn a
classifier, is corrupted with label noise. Many standard algorithms such as SVM
perform poorly in presence of label noise. In this paper we investigate the
robustness of risk minimization to label noise. We prove a sufficient condition
on a loss function for the risk minimization under that loss to be tolerant to
uniform label noise. We show that the loss, sigmoid loss, ramp loss and
probit loss satisfy this condition though none of the standard convex loss
functions satisfy it. We also prove that, by choosing a sufficiently large
value of a parameter in the loss function, the sigmoid loss, ramp loss and
probit loss can be made tolerant to non-uniform label noise also if we can
assume the classes to be separable under noise-free data distribution. Through
extensive empirical studies, we show that risk minimization under the
loss, the sigmoid loss and the ramp loss has much better robustness to label
noise when compared to the SVM algorithm
Regression depth and support vector machine
The regression depth method (RDM) proposed by Rousseeuw and Hubert [RH99] plays an important role in the area of robust regression for a continuous response variable. Christmann and Rousseeuw [CR01] showed that RDM is also useful for the case of binary regression. Vapnik?s convex risk minimization principle [Vap98] has a dominating role in statistical machine learning theory. Important special cases are the support vector machine (SVM), [epsilon]-support vector regression and kernel logistic regression. In this paper connections between these methods from different disciplines are investigated for the case of pattern recognition. Some results concerning the robustness of the SVM and other kernel based methods are given. --
Robust Loss Functions under Label Noise for Deep Neural Networks
In many applications of classifier learning, training data suffers from label
noise. Deep networks are learned using huge training data where the problem of
noisy labels is particularly relevant. The current techniques proposed for
learning deep networks under label noise focus on modifying the network
architecture and on algorithms for estimating true labels from noisy labels. An
alternate approach would be to look for loss functions that are inherently
noise-tolerant. For binary classification there exist theoretical results on
loss functions that are robust to label noise. In this paper, we provide some
sufficient conditions on a loss function so that risk minimization under that
loss function would be inherently tolerant to label noise for multiclass
classification problems. These results generalize the existing results on
noise-tolerant loss functions for binary classification. We study some of the
widely used loss functions in deep networks and show that the loss function
based on mean absolute value of error is inherently robust to label noise. Thus
standard back propagation is enough to learn the true classifier even under
label noise. Through experiments, we illustrate the robustness of risk
minimization with such loss functions for learning neural networks.Comment: Appeared in AAAI 201
Robust Learning from Bites for Data Mining
Some methods from statistical machine learning and from robust statistics have two drawbacks. Firstly, they are computer-intensive such that they can hardly be used for massive data sets, say with millions of data points. Secondly, robust and non-parametric confidence intervals for the predictions according to the fitted models are often unknown. Here, we propose a simple but general method to overcome these problems in the context of huge data sets. The method is scalable to the memory of the computer, can be distributed on several processors if available, and can help to reduce the computation time substantially. Our main focus is on robust general support vector machines (SVM) based on minimizing regularized risks. The method offers distribution-free confidence intervals for the median of the predictions. The approach can also be helpful to fit robust estimators in parametric models for huge data sets. --Breakdown point,convex risk minimization,data mining,distributed computing,influence function,logistic regression,robustness,scalability
- …