
    Noise Tolerance under Risk Minimization

    In this paper we explore noise-tolerant learning of classifiers. We formulate the problem as follows. We assume that there is an unobservable training set which is noise-free. The actual training set given to the learning algorithm is obtained from this ideal data set by corrupting the class label of each example. The probability that the class label of an example is corrupted is a function of the feature vector of the example; this accounts for most kinds of noisy data one encounters in practice. We say that a learning method is noise tolerant if the classifiers learnt with the ideal noise-free data and with the noisy data both have the same classification accuracy on the noise-free data. In this paper we analyze the noise tolerance properties of risk minimization (under different loss functions), which is a generic method for learning classifiers. We show that risk minimization under the 0-1 loss function has impressive noise tolerance properties, and that under the squared error loss it is tolerant only to uniform noise; risk minimization under other loss functions is not noise tolerant. We conclude the paper with some discussion on the implications of these theoretical results.
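    As a rough illustration of the noise-tolerance definition above (a toy sketch, not the authors' experiments: the data, the feature-dependent flip rate, and the random-search minimizer are all invented), the following compares clean-trained and noisy-trained empirical risk minimizers under the 0-1 and squared error losses:

```python
# Toy sketch of the noise-tolerance definition: learn from clean vs.
# noise-corrupted labels, then evaluate both classifiers on clean data.
import numpy as np

rng = np.random.default_rng(0)

# Separable toy data: labels in {-1, +1} from a linear rule with a bias term.
n = 2000
X = np.hstack([rng.normal(size=(n, 2)), np.ones((n, 1))])  # last column = bias
y_clean = np.sign(X @ np.array([1.5, -1.0, 0.2]))

# Non-uniform label noise: the flip probability depends on the feature
# vector but stays strictly below 1/2, matching the setting in the abstract.
eta = 0.4 / (1.0 + np.exp(-X[:, 0]))
y_noisy = y_clean * np.where(rng.random(n) < eta, -1, 1)

def zero_one_risk(w, y):
    return np.mean(np.sign(X @ w) != y)

def squared_risk(w, y):
    return np.mean((X @ w - y) ** 2)

# Crude empirical risk minimization by random search over linear classifiers.
W = rng.normal(size=(5000, 3))

for name, risk in [("0-1", zero_one_risk), ("squared", squared_risk)]:
    w_clean = min(W, key=lambda w: risk(w, y_clean))
    w_noisy = min(W, key=lambda w: risk(w, y_noisy))
    # Noise tolerance: the two minimizers should score alike on clean data.
    print(f"{name:8s} clean-trained acc: {1 - zero_one_risk(w_clean, y_clean):.3f}, "
          f"noisy-trained acc: {1 - zero_one_risk(w_noisy, y_clean):.3f}")
```

    Under the paper's results one would expect the 0-1 minimizer's clean accuracy to be essentially unaffected by this kind of noise, while the squared-error minimizer carries no such guarantee once the noise is non-uniform.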

    Making Risk Minimization Tolerant to Label Noise

    In many applications, the training data from which one needs to learn a classifier is corrupted with label noise. Many standard algorithms, such as the SVM, perform poorly in the presence of label noise. In this paper we investigate the robustness of risk minimization to label noise. We prove a sufficient condition on a loss function for risk minimization under that loss to be tolerant to uniform label noise. We show that the 0-1 loss, sigmoid loss, ramp loss and probit loss satisfy this condition, though none of the standard convex loss functions do. We also prove that, by choosing a sufficiently large value of a parameter in the loss function, the sigmoid loss, ramp loss and probit loss can be made tolerant to non-uniform label noise as well, if we can assume the classes to be separable under the noise-free data distribution. Through extensive empirical studies, we show that risk minimization under the 0-1 loss, the sigmoid loss and the ramp loss has much better robustness to label noise than the SVM algorithm.
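    The sufficient condition here is a symmetry property: l(f(x), +1) + l(f(x), -1) should be constant. A quick numerical check (an illustrative sketch, not the paper's proof; the loss parameters are chosen arbitrarily) shows the sigmoid and ramp losses satisfy it while the convex hinge loss does not:

```python
# Numerical check of the symmetry condition for uniform-noise tolerance:
# l(z, +1) + l(z, -1) should be constant in the margin z.
import numpy as np

z = np.linspace(-5.0, 5.0, 201)  # classifier outputs f(x)

def sigmoid_loss(z, y):
    # 1 / (1 + exp(beta * y * z)) with beta = 1 chosen arbitrarily here.
    return 1.0 / (1.0 + np.exp(y * z))

def ramp_loss(z, y):
    # Hinge loss truncated at 2; the truncation makes it bounded.
    return np.clip(1.0 - y * z, 0.0, 2.0)

def hinge_loss(z, y):
    # Standard (convex, unbounded) SVM loss, for contrast.
    return np.maximum(0.0, 1.0 - y * z)

for name, loss in [("sigmoid", sigmoid_loss),
                   ("ramp", ramp_loss),
                   ("hinge", hinge_loss)]:
    s = loss(z, +1) + loss(z, -1)
    verdict = "constant (symmetric)" if np.ptp(s) < 1e-9 else "not constant"
    print(f"{name:8s} l(z,+1) + l(z,-1): {verdict}")
```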

    Robust Loss Functions under Label Noise for Deep Neural Networks

    In many applications of classifier learning, training data suffers from label noise. Deep networks are learned using huge training data, where the problem of noisy labels is particularly relevant. Current techniques proposed for learning deep networks under label noise focus on modifying the network architecture and on algorithms for estimating true labels from noisy labels. An alternative approach is to look for loss functions that are inherently noise-tolerant. For binary classification there exist theoretical results on loss functions that are robust to label noise. In this paper, we provide some sufficient conditions on a loss function so that risk minimization under that loss function would be inherently tolerant to label noise for multiclass classification problems. These results generalize the existing results on noise-tolerant loss functions for binary classification. We study some of the widely used loss functions in deep networks and show that the loss function based on the mean absolute value of error is inherently robust to label noise. Thus standard backpropagation is enough to learn the true classifier even under label noise. Through experiments, we illustrate the robustness of risk minimization with such loss functions for learning neural networks. Comment: Appeared in AAAI 2017.
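    In the multiclass case the symmetry condition takes the form sum over labels j of L(f(x), j) = constant. A small numerical sketch (arbitrary logits, not the paper's experiments) showing that the mean absolute error over softmax outputs satisfies it while categorical cross-entropy does not:

```python
# Multiclass symmetry check: sum_j L(f(x), j) should be constant over x.
# MAE over softmax outputs satisfies it; cross-entropy does not.
import numpy as np

rng = np.random.default_rng(1)
k = 10
logits = rng.normal(size=(5, k))                      # arbitrary network outputs
p = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax

eye = np.eye(k)
# MAE between the softmax vector and the one-hot label e_j is 2 * (1 - p_j).
mae = np.array([[np.abs(eye[j] - pi).sum() for j in range(k)] for pi in p])
cce = -np.log(p)                                      # cross-entropy per label

print("MAE sums over labels:", mae.sum(axis=1))       # all equal 2*(k-1) = 18
print("CCE sums over labels:", cce.sum(axis=1))       # vary with the logits
```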

    An Efficient Approach for Computing Optimal Low-Rank Regularized Inverse Matrices

    Standard regularization methods that are used to compute solutions to ill-posed inverse problems require knowledge of the forward model. In many real-life applications, the forward model is not known, but training data is readily available. In this paper, we develop a new framework that uses training data, as a substitute for knowledge of the forward model, to compute an optimal low-rank regularized inverse matrix directly, allowing for very fast computation of a regularized solution. We consider a statistical framework based on Bayes and empirical Bayes risk minimization to analyze theoretical properties of the problem. We propose an efficient rank update approach for computing an optimal low-rank regularized inverse matrix for various error measures. Numerical experiments demonstrate the benefits and potential applications of our approach to problems in signal and image processing. Comment: 24 pages, 11 figures.
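    As a crude stand-in for the paper's rank-update algorithm (this is a simple surrogate, not the authors' optimal update; all problem sizes and the noise level are invented), the sketch below fits the unconstrained empirical-risk reconstruction map from training pairs and truncates its SVD to rank r:

```python
# Learn a reconstruction matrix from training pairs (noisy observation b_i,
# true signal x_i) without using the forward model directly, then truncate
# its SVD to obtain a low-rank regularized inverse.
import numpy as np

rng = np.random.default_rng(2)
n, m, n_train, r = 50, 60, 500, 10

A = rng.normal(size=(m, n)) / np.sqrt(m)            # forward model (unknown to the method)
X = rng.normal(size=(n, n_train))                   # training signals
B = A @ X + 0.05 * rng.normal(size=(m, n_train))    # noisy training observations

# Unconstrained least-squares minimizer of ||Z B - X||_F over Z, then SVD
# truncation as a simple low-rank surrogate for the optimal constrained Z.
Z = X @ np.linalg.pinv(B)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
Z_r = (U[:, :r] * s[:r]) @ Vt[:r]

# Apply the learned low-rank regularized inverse to a fresh observation.
x_new = rng.normal(size=n)
b_new = A @ x_new + 0.05 * rng.normal(size=m)
rel_err = np.linalg.norm(Z_r @ b_new - x_new) / np.linalg.norm(x_new)
print(f"relative reconstruction error: {rel_err:.3f}")
```

    Once Z_r is computed, each new reconstruction is a single matrix-vector product, which is what makes the precomputed regularized inverse fast to apply.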

    Analysis-of-marginal-Tail-Means (ATM): a robust method for discrete black-box optimization

    We present a new method, called Analysis-of-marginal-Tail-Means (ATM), for effective robust optimization of discrete black-box problems. ATM has important applications to many real-world engineering problems (e.g., manufacturing optimization, product design, molecular engineering), where the objective to optimize is black-box and expensive, and the design space is inherently discrete. One weakness of existing methods is that they are not robust: these methods perform well under certain assumptions, but yield poor results when such assumptions (which are difficult to verify in black-box problems) are violated. ATM addresses this via the use of marginal tail means for optimization, which combines rank-based and model-based methods. The trade-off between rank- and model-based optimization is tuned by first identifying important main effects and interactions, then finding a good compromise which best exploits additive structure. By adaptively tuning this trade-off from data, ATM provides improved robust optimization over existing methods, particularly in problems with (i) a large number of factors, (ii) unordered factors, or (iii) experimental noise. We demonstrate the effectiveness of ATM in simulations and in two real-world engineering problems: the first on robust parameter design of a circular piston, and the second on product family design of a thermistor network.
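    As a simplified caricature of the marginal tail-mean idea (not the full ATM procedure, which adaptively tunes the rank-/model-based trade-off; the objective and design below are invented for illustration): for each factor, average the best alpha-fraction of observed objective values at each level, then recommend the level with the best tail mean.

```python
# Marginal tail means on a toy discrete black-box problem:
# 4 factors with 3 levels each, noisy objective, minimization.
import numpy as np
from itertools import product

rng = np.random.default_rng(3)

def black_box(x):
    # Hypothetical noisy discrete objective, invented for this sketch.
    return (x[0] - 2) ** 2 + abs(x[1] - x[2]) + 0.5 * x[3] + 0.3 * rng.normal()

levels = [0, 1, 2]
design = [list(x) for x in product(levels, repeat=4)]  # full 3^4 design
ys = np.array([black_box(x) for x in design])

alpha = 0.25  # tail fraction (we minimize, so the tail is the lowest values)
recommended = []
for factor in range(4):
    tail_means = []
    for lv in levels:
        y_lv = np.sort(ys[[x[factor] == lv for x in design]])
        k = max(1, int(np.ceil(alpha * len(y_lv))))
        tail_means.append(y_lv[:k].mean())  # mean of the best-performing tail
    recommended.append(levels[int(np.argmin(tail_means))])

print("tail-mean recommended setting:", recommended)
```

    Taking the tail mean rather than the full marginal mean is what lends the rank-based flavor: the recommendation depends on where the best runs occur, not on the overall average, which is more forgiving of violated model assumptions.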