Search CORE

131,156 research outputs found

Robust Loss Functions under Label Noise for Deep Neural Networks

Author: Ghosh Aritra
Kumar Himanshu
Sastry P. S.
Publication venue
Publication date: 13/02/2017
Field of study

In many applications of classifier learning, training data suffers from label noise. Deep networks are learned using huge training data where the problem of noisy labels is particularly relevant. The current techniques proposed for learning deep networks under label noise focus on modifying the network architecture and on algorithms for estimating true labels from noisy labels. An alternate approach would be to look for loss functions that are inherently noise-tolerant. For binary classification there exist theoretical results on loss functions that are robust to label noise. In this paper, we provide some sufficient conditions on a loss function so that risk minimization under that loss function would be inherently tolerant to label noise for multiclass classification problems. These results generalize the existing results on noise-tolerant loss functions for binary classification. We study some of the widely used loss functions in deep networks and show that the loss function based on mean absolute value of error is inherently robust to label noise. Thus standard back propagation is enough to learn the true classifier even under label noise. Through experiments, we illustrate the robustness of risk minimization with such loss functions for learning neural networks.Comment: Appeared in AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

On loss functions for deep neural networks in classification

Author: Czarnecki Wojciech
Janocha Katarzyna
Publication venue: 'Uniwersytet Jagiellonski - Wydawnictwo Uniwersytetu Jagiellonskiego'
Publication date: 01/01/2016
Field of study

Deep neural networks are currently among the most commonly used classifiers. Despite easily achieving very good performance, one of the best selling points of these models is their modular design – one can conveniently adapt their architecture to specific needs, change connectivity patterns, attach specialised layers, experiment with a large amount of activation functions, normalisation schemes and many others. While one can find impressively wide spread of various configurations of almost every aspect of the deep nets, one element is, in authors’ opinion, underrepresented – while solving classification problems, vast majority of papers and applications simply use log loss. In this paper we try to investigate how particular choices of loss functions affect deep models and their learning dynamics, as well as resulting classifiers robustness to various effects. We perform experiments on classical datasets, as well as provide some additional, theoretical insights into the problem. In particular we show that L1 and L2 losses are, quite surprisingly, justified classification objectives for deep nets, by providing probabilistic interpretation in terms of expected misclassification. We also introduce two losses which are not typically used as deep nets objectives and show that they are viable alternatives to the existing ones

Jagiellonian Univeristy Repository

On Loss Functions for Deep Neural Networks in Classification

Author
Publication venue: 'Uniwersytet Jagiellonski - Wydawnictwo Uniwersytetu Jagiellonskiego'
Publication date: 01/01/2017
Field of study

Crossref