IMAE for Noise-Robust Learning: Mean Absolute Error Does Not Treat Examples Equally and Gradient Magnitude's Variance Matters
In this work, we study robust deep learning against abnormal training data
from the perspective of the example weighting built into empirical loss
functions, i.e., the gradient magnitude with respect to logits, an angle that
has not been thoroughly studied so far. This yields two key findings: (1) Mean
Absolute Error (MAE) Does Not Treat Examples Equally. We present new
observations and an insightful analysis of MAE, which has been theoretically
proven to be noise-robust. First, we reveal its underfitting problem in
practice. Second, we show that MAE's noise-robustness stems from emphasising
uncertain examples rather than treating training samples equally, as claimed
in prior work. (2) The Variance of Gradient Magnitude Matters. We propose an effective
and simple solution to enhance MAE's fitting ability while preserving its
noise-robustness. Without changing MAE's overall weighting scheme, i.e., what
examples get higher weights, we simply change its weighting variance
non-linearly so that the impact ratio between two examples are adjusted. Our
solution is termed Improved MAE (IMAE). We prove IMAE's effectiveness using
extensive experiments: image classification under clean labels, synthetic label
noise, and real-world unknown noise. We conclude that IMAE is superior to
categorical cross entropy (CCE), the most popular loss for training DNNs.
Comment: Updated version. Code:
https://github.com/XinshaoAmosWang/Improving-Mean-Absolute-Error-against-CCE.
Please feel free to get in touch for discussions or implementation problems.
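As a concrete illustration of the abstract's idea, the sketch below reweights a per-example MAE loss by a non-linear (exponential) transform of MAE's own gradient-magnitude weighting, which for a softmax output is proportional to p_y(1 - p_y). This is a minimal PyTorch reading of the abstract, not the authors' released implementation (see the repository above); the temperature T, the batch-level weight normalisation, and the name imae_loss are assumptions.

```python
import torch
import torch.nn.functional as F

def imae_loss(logits, targets, T=8.0):
    """Hypothetical IMAE-style loss: keep MAE's weighting order over
    examples, but stretch the weighting variance non-linearly via T."""
    probs = F.softmax(logits, dim=1)
    p_y = probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # prob of true class

    # MAE's implicit per-example weight is its gradient magnitude w.r.t.
    # the logits, proportional to p_y * (1 - p_y): uncertain examples
    # (p_y near 0.5) are emphasised, as the abstract observes.
    mae_weight = p_y * (1.0 - p_y)

    # Exponential rescaling adjusts the impact ratio between examples
    # without changing which examples get higher weights. Detached so the
    # weights steer the gradients but receive none themselves.
    w = torch.exp(T * mae_weight).detach()
    w = w / w.sum()  # normalise over the batch (an assumption)

    per_example_mae = 1.0 - p_y  # MAE against one-hot targets, up to a factor of 2
    return (w * per_example_mae).sum()
```

Under this sketch, T = 0 recovers uniform batch weighting (plain MAE up to scale), while larger T widens the gap between how much high- and low-weight examples contribute.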
Robust Loss Functions under Label Noise for Deep Neural Networks
In many applications of classifier learning, training data suffers from label
noise. Deep networks are learned from huge training datasets, where the problem of
noisy labels is particularly relevant. The current techniques proposed for
learning deep networks under label noise focus on modifying the network
architecture and on algorithms for estimating true labels from noisy labels. An
alternative approach would be to look for loss functions that are inherently
noise-tolerant. For binary classification there exist theoretical results on
loss functions that are robust to label noise. In this paper, we provide some
sufficient conditions on a loss function so that risk minimization under that
loss function would be inherently tolerant to label noise for multiclass
classification problems. These results generalize the existing results on
noise-tolerant loss functions for binary classification. We study some of the
widely used loss functions in deep networks and show that the loss function
based on the mean absolute value of error is inherently robust to label noise.
Thus, standard back-propagation is enough to learn the true classifier even
under label noise. Through experiments, we illustrate the robustness of risk
minimization with such loss functions for learning neural networks.
Comment: Appeared in AAAI 2017
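The key sufficient condition in this line of work is symmetry: a loss L is noise-tolerant for a K-class problem if the sum of L(f(x), j) over all classes j is the same constant for every prediction f(x). The snippet below numerically checks that MAE satisfies this (the sum is always 2(K - 1)) while categorical cross entropy does not; it is an illustration of the stated condition, not code from the paper.

```python
import numpy as np

def mae(p, j):
    """MAE between a softmax output p and the one-hot vector for class j."""
    e = np.zeros_like(p)
    e[j] = 1.0
    return np.abs(p - e).sum()

def cce(p, j):
    """Categorical cross entropy against class j."""
    return -np.log(p[j])

rng = np.random.default_rng(0)
K = 5
for _ in range(3):
    p = rng.dirichlet(np.ones(K))  # a random softmax-style prediction
    mae_sum = sum(mae(p, j) for j in range(K))
    cce_sum = sum(cce(p, j) for j in range(K))
    # MAE sums to the constant 2*(K-1) for any p, so it is symmetric and
    # hence noise-tolerant by the paper's condition; CCE's sum varies with p.
    print(f"sum_j MAE = {mae_sum:.4f} (constant {2 * (K - 1)}), "
          f"sum_j CCE = {cce_sum:.4f}")
```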
Learning with Symmetric Label Noise: The Importance of Being Unhinged
Convex potential minimisation is the de facto approach to binary
classification. However, Long and Servedio [2010] proved that under symmetric
label noise (SLN), minimisation of any convex potential over a linear function
class can result in classification performance equivalent to random guessing.
This ostensibly shows that convex losses are not SLN-robust. In this paper, we
propose a convex, classification-calibrated loss and prove that it is
SLN-robust. The loss avoids the Long and Servedio [2010] result by virtue of
being negatively unbounded. The loss is a modification of the hinge loss, where
one does not clamp at zero; hence, we call it the unhinged loss. We show that
the optimal unhinged solution is equivalent to that of a strongly regularised
SVM, and is the limiting solution for any convex potential; this implies that
strong l2 regularisation makes most standard learners SLN-robust. Experiments
confirm the SLN-robustness of the unhinged loss.
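The loss itself is simple to state: take the hinge loss max(0, 1 - v) on the margin v = y * f(x) and drop the clamp at zero, leaving 1 - v. A minimal comparison (illustrative code, not from the paper):

```python
import numpy as np

def hinge_loss(margin):
    # Standard hinge: clamped at zero, so confidently correct predictions
    # stop contributing to the loss.
    return np.maximum(0.0, 1.0 - margin)

def unhinged_loss(margin):
    # Unhinged: no clamp. The loss is linear in the margin and negatively
    # unbounded, which is how it sidesteps the Long and Servedio [2010]
    # result on convex potentials under symmetric label noise.
    return 1.0 - margin

margins = np.array([-2.0, 0.0, 0.5, 1.0, 3.0])  # values of y * f(x)
print(hinge_loss(margins))     # [3.  1.  0.5 0.  0. ]
print(unhinged_loss(margins))  # [ 3.   1.   0.5  0.  -2. ]
```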