Robust Loss Functions under Label Noise for Deep Neural Networks
In many applications of classifier learning, training data suffers from label
noise. Deep networks are learned using huge training data where the problem of
noisy labels is particularly relevant. The current techniques proposed for
learning deep networks under label noise focus on modifying the network
architecture and on algorithms for estimating true labels from noisy labels. An
alternate approach would be to look for loss functions that are inherently
noise-tolerant. For binary classification there exist theoretical results on
loss functions that are robust to label noise. In this paper, we provide some
sufficient conditions on a loss function so that risk minimization under that
loss function would be inherently tolerant to label noise for multiclass
classification problems. These results generalize the existing results on
noise-tolerant loss functions for binary classification. We study some of the
widely used loss functions in deep networks and show that the loss function
based on mean absolute value of error is inherently robust to label noise. Thus
standard back propagation is enough to learn the true classifier even under
label noise. Through experiments, we illustrate the robustness of risk
minimization with such loss functions for learning neural networks.
Comment: Appeared in AAAI 201
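The abstract above does not spell out its sufficient conditions. A minimal sketch, assuming the condition in question is symmetry (the loss summed over all possible labels is the same constant for every prediction), checks numerically that MAE on softmax outputs has this property while categorical cross-entropy (CCE) does not. The function names and example logits are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mae_loss(probs, label):
    # L1 distance between the one-hot target and the predicted distribution.
    return sum(abs((1.0 if k == label else 0.0) - p) for k, p in enumerate(probs))

def cce_loss(probs, label):
    # Categorical cross-entropy for a single example.
    return -math.log(probs[label])

def loss_sum_over_labels(loss_fn, probs):
    # Assumed symmetry check: add up the loss for every possible label.
    return sum(loss_fn(probs, k) for k in range(len(probs)))

# Hypothetical logits for a 3-class problem.
logit_sets = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0], [5.0, -3.0, 1.0]]
mae_sums = [loss_sum_over_labels(mae_loss, softmax(z)) for z in logit_sets]
cce_sums = [loss_sum_over_labels(cce_loss, softmax(z)) for z in logit_sets]

print("MAE sums:", mae_sums)  # the same value, 2 * (K - 1), for every prediction
print("CCE sums:", cce_sums)  # varies with the prediction
```

With K classes, the MAE loss for label k is 2(1 - p_k), so the sum over labels is 2(K - 1) regardless of the prediction. The CCE sum grows without bound as any class probability approaches zero.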
A Semi-Supervised Two-Stage Approach to Learning from Noisy Labels
The recent success of deep neural networks is powered in part by large-scale
well-labeled training data. However, it is a daunting task to laboriously
annotate an ImageNet-like dataset. In contrast, it is fairly convenient,
fast, and cheap to collect training images from the Web along with their noisy
labels. This signifies the need for alternative approaches to training deep
neural networks using such noisy labels. Existing methods tackling this problem
either try to identify and correct the wrong labels or reweight the data terms
in the loss function according to the inferred noise rates. Both strategies
inevitably incur errors for some of the data points. In this paper, we contend
that it is actually better to ignore the labels of some of the data points than
to keep them if the labels are incorrect, especially when the noise rate is
high. After all, the wrong labels could mislead a neural network to a bad local
optimum. We suggest a two-stage framework for learning from noisy labels.
In the first stage, we identify a small portion of images from the noisy
training set whose labels are correct with high probability. The noisy
labels of the other images are ignored. In the second stage, we train a deep
neural network in a semi-supervised manner. This framework effectively takes
advantage of the whole training set while using only the portion of its labels
that are most likely correct. Experiments on three datasets verify the
effectiveness of our approach, especially when the noise rate is high.
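The two-stage idea can be sketched as follows. The abstract does not say how the likely-correct subset is identified, so this sketch assumes a hypothetical criterion: rank examples by the probability that an auxiliary model assigns to their noisy label and keep the top fraction. The function name, the `keep_fraction` parameter, and the toy data are all illustrative assumptions, not the authors' method.

```python
def split_by_confidence(noisy_labels, predicted_probs, keep_fraction=0.2):
    """Stage 1 (assumed criterion): trust the labels an auxiliary model agrees with most."""
    # Score each example by the probability assigned to its given (noisy) label.
    scores = [probs[label] for probs, label in zip(predicted_probs, noisy_labels)]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_keep = max(1, int(round(keep_fraction * len(order))))
    trusted = sorted(order[:n_keep])
    unlabeled = sorted(order[n_keep:])
    return trusted, unlabeled

def build_semi_supervised_labels(noisy_labels, trusted):
    """Stage 2 input: keep trusted labels, mark every other example as unlabeled (None)."""
    trusted_set = set(trusted)
    return [label if i in trusted_set else None for i, label in enumerate(noisy_labels)]

# Toy data: five examples, two classes, probabilities from a hypothetical model.
noisy_labels = [0, 1, 1, 0, 0]
predicted_probs = [
    [0.90, 0.10],
    [0.60, 0.40],
    [0.20, 0.80],
    [0.55, 0.45],
    [0.30, 0.70],
]

trusted, unlabeled = split_by_confidence(noisy_labels, predicted_probs, keep_fraction=0.4)
semi_labels = build_semi_supervised_labels(noisy_labels, trusted)
print(trusted, unlabeled, semi_labels)
```

The resulting label list, with `None` for ignored labels, is the kind of input any semi-supervised trainer could consume. All images stay in the training set, but only the trusted labels are used.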
IMAE for Noise-Robust Learning: Mean Absolute Error Does Not Treat Examples Equally and Gradient Magnitude's Variance Matters
In this work, we study robust deep learning against abnormal training data
from the perspective of example weighting built in empirical loss functions,
i.e., gradient magnitude with respect to logits, an angle that has not been
thoroughly studied so far. Consequently, we have two key findings: (1) Mean
Absolute Error (MAE) Does Not Treat Examples Equally. We present new
observations and insightful analysis about MAE, which has been theoretically
proved to be noise-robust. First, we reveal its underfitting problem in practice.
Second, our analysis shows that MAE's noise-robustness comes from emphasising
uncertain examples instead of treating training samples equally, as claimed in prior
work. (2) The Variance of Gradient Magnitude Matters. We propose an effective
and simple solution to enhance MAE's fitting ability while preserving its
noise-robustness. Without changing MAE's overall weighting scheme, i.e., what
examples get higher weights, we simply change its weighting variance
non-linearly so that the impact ratio between two examples is adjusted. Our
solution is termed Improved MAE (IMAE). We demonstrate IMAE's effectiveness using
extensive experiments: image classification under clean labels, synthetic label
noise, and real-world unknown noise. We conclude IMAE is superior to CCE, the
most popular loss for training DNNs.
Comment: Updated Version.
Code:
https://github.com/XinshaoAmosWang/Improving-Mean-Absolute-Error-against-CCE
Please feel free to contact the authors for discussions or implementation problems.
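The example-weighting view described in this abstract can be illustrated with a small numerical sketch. It treats the L1 norm of the loss gradient with respect to the logits as an example's weight and estimates it by finite differences for MAE and CCE. The two-class setup, the function names, and the chosen confidence values are assumptions made for illustration. The IMAE transformation itself is not specified in the abstract and is not implemented here.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mae(logits, label):
    # MAE against a one-hot target reduces to 2 * (1 - p_label).
    return 2.0 * (1.0 - softmax(logits)[label])

def cce(logits, label):
    return -math.log(softmax(logits)[label])

def grad_l1_norm(loss_fn, logits, label, eps=1e-6):
    # Central finite differences; the L1 norm of the gradient w.r.t. the logits
    # is used here as the "weight" an example receives.
    total = 0.0
    for j in range(len(logits)):
        up = list(logits)
        down = list(logits)
        up[j] += eps
        down[j] -= eps
        total += abs(loss_fn(up, label) - loss_fn(down, label)) / (2 * eps)
    return total

def logits_with_confidence(p):
    # Two-class logits whose softmax gives probability p to class 0.
    return [math.log(p / (1.0 - p)), 0.0]

confidences = [0.1, 0.5, 0.9]
mae_weights = [grad_l1_norm(mae, logits_with_confidence(p), 0) for p in confidences]
cce_weights = [grad_l1_norm(cce, logits_with_confidence(p), 0) for p in confidences]

for p, wm, wc in zip(confidences, mae_weights, cce_weights):
    print(f"p_label={p:.1f}  MAE weight={wm:.3f}  CCE weight={wc:.3f}")
```

Analytically, the MAE weight is 4p(1 - p), which peaks at p = 0.5 (the uncertain examples), while the CCE weight is 2(1 - p), which keeps growing as the model becomes more wrong about the given label.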
EEG-Based Emotion Recognition Using Regularized Graph Neural Networks
Electroencephalography (EEG) measures the neuronal activities in different
brain regions via electrodes. Many existing studies on EEG-based emotion
recognition do not fully exploit the topology of EEG channels. In this paper,
we propose a regularized graph neural network (RGNN) for EEG-based emotion
recognition. RGNN considers the biological topology among different brain
regions to capture both local and global relations among different EEG
channels. Specifically, we model the inter-channel relations in EEG signals via
an adjacency matrix in a graph neural network where the connection and
sparseness of the adjacency matrix are inspired by neuroscience theories of
human brain organization. In addition, we propose two regularizers, namely
node-wise domain adversarial training (NodeDAT) and emotion-aware distribution
learning (EmotionDL), to better handle cross-subject EEG variations and noisy
labels, respectively. Extensive experiments on two public datasets, SEED and
SEED-IV, demonstrate that our model outperforms state-of-the-art models in
most experimental settings. Moreover, ablation
studies show that the proposed adjacency matrix and two regularizers contribute
consistent and significant gains to the performance of our RGNN model. Finally,
investigations of the neuronal activities reveal important brain regions and
inter-channel relations for EEG-based emotion recognition.
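To make the role of the adjacency matrix concrete, here is a minimal sketch of one generic graph-propagation step over EEG channels. It uses the common symmetric normalization D^{-1/2} A D^{-1/2}, which is an assumption: the abstract does not describe RGNN's actual layer, and the three-channel adjacency matrix and features below are hypothetical.

```python
import math

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} of a square adjacency matrix.

    Assumes every node has a self-loop, so no degree is zero.
    """
    degrees = [sum(row) for row in adj]
    n = len(adj)
    return [
        [adj[i][j] / math.sqrt(degrees[i] * degrees[j]) for j in range(n)]
        for i in range(n)
    ]

def propagate(adj_norm, features):
    """One propagation step: each channel's features become a weighted mix of its neighbours'."""
    n = len(adj_norm)
    dim = len(features[0])
    return [
        [sum(adj_norm[i][k] * features[k][f] for k in range(n)) for f in range(dim)]
        for i in range(n)
    ]

# Hypothetical 3-channel graph with self-loops: channel 1 is connected to
# channels 0 and 2, which are not directly connected to each other.
adjacency = [
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 1.0],
    [0.0, 1.0, 1.0],
]
# One feature per channel; only channel 0 carries signal.
features = [[1.0], [0.0], [0.0]]

adj_norm = normalize_adjacency(adjacency)
mixed = propagate(adj_norm, features)
print(mixed)
```

After one step, the signal on channel 0 reaches its neighbour channel 1 but not channel 2, which shows how a sparse adjacency matrix restricts information flow to local relations. Stacking steps, or adding a few long-range edges, would spread it further.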