Deep learning with noisy supervision
University of Technology Sydney. Faculty of Engineering and Information Technology. Central to many state-of-the-art deep learning classification systems is a sufficient supply of accurate annotations for training. This is the bottleneck of almost all machine learning algorithms deployed with deep neural networks. The dilemma behind this phenomenon is essentially the trade-off between inexpensive model design and inexpensive sample collection. As a practical way to alleviate this issue, learning with noisy supervision is a critical solution in the Big Data era, since noisily annotated data from social websites and the Amazon Mechanical Turk platform can be easily acquired. Therefore, in this dissertation, we explore solutions to the fundamental problems that arise when training deep neural networks with noisy supervision.
Our first work introduces inexpensive noise structure information to overcome the decoupling bias issue that exists in learning with noise transition. We study the noise effect via a variable whose structure is implicitly aligned by the provided structure knowledge. Specifically, a Bayesian lower bound is derived as the objective, and it naturally degenerates to previous transition models when no structure information is available. Furthermore, a generative adversarial implementation is given to stably inject the structure information when training deep neural networks. The experimental results show consistent improvement under different simulated noises and in a real-world scenario.
Our second work aims to replace the previous ill-posed stochastic approximation to the noise transition with a rigorous stochastic reallocation regarding the confusion matrix. This work identifies the cause of the instability in modeling the noise effect by a neural Softmax layer and introduces a Latent Class-Conditional Noise model to overcome it. In addition, a computationally efficient dynamic label regression method is derived for optimization, which stochastically trains the deep neural network and safeguards the noise transition estimation. The proposed method achieves state-of-the-art results on two toy datasets and two large real-world datasets.
The last work aims to alleviate the difficulty that the ideal assumption of an accurate noise transition is usually not fulfilled and the noise could still pollute the classifier during back-propagation. We introduce a quality embedding factor to apportion the reasoning in back-propagation, yielding a quality-augmented class-conditional noise model. For the network implementation, we design a contrastive-additive layer to infer the latent variable and derive a stochastic optimization via the reparameterization trick. The results on a noisy web dataset and a noisy crowdsourcing dataset confirm the superiority of our model in accuracy and interpretability.
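As a rough illustration of the class-conditional noise setting that the three works above build on, the sketch below corrupts clean labels according to a row-stochastic noise transition matrix. The function name, the matrix values, and the seed are assumptions made for this example and are not taken from the dissertation.

```python
import random

def corrupt_labels(labels, transition, seed=0):
    """Sample one noisy label per clean label.

    transition[i][j] is the assumed probability that true class i
    is observed as class j; every row is expected to sum to 1.
    """
    rng = random.Random(seed)
    classes = list(range(len(transition)))
    noisy = []
    for y in labels:
        # Draw the observed class from the row of the true class.
        noisy.append(rng.choices(classes, weights=transition[y], k=1)[0])
    return noisy

# Hypothetical 3-class matrix: class 0 is often confused with class 1,
# class 1 is never corrupted, class 2 is occasionally seen as class 0.
T = [
    [0.7, 0.3, 0.0],
    [0.0, 1.0, 0.0],
    [0.1, 0.0, 0.9],
]
clean = [0, 1, 2] * 100
noisy = corrupt_labels(clean, T)
```

Simulated noise of this kind is only a stand-in for the real-world noise discussed in the abstract; it is useful mainly because the true matrix is known and an estimate can be checked against it.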
Towards Robust Learning with Different Label Noise Distributions
Noisy labels are an unavoidable consequence of labeling processes and
detecting them is an important step towards preventing performance degradations
in Convolutional Neural Networks. Discarding noisy labels avoids a harmful
memorization, while the associated image content can still be exploited in a
semi-supervised learning (SSL) setup. Clean samples are usually identified
using the small loss trick, i.e. they exhibit a low loss. However, we show that
different noise distributions make the application of this trick less
straightforward and propose to continuously relabel all images to reveal a
discriminative loss against multiple distributions. SSL is then applied twice,
once to improve the clean-noisy detection and again for training the final
model. We design an experimental setup based on ImageNet32/64 for better
understanding the consequences of representation learning with differing label
noise distributions and find that non-uniform out-of-distribution noise better
resembles real-world noise and that in most cases intermediate features are not
affected by label noise corruption. Experiments in CIFAR-10/100, ImageNet32/64
and WebVision (real-world noise) demonstrate that the proposed label noise
Distribution Robust Pseudo-Labeling (DRPL) approach gives substantial
improvements over recent state-of-the-art. Code is available at
https://git.io/JJ0PV
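The small loss trick mentioned in this abstract can be sketched as follows. This is only the baseline selection rule that the abstract says becomes less straightforward under different noise distributions, not the DRPL method itself; the function names, the `keep_ratio` parameter, and the toy probabilities are assumptions for illustration.

```python
import math

def cross_entropy(probs, label):
    """Per-sample cross-entropy given predicted class probabilities."""
    return -math.log(max(probs[label], 1e-12))

def select_small_loss(all_probs, labels, keep_ratio):
    """Return the indices of the keep_ratio fraction with the lowest loss."""
    losses = [cross_entropy(p, y) for p, y in zip(all_probs, labels)]
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    n_keep = int(round(keep_ratio * len(losses)))
    return sorted(order[:n_keep])

# Toy predictions for four samples over two classes (made-up numbers).
probs = [[0.9, 0.1], [0.2, 0.8], [0.85, 0.15], [0.3, 0.7]]
labels = [0, 1, 1, 0]  # samples 2 and 3 disagree with the model
clean_idx = select_small_loss(probs, labels, keep_ratio=0.5)
```

In a semi-supervised setup such as the one described, the samples outside `clean_idx` would keep their images but lose their labels.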
Towards Data-centric Graph Machine Learning: Review and Outlook
Data-centric AI, with its primary focus on the collection, management, and
utilization of data to drive AI models and applications, has attracted
increasing attention in recent years. In this article, we conduct an in-depth
and comprehensive review, offering a forward-looking outlook on the current
efforts in data-centric AI pertaining to graph data, the fundamental data
structure for representing and capturing intricate dependencies among massive
and diverse real-life entities. We introduce a systematic framework,
Data-centric Graph Machine Learning (DC-GML), that encompasses all stages of
the graph data lifecycle, including graph data collection, exploration,
improvement, exploitation, and maintenance. A thorough taxonomy of each stage
is presented to answer three critical graph-centric questions: (1) how to
enhance graph data availability and quality; (2) how to learn from graph data
with limited availability and low quality; (3) how to build graph MLOps systems
from the graph data-centric view. Lastly, we pinpoint the future prospects of
the DC-GML domain, providing insights to navigate its advancements and
applications. Comment: 42 pages, 9 figures
Learning to Purify Noisy Labels via Meta Soft Label Corrector
Recent deep neural networks (DNNs) can easily overfit to biased training data
with noisy labels. A label correction strategy is commonly used to alleviate this
issue by designing a method to identify suspected noisy labels and then correct
them. Current approaches to correcting corrupted labels usually need certain
pre-defined label correction rules or manually preset hyper-parameters. These
fixed settings make them hard to apply in practice, since accurate label
correction is usually related to the concrete problem, the training data, and the
temporal information hidden in the dynamic iterations of the training process. To
address this issue, we propose a meta-learning model which could estimate soft
labels through a meta-gradient descent step under the guidance of noise-free meta
data. By viewing the label correction procedure as a meta-process and using a
meta-learner to automatically correct labels, we could adaptively obtain
rectified soft labels iteratively according to current training problems
without manually preset hyper-parameters. Besides, our method is model-agnostic
and we can combine it with any other existing model with ease. Comprehensive
experiments substantiate the superiority of our method in both synthetic and
real-world problems with noisy labels compared with current SOTA label
correction strategies. Comment: 12 pages, 6 figures
Image Classification with Deep Learning in the Presence of Noisy Labels: A Survey
Image classification systems recently made a giant leap with the advancement
of deep neural networks. However, these systems require an excessive amount of
labeled data to be adequately trained. Gathering a correctly annotated dataset
is not always feasible due to several factors, such as the expensiveness of the
labeling process or difficulty of correctly classifying data, even for the
experts. Because of these practical challenges, label noise is a common problem
in real-world datasets, and numerous methods to train deep neural networks with
label noise are proposed in the literature. Although deep neural networks are
known to be relatively robust to label noise, their tendency to overfit data
makes them vulnerable to memorizing even random noise. Therefore, it is crucial
to consider the existence of label noise and develop countermeasures that reduce
its adverse effects in order to train deep neural networks efficiently. Even though
an extensive survey of machine learning techniques under label noise exists,
the literature lacks a comprehensive survey of methodologies centered
explicitly around deep learning in the presence of noisy labels. This paper
aims to present these algorithms while categorizing them into one of the two
subgroups: noise model based and noise model free methods. Algorithms in the
first group aim to estimate the noise structure and use this information to
avoid the adverse effects of noisy labels. In contrast, methods in the second
group try to come up with inherently noise-robust algorithms by using
approaches like robust losses, regularizers or other learning paradigms.
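To make the "robust losses" category concrete, the sketch below compares cross-entropy with mean absolute error on a single sample whose given label contradicts a confident prediction. Mean absolute error is used here only as one commonly cited example of a bounded loss; the survey abstract does not name a specific loss, and the probabilities are made up.

```python
import math

def cross_entropy(probs, label):
    """Unbounded: grows without limit as probs[label] approaches 0."""
    return -math.log(max(probs[label], 1e-12))

def mean_absolute_error(probs, label):
    """L1 distance to the one-hot target; never exceeds 2."""
    return sum(abs(p - (1.0 if k == label else 0.0))
               for k, p in enumerate(probs))

probs = [0.98, 0.01, 0.01]  # model is confident in class 0
given_label = 2             # possibly mislabeled sample
ce = cross_entropy(probs, given_label)
mae = mean_absolute_error(probs, given_label)
```

Because the bounded loss caps the contribution of any one sample, a mislabeled example cannot dominate the gradient to the same degree as under cross-entropy.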
Learning with Noisy Labels by Efficient Transition Matrix Estimation to Combat Label Miscorrection
Recent studies on learning with noisy labels have shown remarkable
performance by exploiting a small clean dataset. In particular, model agnostic
meta-learning-based label correction methods further improve performance by
correcting noisy labels on the fly. However, there is no safeguard on the label
miscorrection, resulting in unavoidable performance degradation. Moreover,
every training step requires at least three back-propagations, significantly
slowing down the training speed. To mitigate these issues, we propose a robust
and efficient method that learns a label transition matrix on the fly.
Employing the transition matrix makes the classifier skeptical about all the
corrected samples, which alleviates the miscorrection issue. We also introduce
a two-head architecture to efficiently estimate the label transition matrix
every iteration within a single back-propagation, so that the estimated matrix
closely follows the shifting noise distribution induced by label correction.
Extensive experiments demonstrate that our approach shows the best performance
in training efficiency while having comparable or better accuracy than existing
methods. Comment: ECCV202
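One generic way a label transition matrix can make a classifier less trusting of the given labels is to push the clean-class prediction through the matrix before computing the loss. The sketch below shows that idea only; it is not the two-head architecture or the estimation procedure of the paper, and the function name and all numbers are assumptions.

```python
import math

def forward_corrected_loss(clean_probs, transition, noisy_label):
    """Cross-entropy on the predicted noisy-label distribution.

    transition[i][j] is the assumed probability that true class i
    is observed as class j.
    """
    n = len(transition)
    # Predicted distribution over observed (noisy) labels.
    noisy_probs = [
        sum(clean_probs[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]
    loss = -math.log(max(noisy_probs[noisy_label], 1e-12))
    return loss, noisy_probs

# Hypothetical matrix: class 0 is recorded as class 1 in 20% of cases.
T = [[0.8, 0.2], [0.0, 1.0]]
p = [0.9, 0.1]  # classifier believes the sample is class 0
loss, q = forward_corrected_loss(p, T, noisy_label=1)
```

With these numbers the corrected loss is lower than the plain cross-entropy on `p`, because the matrix explains part of the disagreement as label noise.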
On information captured by neural networks: connections with memorization and generalization
Despite the popularity and success of deep learning, there is limited
understanding of when, how, and why neural networks generalize to unseen
examples. Since learning can be seen as extracting information from data, we
formally study information captured by neural networks during training.
Specifically, we start with viewing learning in the presence of noisy labels from
an information-theoretic perspective and derive a learning algorithm that
limits label noise information in weights. We then define a notion of unique
information that an individual sample provides to the training of a deep
network, shedding some light on the behavior of neural networks on examples
that are atypical, ambiguous, or belong to underrepresented subpopulations. We
relate example informativeness to generalization by deriving nonvacuous
generalization gap bounds. Finally, by studying knowledge distillation, we
highlight the important role of data and label complexity in generalization.
Overall, our findings contribute to a deeper understanding of the mechanisms
underlying neural network generalization. Comment: PhD thesis