Deep Co-Training for Semi-Supervised Image Recognition
In this paper, we study the problem of semi-supervised image recognition,
which is to learn classifiers using both labeled and unlabeled images. We
present Deep Co-Training, a deep learning based method inspired by the
Co-Training framework. The original Co-Training learns two classifiers on two
views which are data from different sources that describe the same instances.
To extend this concept to deep learning, Deep Co-Training trains multiple deep
neural networks to be the different views and exploits adversarial examples to
encourage view difference, in order to prevent the networks from collapsing
into each other. As a result, the co-trained networks provide different and
complementary information about the data, which is necessary for the
Co-Training framework to achieve good results. We test our method on SVHN,
CIFAR-10/100 and ImageNet datasets, and our method outperforms the previous
state-of-the-art methods by a large margin.
3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training
While making a tremendous impact in various fields, deep neural networks
usually require large amounts of labeled data for training which are expensive
to collect in many applications, especially in the medical domain. Unlabeled
data, on the other hand, is much more abundant. Semi-supervised learning
techniques, such as co-training, could provide a powerful tool to leverage
unlabeled data. In this paper, we propose a novel framework, uncertainty-aware
multi-view co-training (UMCT), to address semi-supervised learning on 3D data,
such as volumetric data from medical imaging. In our work, co-training is
achieved by exploiting multi-viewpoint consistency of 3D data. We generate
different views by rotating or permuting the 3D data and utilize asymmetrical
3D kernels to encourage diversified features in different sub-networks. In
addition, we propose an uncertainty-weighted label fusion mechanism to estimate
the reliability of each view's prediction with Bayesian deep learning. As one
view requires the supervision from other views in co-training, our
self-adaptive approach computes a confidence score for the prediction of each
unlabeled sample in order to assign a reliable pseudo label. Thus, our approach
can take advantage of unlabeled data during training. We show the effectiveness
of our proposed semi-supervised method on several public datasets from medical
image segmentation tasks (NIH pancreas & LiTS liver tumor dataset). Meanwhile,
a fully-supervised method based on our approach achieved state-of-the-art
performances on both the LiTS liver tumor segmentation and the Medical
Segmentation Decathlon (MSD) challenge, demonstrating the robustness and value
of our framework, even when fully supervised training is feasible.
Comment: Accepted to WACV 2020
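The uncertainty-weighted label fusion described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the inverse-uncertainty weighting, the function name `fuse_views` and the `eps` parameter are assumptions made for illustration, and the per-view uncertainty is taken as a given scalar rather than estimated with Bayesian deep learning.

```python
def fuse_views(probs, uncertainties, target_view, eps=1e-8):
    """Fuse the predictions of every view except `target_view`.

    probs: one class-probability vector (list of floats) per view.
    uncertainties: one non-negative scalar per view (higher = less reliable).
    Returns (fused_probs, pseudo_label, confidence).
    """
    others = [i for i in range(len(probs)) if i != target_view]
    # Assumed weighting: inverse uncertainty, normalised to sum to 1.
    raw = [1.0 / (uncertainties[i] + eps) for i in others]
    total = sum(raw)
    weights = [w / total for w in raw]
    n_classes = len(probs[0])
    fused = [
        sum(w * probs[i][c] for w, i in zip(weights, others))
        for c in range(n_classes)
    ]
    # The pseudo label is the most likely fused class; its fused
    # probability serves as a simple confidence score.
    label = max(range(n_classes), key=lambda c: fused[c])
    return fused, label, fused[label]
```

For example, `fuse_views([[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]], [0.1, 0.1, 0.3], target_view=0)` supervises view 0 with a blend of views 1 and 2 in which the more certain view 1 carries more weight.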
Local Label Propagation for Large-Scale Semi-Supervised Learning
A significant issue in training deep neural networks to solve supervised
learning tasks is the need for large numbers of labelled datapoints. The goal
of semi-supervised learning is to leverage ubiquitous unlabelled data, together
with small quantities of labelled data, to achieve high task performance.
Though substantial recent progress has been made in developing semi-supervised
algorithms that are effective for comparatively small datasets, many of these
techniques do not scale readily to the large (unlabelled) datasets
characteristic of real-world applications. In this paper we introduce a novel
approach to scalable semi-supervised learning, called Local Label Propagation
(LLP). Extending ideas from recent work on unsupervised embedding learning, LLP
first embeds datapoints, labelled and otherwise, in a common latent space using
a deep neural network. It then propagates pseudolabels from known to unknown
datapoints in a manner that depends on the local geometry of the embedding,
taking into account both inter-point distance and local data density as a
weighting on propagation likelihood. The parameters of the deep embedding are
then trained to simultaneously maximize pseudolabel categorization performance
as well as a metric of the clustering of datapoints within each pseudolabel
group, iteratively alternating stages of network training and label
propagation. We illustrate the utility of the LLP method on the ImageNet
dataset, achieving results that outperform previous state-of-the-art scalable
semi-supervised learning algorithms by large margins, consistently across a
wide variety of training regimes. We also show that the feature representation
learned with LLP transfers well to scene recognition in the Places 205 dataset.
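The propagation step described in this abstract, where pseudolabels depend on inter-point distance and local data density, can be sketched as a weighted nearest-neighbour vote. This is a toy illustration under stated assumptions, not the LLP algorithm itself: the embeddings are given rather than learned, and the function name `propagate_label`, the exponential distance kernel, the density estimate and the choice to down-weight neighbours in dense regions are all hypothetical.

```python
import math


def propagate_label(query, labelled, k=3):
    """Assign a pseudolabel to `query` from its k nearest labelled points.

    labelled: list of (embedding, label) pairs; embeddings are tuples of floats.
    Returns (pseudolabel, confidence in [0, 1]).
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def density(idx):
        # Assumed density estimate: inverse mean distance from a labelled
        # point to its k nearest labelled neighbours.
        d = sorted(
            dist(labelled[idx][0], emb)
            for j, (emb, _) in enumerate(labelled)
            if j != idx
        )[:k]
        if not d:
            return 1.0
        return 1.0 / (sum(d) / len(d) + 1e-8)

    nearest = sorted(
        range(len(labelled)), key=lambda i: dist(query, labelled[i][0])
    )[:k]
    votes = {}
    for i in nearest:
        emb, lab = labelled[i]
        # Closer neighbours vote more; neighbours in dense regions vote less.
        weight = math.exp(-dist(query, emb)) / density(i)
        votes[lab] = votes.get(lab, 0.0) + weight
    best = max(votes, key=votes.get)
    return best, votes[best] / sum(votes.values())
```

In a full training loop, the returned confidence could be used to weight the pseudolabel's contribution to the loss before the embedding network is updated and the propagation is repeated.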
Transfer Adaptation Learning: A Decade Survey
The world we see is ever-changing and it always changes with people, things,
and the environment. A domain refers to the state of the world at a
certain moment. A research problem is characterized as transfer adaptation
learning (TAL) when it needs knowledge correspondence between different
moments/domains. Conventional machine learning aims to find a model with the
minimum expected risk on test data by minimizing the regularized empirical risk
on the training data, which, however, supposes that the training and test data
share a similar joint probability distribution. TAL aims to build models that can
perform tasks in a target domain by learning knowledge from a semantically related
but distributionally different source domain. It is an active research field of
increasing influence and importance, with a rapidly growing number of publications.
This paper surveys the advances of TAL methodologies in the past decade,
and the technical challenges and essential problems of TAL have been observed
and discussed with deep insights and new perspectives. Broader solutions of
transfer adaptation learning being created by researchers are identified, i.e.,
instance re-weighting adaptation, feature adaptation, classifier adaptation,
deep network adaptation and adversarial adaptation, which are beyond the early
semi-supervised and unsupervised split. The survey helps researchers rapidly
but comprehensively understand and identify the research foundation, research
status, theoretical limitations, future challenges and under-studied issues
(universality, interpretability, and credibility) to be addressed in the field
toward universal representation and safe applications in open-world scenarios.
Comment: 26 pages, 4 figures
Scalable Deep Learning Logo Detection
Existing logo detection methods usually consider a small number of logo
classes and limited images per class, and they assume that tedious object
bounding box annotations are available, so they do not scale to real-world
dynamic applications. In this work, we tackle these challenges by exploring the
webly data learning principle without the need for exhaustive manual labelling.
Specifically, we propose a novel incremental learning approach, called Scalable
Logo Self-co-Learning (SL^2), capable of automatically self-discovering
informative training images from noisy web data for progressively improving
model capability in a cross-model co-learning manner. Moreover, we introduce a
very large (2,190,757 images of 194 logo classes) logo dataset "WebLogo-2M" by
an automatic web data collection and processing method. Extensive comparative
evaluations demonstrate the superiority of the proposed SL^2 method over the
state-of-the-art strongly and weakly supervised detection models and
contemporary webly data learning approaches.
Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis
Machine learning (ML) algorithms have made a tremendous impact in the field
of medical imaging. While medical imaging datasets have been growing in size, a
challenge for supervised ML algorithms that is frequently mentioned is the lack
of annotated data. As a result, various methods that can learn with less or other
types of supervision have been proposed. We review semi-supervised, multiple
instance, and transfer learning in medical imaging, in both diagnosis/detection
and segmentation tasks. We also discuss connections between these learning
scenarios, and opportunities for future research.
Comment: Submitted to Medical Image Analysis
Training Deep Neural Networks on Noisy Labels with Bootstrapping
Current state-of-the-art deep learning systems for visual object recognition
and detection use purely supervised training with regularization such as
dropout to avoid overfitting. The performance depends critically on the amount
of labeled examples, and in current practice the labels are assumed to be
unambiguous and accurate. However, this assumption often does not hold; e.g. in
recognition, class labels may be missing; in detection, objects in the image
may not be localized; and in general, the labeling may be subjective. In this
work we propose a generic way to handle noisy and incomplete labeling by
augmenting the prediction objective with a notion of consistency. We consider a
prediction consistent if the same prediction is made given similar percepts,
where the notion of similarity is between deep network features computed from
the input data. In experiments we demonstrate that our approach yields
substantial robustness to label noise on several datasets. On MNIST handwritten
digits, we show that our model is robust to label corruption. On the Toronto
Face Database, we show that our model handles well the case of subjective
labels in emotion recognition, achieving state-of-the-art results, and can
also benefit from unlabeled face images with no modification to our method. On
the ILSVRC2014 detection challenge data, we show that our approach extends to
very deep networks, high resolution images and structured outputs, and results
in improved scalable detection.
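One simple way to augment a prediction objective so that it tolerates noisy labels is to blend the given label with the model's own prediction before computing cross-entropy. The sketch below shows that idea only. It does not implement the feature-similarity notion of consistency mentioned in this abstract, and the function name `soft_bootstrap_loss` and the default `beta` are assumptions for illustration.

```python
import math


def soft_bootstrap_loss(pred, noisy_target, beta=0.95):
    """Cross-entropy against a blend of the given label and the prediction.

    pred: predicted class probabilities (list of floats summing to 1).
    noisy_target: one-hot (or soft) label vector that may be wrong.
    beta: trust placed in the given label; beta=1.0 is plain cross-entropy.
    """
    blended = [
        beta * t + (1.0 - beta) * q for t, q in zip(noisy_target, pred)
    ]
    # Clamp probabilities to avoid log(0).
    return -sum(b * math.log(max(q, 1e-12)) for b, q in zip(blended, pred))
```

With `beta` below 1, a confident prediction that disagrees with the given label is penalised less than under plain cross-entropy, which limits the influence of corrupted labels.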
Self-training with Noisy Student improves ImageNet classification
We present Noisy Student Training, a semi-supervised learning approach that
works well even when labeled data is abundant. Noisy Student Training achieves
88.4% top-1 accuracy on ImageNet, which is 2.0% better than the
state-of-the-art model that requires 3.5B weakly labeled Instagram images. On
robustness test sets, it improves ImageNet-A top-1 accuracy from 61.0% to
83.7%, reduces ImageNet-C mean corruption error from 45.7 to 28.3, and reduces
ImageNet-P mean flip rate from 27.8 to 12.2.
Noisy Student Training extends the idea of self-training and distillation
with the use of equal-or-larger student models and noise added to the student
during learning. On ImageNet, we first train an EfficientNet model on labeled
images and use it as a teacher to generate pseudo labels for 300M unlabeled
images. We then train a larger EfficientNet as a student model on the
combination of labeled and pseudo labeled images. We iterate this process by
putting back the student as the teacher. During the learning of the student, we
inject noise such as dropout, stochastic depth, and data augmentation via
RandAugment to the student so that the student generalizes better than the
teacher. Models are available at
https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet.
Code is available at https://github.com/google-research/noisystudent.
Comment: CVPR 2020
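The teacher-student iteration described in this abstract can be sketched on a toy one-dimensional problem. This is an illustration of the loop structure only, not the released code: a nearest-centroid classifier stands in for EfficientNet, Gaussian input jitter stands in for dropout, stochastic depth and RandAugment, the student is the same size as the teacher, and all function names and parameters are hypothetical.

```python
import random


def fit_centroids(xs, ys):
    """Toy 'model': the mean input value of each class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda y: abs(x - model[y]))


def noisy_student(labelled_x, labelled_y, unlabelled_x,
                  rounds=2, noise=0.1, seed=0):
    rng = random.Random(seed)
    # Step 1: train the teacher on labelled data only.
    teacher = fit_centroids(labelled_x, labelled_y)
    for _ in range(rounds):
        # Step 2: the teacher pseudo-labels clean unlabelled inputs.
        pseudo_y = [predict(teacher, x) for x in unlabelled_x]
        # Step 3: the student trains on labelled plus pseudo-labelled
        # data, with noise applied to the student's inputs.
        noisy_x = [x + rng.gauss(0.0, noise) for x in unlabelled_x]
        student = fit_centroids(labelled_x + noisy_x, labelled_y + pseudo_y)
        # Step 4: the student becomes the next teacher.
        teacher = student
    return teacher
```

Calling `noisy_student([0.0, 1.0], [0, 1], [-0.2, 0.1, 0.3, 0.8, 0.9, 1.2])` returns the final centroids after two rounds of pseudo-labelling and retraining.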
When Semi-Supervised Learning Meets Transfer Learning: Training Strategies, Models and Datasets
Semi-Supervised Learning (SSL) has been proved to be an effective way to
leverage both labeled and unlabeled data at the same time. Recent
semi-supervised approaches focus on deep neural networks and have achieved
promising results on several benchmarks: CIFAR10, CIFAR100 and SVHN. However,
most of their experiments are based on models trained from scratch instead of
pre-trained models. On the other hand, transfer learning has demonstrated its
value when the target domain has limited labeled data. Here comes the intuitive
question: is it possible to incorporate SSL when fine-tuning a pre-trained
model? We comprehensively study how SSL methods starting from pretrained models
perform under varying conditions, including training strategies, architecture
choice and datasets. From this study, we obtain several interesting and useful
observations.
While practitioners have had an intuitive understanding of these
observations, we do a comprehensive empirical analysis and demonstrate that:
(1) the gains from SSL techniques over a fully-supervised baseline are smaller
when trained from a pre-trained model than when trained from random
initialization, (2) when the domain of the source data used to train the
pre-trained model differs significantly from the domain of the target task, the
gains from SSL are significantly higher and (3) some SSL methods are able to
advance fully-supervised baselines (like Pseudo-Label).
We hope our studies can deepen the understanding of SSL research and
facilitate the process of developing more effective SSL methods to utilize
pre-trained models. Code is now available at GitHub.
Comment: Technical report
Data-Efficient Image Recognition with Contrastive Predictive Coding
Human observers can learn to recognize new categories of images from a
handful of examples, yet doing so with artificial ones remains an open
challenge. We hypothesize that data-efficient recognition is enabled by
representations which make the variability in natural signals more predictable.
We therefore revisit and improve Contrastive Predictive Coding, an unsupervised
objective for learning such representations. This new implementation produces
features which support state-of-the-art linear classification accuracy on the
ImageNet dataset. When used as input for non-linear classification with deep
neural networks, this representation allows us to use 2-5x fewer labels than
classifiers trained directly on image pixels. Finally, this unsupervised
representation substantially improves transfer learning to object detection on
the PASCAL VOC dataset, surpassing fully supervised pre-trained ImageNet
classifiers.