Low-shot learning with large-scale diffusion
This paper considers the problem of inferring image labels from images when
only a few annotated examples are available at training time. This setup is
often referred to as low-shot learning, where a standard approach is to
re-train the last few layers of a convolutional neural network learned on
separate classes for which training examples are abundant. We consider a
semi-supervised setting that leverages a large collection of unannotated images
to support label propagation, made possible by recent advances in large-scale
similarity graph construction.
We show that despite its conceptual simplicity, scaling label propagation up
to hundreds of millions of images leads to state-of-the-art accuracy in the
low-shot learning regime.
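The diffusion step underlying this approach can be sketched as iterative label propagation on a similarity graph. The following is a minimal NumPy illustration, not the paper's large-scale implementation; the function name, the damping factor `alpha`, and the iteration count are illustrative choices:

```python
import numpy as np

def propagate_labels(W, Y0, labeled_mask, alpha=0.75, n_iter=50):
    """Iterative label propagation on a row-normalized similarity graph.

    W: (n, n) symmetric non-negative similarity matrix (e.g., a k-NN graph).
    Y0: (n, c) one-hot scores for the few labeled nodes, zeros elsewhere.
    labeled_mask: (n,) boolean array marking the annotated examples.
    """
    # Row-normalize so each node averages its neighbors' label scores.
    P = W / np.maximum(W.sum(axis=1, keepdims=True), 1e-12)
    Y = Y0.copy()
    for _ in range(n_iter):
        Y = alpha * (P @ Y) + (1 - alpha) * Y0
        Y[labeled_mask] = Y0[labeled_mask]  # clamp the known labels
    return Y.argmax(axis=1)
```

On a toy graph with two connected clusters and one labeled node per cluster, the unlabeled nodes inherit the label of their cluster.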
A Semi-Supervised Two-Stage Approach to Learning from Noisy Labels
The recent success of deep neural networks is powered in part by large-scale
well-labeled training data. However, it is a daunting task to laboriously
annotate an ImageNet-like dataset. In contrast, it is fairly convenient, fast,
and cheap to collect training images from the Web along with their noisy
labels. This signifies the need for alternative approaches to training deep
neural networks with such noisy labels. Existing methods tackling this problem
either try to identify and correct the wrong labels or reweight the data terms
in the loss function according to the inferred noise rates. Both strategies
inevitably incur errors for some of the data points. In this paper, we contend
that it is better to ignore the labels of some data points than to keep them
when those labels are incorrect, especially when the noise rate is high. After
all, wrong labels can mislead a neural network into a bad local optimum. We
propose a two-stage framework for learning from noisy labels.
In the first stage, we identify a small portion of images from the noisy
training set of which the labels are correct with a high probability. The noisy
labels of the other images are ignored. In the second stage, we train a deep
neural network in a semi-supervised manner. This framework effectively takes
advantage of the whole training set and yet only a portion of its labels that
are most likely correct. Experiments on three datasets verify the effectiveness
of our approach, especially when the noise rate is high.
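The first stage, selecting a trusted subset and dropping the remaining labels, can be sketched as a simple confidence filter. This is an illustrative NumPy sketch, not the paper's selection criterion; the function name and the threshold `tau` are hypothetical:

```python
import numpy as np

def split_noisy_set(probs, noisy_labels, tau=0.9):
    """Stage 1: keep only labels a current model agrees with confidently.

    probs: (n, c) class probabilities from a model trained on the noisy set.
    noisy_labels: (n,) integer web labels.
    Returns indices of the trusted subset (labels kept) and the untrusted
    subset (images kept, labels ignored for semi-supervised stage 2).
    """
    # Confidence the model assigns to each image's given (possibly wrong) label.
    conf = probs[np.arange(len(noisy_labels)), noisy_labels]
    trusted = np.where(conf >= tau)[0]
    untrusted = np.where(conf < tau)[0]
    return trusted, untrusted
```

Stage 2 would then train on the trusted labels plus the untrusted images as unlabeled data, which matches the abstract's point that the whole training set is still used.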
Curriculum semi-supervised segmentation
This study investigates a curriculum-style strategy for semi-supervised CNN
segmentation, which devises a regression network to learn image-level
information such as the size of a target region. These regressions are used to
effectively regularize the segmentation network, constraining softmax
predictions of the unlabeled images to match the inferred label distributions.
Our framework is based on inequality constraints that tolerate uncertainties
with inferred knowledge, e.g., regressed region size, and can be employed for a
large variety of region attributes. We evaluated the proposed strategy for
left ventricle segmentation in magnetic resonance images (MRI) and compared it
to standard proposal-based semi-supervision strategies. Our strategy leverages
unlabeled data more efficiently and achieves very competitive results,
approaching the performance of full supervision.
Comment: Accepted as a paper at MICCAI 201
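An inequality constraint on a regressed region size can be enforced with a penalty that is zero inside the tolerated interval and grows outside it. The sketch below is a simplified illustration of that idea, not the paper's loss; the quadratic form and the function name are assumptions:

```python
import numpy as np

def size_penalty(fg_probs, lo, hi):
    """Inequality-constraint penalty on a predicted region size.

    fg_probs: (H, W) softmax foreground probabilities for one unlabeled image.
    [lo, hi]: tolerated interval around the regressed region size.
    Zero when the soft size lies inside the interval, quadratic outside,
    so the segmentation network is only penalized for violating the bound.
    """
    size = fg_probs.sum()  # soft (differentiable) region size
    if size < lo:
        return float((size - lo) ** 2)
    if size > hi:
        return float((size - hi) ** 2)
    return 0.0
```

In training, this term would be added to the supervised loss for the unlabeled images, pushing softmax predictions toward the inferred size distribution.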
Exhaustive and Efficient Constraint Propagation: A Semi-Supervised Learning Perspective and Its Applications
This paper presents a novel pairwise constraint propagation approach by
decomposing the challenging constraint propagation problem into a set of
independent semi-supervised learning subproblems which can be solved in
quadratic time using label propagation based on k-nearest neighbor graphs.
Considering that this time cost is proportional to the number of all possible
pairwise constraints, our approach actually provides an efficient solution for
exhaustively propagating pairwise constraints throughout the entire dataset.
The resulting exhaustive set of propagated pairwise constraints is further
used to adjust the similarity matrix for constrained spectral clustering.
Beyond traditional constraint propagation on single-source data, our approach
is also extended to more challenging constraint propagation on multi-source
data where each pairwise constraint is defined over a pair of data points from
different sources. This multi-source constraint propagation has an important
application to cross-modal multimedia retrieval. Extensive experiments show
the superior performance of our approach.
Comment: The short version of this paper appears as an oral paper at ECCV 201
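The decomposition treats each column of the constraint matrix as an independent label-propagation subproblem, propagating first along one side of each pair and then along the other. The sketch below illustrates that two-pass idea on a small graph; it is a simplified illustration under assumed notation (Z holds +1 for must-links, -1 for cannot-links), not the paper's exact algorithm:

```python
import numpy as np

def propagate_constraints(W, Z, alpha=0.5, n_iter=100):
    """Two-pass pairwise-constraint propagation (illustrative sketch).

    W: (n, n) similarity matrix (e.g., from a k-NN graph).
    Z: (n, n) constraint matrix, +1 must-link, -1 cannot-link, 0 unknown.
    Each column of Z is an independent semi-supervised subproblem; a vertical
    pass followed by a horizontal pass spreads every constraint over all pairs.
    """
    d = np.maximum(W.sum(axis=1), 1e-12)
    S = W / np.sqrt(np.outer(d, d))      # symmetrically normalized graph
    F = Z.astype(float)
    for _ in range(n_iter):              # vertical pass: down each column
        F = alpha * (S @ F) + (1 - alpha) * Z
    Fv = F
    F = Fv.copy()
    for _ in range(n_iter):              # horizontal pass: along each row
        F = alpha * (F @ S.T) + (1 - alpha) * Fv
    return F
```

The sign of each propagated entry can then be used to raise or lower the corresponding similarity before spectral clustering.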
High-dimensional semi-supervised learning: in search for optimal inference of the mean
We provide a high-dimensional semi-supervised inference framework focused on
the mean and variance of the response. Our data comprise an extensive set of
observations of the covariate vectors and a much smaller set of labeled
observations for which we observe both the response and the covariates. We
allow the dimension of the covariates to be much larger than the sample size
and impose weak conditions on the statistical form of the data. We
provide new estimators of the mean and variance of the response that extend
some of the recent results presented in low-dimensional models. In
particular, at times we do not require consistent estimation of the
functional form of the data. Together with estimation of the population mean and variance, we
provide their asymptotic distribution and confidence intervals where we
showcase gains in efficiency compared to the sample mean and variance. With
minor modifications, our procedure then makes important contributions to
inference about average treatment effects. We also investigate the robustness
of estimation and coverage, and showcase the widespread applicability and
generality of the proposed method.
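A standard way such estimators gain efficiency over the sample mean is to impute responses on the large unlabeled covariate set with a working model and debias with the labeled residuals. The sketch below shows that general idea with a low-dimensional linear working model, not the paper's high-dimensional estimator; the function name and setup are illustrative:

```python
import numpy as np

def ss_mean(y_lab, x_lab, x_unlab):
    """Semi-supervised estimate of E[Y] via imputation plus debiasing.

    Fit a working linear model on the small labeled sample, impute responses
    over the large unlabeled covariate set, and correct with the mean labeled
    residual: mu_hat = mean(f(X_unlab)) + mean(y_lab - f(X_lab)).
    The residual correction keeps the estimator consistent for E[Y] even
    when the working model is misspecified.
    """
    X = np.column_stack([np.ones(len(x_lab)), x_lab])
    beta, *_ = np.linalg.lstsq(X, y_lab, rcond=None)
    Xu = np.column_stack([np.ones(len(x_unlab)), x_unlab])
    resid = y_lab - X @ beta            # debiasing term from the labeled set
    return float((Xu @ beta).mean() + resid.mean())
```

When the unlabeled set is much larger than the labeled one, the imputed mean averages over far more covariate draws, which is the source of the efficiency gain over the plain sample mean.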