Object-Level Representation Learning for Few-Shot Image Classification
Few-shot learning, which trains image classifiers from only a few labeled
examples per category, is a challenging task. In this paper, we propose to
exploit an additional large dataset with different categories to improve the
accuracy of few-shot learning on our target dataset. Our approach is based on the
observation that images can be decomposed into objects, which may appear in
images from both the additional dataset and our target dataset. We use the
object-level relation learned from the additional dataset to infer the
similarity of images with unseen categories in our target dataset. Nearest
neighbor search, a non-parametric method that requires no fine-tuning, is then
applied for classification. We evaluate our
algorithm on two popular datasets, namely Omniglot and MiniImagenet. We obtain
8.5% and 2.7% absolute improvements for 5-way 1-shot and 5-way 5-shot
experiments on MiniImagenet, respectively. Source code will be published upon
acceptance.
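As a hedged illustration of the non-parametric classification step described above, the sketch below implements plain nearest-neighbor search over precomputed embeddings with cosine similarity. The feature arrays and the toy episode are hypothetical stand-ins; the paper's object-level relation features are not reproduced here.

```python
import numpy as np

def nearest_neighbor_classify(query_feats, support_feats, support_labels):
    """Assign each query the label of its most similar support example.
    Non-parametric: no fine-tuning, just a similarity lookup."""
    # L2-normalize so the dot product equals cosine similarity
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    s = support_feats / np.linalg.norm(support_feats, axis=1, keepdims=True)
    sims = q @ s.T  # (n_query, n_support) similarity matrix
    return support_labels[np.argmax(sims, axis=1)]

# Toy 5-way 1-shot episode: 5 support embeddings, 3 noisy queries, 64-d
rng = np.random.default_rng(0)
support = rng.normal(size=(5, 64))
labels = np.arange(5)
queries = support[[2, 0, 4]] + 0.1 * rng.normal(size=(3, 64))
print(nearest_neighbor_classify(queries, support, labels))  # -> [2 0 4]
```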
Leveraging Pretrained Image Classifiers for Language-Based Segmentation
Current semantic segmentation models cannot easily generalize to new object
classes unseen at training time: they require additional annotated images and
retraining. We propose a novel segmentation model that injects visual priors
into semantic segmentation architectures, allowing them to segment out new
target labels without retraining. As visual priors, we use the activations of
pretrained image classifiers, which provide noisy indications of the spatial
location of both the target object and distractor objects in the scene. We
leverage language semantics to obtain these activations for a target label
unseen by the classifier. Further experiments show that the visual priors
obtained via language semantics for both relevant and distracting objects are
key to our performance.
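To make the notion of classifier activations as visual priors concrete, here is a minimal CAM-style sketch, assuming a stock torchvision ResNet-18 and a target label the classifier already knows; the paper's language-based transfer to labels unseen by the classifier is not reproduced here.

```python
import torch
import torchvision.models as models

# Spatial activations of a pretrained classifier give a coarse, noisy
# prior for where a class appears in the image.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

feats = {}
model.layer4.register_forward_hook(
    lambda mod, inp, out: feats.__setitem__("conv", out))  # last conv maps

image = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed image
with torch.no_grad():
    logits = model(image)

class_idx = logits.argmax(dim=1).item()             # pick a known class
w = model.fc.weight[class_idx]                      # (512,) class weights
cam = (w[:, None, None] * feats["conv"][0]).sum(0)  # (7, 7) prior map
cam = torch.relu(cam) / cam.max().clamp(min=1e-8)   # normalize to [0, 1]
print(cam.shape)  # coarse spatial prior; upsample to image size for use
```

A map like this can be computed for the target label and for distractor labels alike, which matches the abstract's point that priors for both relevant and distracting objects matter.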
Revisiting Metric Learning for Few-Shot Image Classification
The goal of few-shot learning is to recognize new visual concepts with just a
few labeled samples in each class. Recent effective metric-based
few-shot approaches employ neural networks to learn a feature similarity
comparison between query and support examples. However, the importance of
feature embedding, i.e., exploring the relationship among training samples, is
neglected. In this work, we present a simple yet powerful baseline for few-shot
classification by emphasizing the importance of feature embedding.
Specifically, we revisit the classical triplet network from deep metric
learning, and extend it into a deep K-tuplet network for few-shot learning,
utilizing the relationships among input samples to learn a general
representation via episodic training. Once trained, our network is able
to extract discriminative features for unseen novel categories and can be
seamlessly incorporated with a non-linear distance metric function to
facilitate few-shot classification. Our results on the miniImageNet
benchmark outperform other metric-based few-shot classification methods. More
importantly, when evaluated on completely different datasets (Caltech-101,
CUB-200, Stanford Dogs and Cars) using the model trained on miniImageNet, our
method significantly outperforms prior methods, demonstrating its superior
capability to generalize to unseen classes.
Comment: Accepted at Neurocomputing.
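The extension from triplets to K-tuplets can be sketched as a hinge loss over one anchor, one positive, and K negatives per example. This is a hedged generalization of the classical triplet loss, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def k_tuplet_loss(anchor, positive, negatives, margin=1.0):
    """Hinge-style K-tuplet loss.
    anchor, positive: (B, D) embeddings; negatives: (B, K, D).
    Pulls the positive closer than every negative by at least `margin`."""
    d_pos = F.pairwise_distance(anchor, positive)           # (B,)
    d_neg = (anchor.unsqueeze(1) - negatives).norm(dim=2)   # (B, K)
    return F.relu(d_pos.unsqueeze(1) - d_neg + margin).mean()

# Toy episode: batch of 4 anchors with 64-d embeddings and K=5 negatives
B, K, D = 4, 5, 64
loss = k_tuplet_loss(torch.randn(B, D), torch.randn(B, D),
                     torch.randn(B, K, D))
print(loss.item())
```

With K = 1 this reduces to the standard triplet loss; a larger K exposes each anchor to more negatives per episode, which is the intuition behind the deep K-tuplet network.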
Small Sample Learning in Big Data Era
As a promising area in artificial intelligence, a new learning paradigm,
called Small Sample Learning (SSL), has been attracting prominent research
attention in recent years. In this paper, we present a survey that
comprehensively introduces the current techniques proposed on this topic.
Specifically, current SSL techniques can be mainly divided into two categories.
The first category of SSL approaches can be called "concept learning", which
emphasizes learning new concepts from only a few related observations. The
purpose is mainly to simulate human learning behaviors like recognition,
generation, imagination, synthesis and analysis. The second category is called
"experience learning", which usually co-exists with the large sample learning
manner of conventional machine learning. This category mainly focuses on
learning with insufficient samples, and is also called small data learning in
some of the literature. More extensive surveys of both categories of SSL
techniques are introduced, and some neuroscience evidence is provided to
clarify the rationality of the entire SSL regime and its relationship with the
human learning process. Discussions of the main challenges and possible
future research directions along this line are also presented.
Comment: 76 pages, 15 figures, survey of small sample learning.