Few-Shot Learning via Embedding Adaptation with Set-to-Set Functions
Learning with limited data is a key challenge for visual recognition. Many
few-shot learning methods address this challenge by learning an instance
embedding function from seen classes and applying it to instances from
unseen classes with limited labels. This style of transfer learning is
task-agnostic: the embedding function is not learned to be optimally
discriminative with respect to the unseen classes, even though discerning
among them is the target task. In this paper, we propose a novel approach to adapt the instance
embeddings to the target classification task with a set-to-set function,
yielding embeddings that are task-specific and are discriminative. We
empirically investigated various instantiations of such set-to-set functions
and observed the Transformer is most effective -- as it naturally satisfies key
properties of our desired model. We denote this model as FEAT (few-shot
embedding adaptation w/ Transformer) and validate it on both the standard
few-shot classification benchmark and four extended few-shot learning settings
with essential use cases, i.e., cross-domain, transductive, generalized
few-shot learning, and low-shot learning. It achieved consistent improvements
over baseline models as well as previous methods and established new
state-of-the-art results on two benchmarks.
Comment: Accepted by CVPR 2020; the code is available at
https://github.com/Sha-Lab/FEA
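The set-to-set adaptation described above can be sketched as self-attention over the set of class prototypes. The minimal NumPy sketch below is illustrative only (names, dimensions, and the single-head attention are assumptions, not the authors' implementation); it highlights permutation equivariance, one of the key properties a set-to-set function must satisfy:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """One self-attention head over a set of embeddings X of shape (n, d).
    Permutation-equivariant: reordering rows of X reorders the output rows."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)          # row-wise softmax
    return A @ V

def adapt_prototypes(protos, Wq, Wk, Wv):
    # FEAT-style adaptation (sketch): residual self-attention over the set
    # of class prototypes yields task-specific, more discriminative embeddings.
    return protos + self_attention(protos, Wq, Wk, Wv)

rng = np.random.default_rng(0)
d = 8
protos = rng.normal(size=(5, d))               # 5-way task, one prototype per class
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
adapted = adapt_prototypes(protos, Wq, Wk, Wv)

# Reordering the classes reorders the adapted prototypes identically.
perm = rng.permutation(5)
assert np.allclose(adapt_prototypes(protos[perm], Wq, Wk, Wv), adapted[perm])
```

Queries would then be classified by distance to the adapted prototypes rather than the raw ones.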
Adapted Deep Embeddings: A Synthesis of Methods for k-Shot Inductive Transfer Learning
The focus in machine learning has branched beyond training classifiers on a
single task to investigating how previously acquired knowledge in a source
domain can be leveraged to facilitate learning in a related target domain,
known as inductive transfer learning. Three active lines of research have
independently explored transfer learning using neural networks. In weight
transfer, a model trained on the source domain is used as an initialization
point for a network to be trained on the target domain. In deep metric
learning, the source domain is used to construct an embedding that captures
class structure in both the source and target domains. In few-shot learning,
the focus is on generalizing well in the target domain based on a limited
number of labeled examples. We compare state-of-the-art methods from these
three paradigms and also explore hybrid adapted-embedding methods that use
limited target-domain data to fine-tune embeddings constructed from
source-domain data. We conduct a systematic comparison of methods in a variety
of domains, varying the number of labeled instances available in the target
domain (k), as well as the number of target-domain classes. We reach three
principal conclusions: (1) deep embeddings are far superior to weight
transfer as a starting point for inter-domain transfer or model re-use;
(2) our hybrid methods robustly outperform every few-shot learning and every
deep metric learning method previously proposed, with a mean error reduction
of 34% over the state of the art; (3) among loss functions for discovering
embeddings, the histogram loss (Ustinova & Lempitsky, 2016) is most robust.
We hope our results will motivate a unification of research in weight
transfer, deep metric learning, and few-shot learning.
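The deep-metric-learning route compared above amounts to re-using a source-trained embedding with a simple classifier on top. A hypothetical minimal baseline (the toy data and names below are illustrative, not the paper's setup) is a nearest-class-mean classifier over frozen embeddings:

```python
import numpy as np

def nearest_mean_classifier(emb_support, y_support, emb_query):
    """Classify queries by Euclidean distance to per-class embedding means,
    i.e. re-use a source-trained embedding for novel k-shot classes."""
    classes = np.unique(y_support)
    protos = np.stack([emb_support[y_support == c].mean(axis=0) for c in classes])
    dist = ((emb_query[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return classes[dist.argmin(axis=1)]

# Toy check: two well-separated classes in embedding space.
emb_sup = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
y_sup = np.array([0, 0, 1, 1])
emb_qry = np.array([[0.2, 0.1], [4.9, 5.2]])
pred = nearest_mean_classifier(emb_sup, y_sup, emb_qry)  # -> [0, 1]
```

The hybrid adapted-embedding methods additionally fine-tune the embedding on the few target-domain examples before computing the class means.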
Semi-Supervised and Active Few-Shot Learning with Prototypical Networks
We consider the problem of semi-supervised few-shot classification where a
classifier needs to adapt to new tasks using a few labeled examples and
(potentially many) unlabeled examples. We propose a clustering approach to the
problem. The features extracted with Prototypical Networks are clustered using
k-means, with the few labeled examples guiding the clustering process. We note
that in many real-world applications the adaptation performance can be
significantly improved by requesting the few labels through user feedback. We
demonstrate the good performance of this active adaptation strategy on image
data.
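The clustering step can be sketched as k-means over pre-extracted features in which each cluster is seeded by, and re-anchored to, the few labeled examples of one class. Everything below (feature dimensions, toy data) is illustrative, not the paper's configuration:

```python
import numpy as np

def seeded_kmeans(X_unlab, X_lab, y_lab, n_iter=10):
    """k-means over pre-extracted (e.g. Prototypical-Network) features, with
    one cluster per class, initialized from and re-anchored to the labeled
    examples at every update (sketch of the semi-supervised adaptation)."""
    classes = np.unique(y_lab)
    centers = np.stack([X_lab[y_lab == c].mean(axis=0) for c in classes])
    assign = np.zeros(len(X_unlab), dtype=int)
    for _ in range(n_iter):
        dist = ((X_unlab[:, None] - centers[None]) ** 2).sum(axis=-1)
        assign = dist.argmin(axis=1)
        for j, c in enumerate(classes):
            pts = np.vstack([X_lab[y_lab == c], X_unlab[assign == j]])
            centers[j] = pts.mean(axis=0)  # labels keep guiding every center
    return centers, assign

rng = np.random.default_rng(0)
X_lab = np.array([[0.0, 0.0], [4.0, 4.0]])     # one labeled example per class
y_lab = np.array([0, 1])
X_unlab = np.vstack([rng.normal(0, 0.3, (20, 2)),
                     rng.normal(4, 0.3, (20, 2))])
centers, assign = seeded_kmeans(X_unlab, X_lab, y_lab)
```

The refined centers then play the role of class prototypes when classifying query points.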
Learning to learn via Self-Critique
In few-shot learning, a machine learning system learns from a small set of
labelled examples relating to a specific task, such that it can generalize to
new examples of the same task. Given the limited availability of labelled
examples in such tasks, we wish to make use of all the information we can.
Usually a model learns task-specific information from a small training-set
(support-set) to predict on an unlabelled validation set (target-set). The
target-set contains additional task-specific information which is not utilized
by existing few-shot learning methods. Making use of the target-set examples
via transductive learning requires approaches beyond the current methods; at
inference time, the target-set contains only unlabelled input data-points, and
so discriminative learning cannot be used. In this paper, we propose a
framework called Self-Critique and Adapt or SCA, which learns to learn a
label-free loss function, parameterized as a neural network. A base-model
learns on a support-set using existing methods (e.g. stochastic gradient
descent combined with the cross-entropy loss), and then is updated for the
incoming target-task using the learnt loss function. This label-free loss
function is itself optimized such that the learnt model achieves higher
generalization performance. Experiments demonstrate that SCA offers
substantially reduced error-rates compared to baselines which only adapt on the
support-set, and results in state-of-the-art benchmark performance on
Mini-ImageNet and Caltech-UCSD Birds 200.
Comment: Accepted in NeurIPS 201
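A heavily simplified sketch of the two-phase idea: after the usual supervised update on the support set, the model takes an extra label-free step on the unlabelled target-set inputs. Here a fixed prediction-entropy penalty stands in for the learned critic (SCA *learns* this loss function as a neural network end-to-end; the linear model, numerical gradient, and all names below are illustrative assumptions):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def label_free_loss(logits):
    """Stand-in critic: mean prediction entropy over unlabelled inputs.
    SCA instead learns this function jointly with the base model."""
    p = softmax(logits)
    return float(-(p * np.log(p + 1e-9)).sum(axis=-1).mean())

def adapt_step(W, X_target, lr=0.1, eps=1e-5):
    """One label-free update of a linear classifier on target-set inputs
    (numerical gradient for brevity; in practice one would backpropagate)."""
    g = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        Wp, Wm = W.copy(), W.copy()
        Wp[idx] += eps
        Wm[idx] -= eps
        g[idx] = (label_free_loss(X_target @ Wp)
                  - label_free_loss(X_target @ Wm)) / (2 * eps)
    return W - lr * g

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4)) * 0.1     # base model after the support-set phase
X_t = rng.normal(size=(16, 3))        # unlabelled target-set inputs
before = label_free_loss(X_t @ W)
after = label_free_loss(X_t @ adapt_step(W, X_t))
```

The crucial difference from this sketch is that SCA meta-learns the critic so that the adapted model generalizes better, rather than hard-coding an entropy objective.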
Few-Shot Adaptation for Multimedia Semantic Indexing
We propose a few-shot adaptation framework, which bridges zero-shot learning
and supervised many-shot learning, for semantic indexing of image and video
data. Few-shot adaptation provides robust parameter estimation with few
training examples, by optimizing the parameters of zero-shot learning and
supervised many-shot learning simultaneously. In this method, we first build a
zero-shot detector and then update it using the few available examples. Our
experiments show the effectiveness of the proposed framework on three datasets:
TRECVID Semantic Indexing 2010, 2014, and ImageNET. On the ImageNET dataset, we
show that our method outperforms recent few-shot learning methods. On the
TRECVID 2014 dataset, we achieve 15.19% and 35.98% in Mean Average Precision
under the zero-shot condition and the supervised condition, respectively. To
the best of our knowledge, these are the best results on this dataset.
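One simple way to realize "optimizing the zero-shot and many-shot parameters simultaneously" is to treat the zero-shot detector as a prior that the few labeled examples update. The ridge-style sketch below is an illustrative formulation, not necessarily the paper's exact one:

```python
import numpy as np

def few_shot_adapt(w_zero, X_few, y_few, lam=1.0):
    """Update a zero-shot detector w_zero with a few labeled examples by
    ridge regression regularized toward the zero-shot solution; lam trades
    off the many-shot fit against the zero-shot prior (sketch)."""
    d = X_few.shape[1]
    A = X_few.T @ X_few + lam * np.eye(d)
    b = X_few.T @ y_few + lam * w_zero
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
d = 6
w_zero = rng.normal(size=d)            # detector obtained with no examples
X_few = rng.normal(size=(3, d))        # three labeled examples
y_few = X_few @ rng.normal(size=d)     # illustrative targets
w_adapted = few_shot_adapt(w_zero, X_few, y_few, lam=1.0)
```

As lam grows the estimate falls back to the zero-shot detector, which is what makes the parameter estimate robust when examples are scarce.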
A Simple Exponential Family Framework for Zero-Shot Learning
We present a simple generative framework for learning to predict previously
unseen classes, based on estimating class-attribute-gated class-conditional
distributions. We model each class-conditional distribution as an exponential
family distribution and the parameters of the distribution of each seen/unseen
class are defined as functions of the respective observed class attributes.
These functions can be learned using only the seen class data and can be used
to predict the parameters of the class-conditional distribution of each unseen
class. Unlike most existing methods for zero-shot learning that represent
classes as fixed embeddings in some vector space, our generative model
naturally represents each class as a probability distribution. It is simple to
implement and also allows leveraging additional unlabeled data from unseen
classes to improve the estimates of their class-conditional distributions using
transductive/semi-supervised learning. Moreover, it extends seamlessly to
few-shot learning by easily updating these distributions when provided with a
small number of additional labelled examples from unseen classes. Through a
comprehensive set of experiments on several benchmark data sets, we demonstrate
the efficacy of our framework.
Comment: Accepted in ECML-PKDD 2017, 16 pages; code and data are available:
https://github.com/vkverma01/Zero-Shot
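The core mechanism can be sketched with the simplest exponential-family member: model each class as an isotropic Gaussian whose mean is a linear function of its attribute vector, fit that map on seen classes only, and predict unseen-class distributions from attributes alone. All dimensions and the linear parameterization below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
d_attr, d_feat = 4, 6

# Synthetic "seen" classes: class means generated from attributes plus noise.
W_true = rng.normal(size=(d_attr, d_feat))
attrs_seen = rng.normal(size=(10, d_attr))
means_seen = attrs_seen @ W_true + 0.01 * rng.normal(size=(10, d_feat))

# Learn the attribute -> distribution-parameter map from seen classes only.
W_hat, *_ = np.linalg.lstsq(attrs_seen, means_seen, rcond=None)

# Zero-shot: the class-conditional of an unseen class comes from its attributes.
attr_unseen = rng.normal(size=(d_attr,))
mu_unseen = attr_unseen @ W_hat
# Classification is then maximum class-conditional likelihood, which under a
# shared isotropic covariance reduces to nearest predicted mean.
```

The few-shot extension mentioned in the abstract corresponds to updating these predicted distributions (e.g. their means) with the handful of labelled examples as they arrive.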
Hierarchically Structured Meta-learning
In order to learn quickly with few samples, meta-learning utilizes prior
knowledge learned from previous tasks. However, a critical challenge in
meta-learning is task uncertainty and heterogeneity, which cannot be handled
via globally sharing knowledge among tasks. In this paper, based on
gradient-based meta-learning, we propose a hierarchically structured
meta-learning (HSML) algorithm that explicitly tailors the transferable
knowledge to different clusters of tasks. Inspired by the way human beings
organize knowledge, we resort to a hierarchical task clustering structure to
cluster tasks. As a result, the proposed approach not only addresses the
challenge via the knowledge customization to different clusters of tasks, but
also preserves knowledge generalization within each cluster of similar tasks.
In addition, to handle changing task relationships, we extend the
hierarchical structure to a continual learning environment. The experimental
results show that our approach can achieve state-of-the-art performance in both
toy-regression and few-shot image classification problems.
Comment: ICML 2019; Errata: this version fixes the results of A1 in Table 1
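The knowledge-customization step can be sketched as a soft gate over task clusters: embed the incoming task, compute mixture weights over cluster centers, and tailor the globally shared initialization with cluster-specific offsets before the usual gradient-based adaptation. The gating form and all dimensions below are illustrative, not HSML's exact architecture:

```python
import numpy as np

def cluster_gate(task_emb, centers, temp=1.0):
    """Softly assign a task embedding to knowledge clusters; the weights
    gate how much of each cluster's knowledge the task receives (sketch)."""
    dist = ((task_emb[None, :] - centers) ** 2).sum(axis=-1)
    w = np.exp(-dist / temp)
    return w / w.sum()

def tailored_init(theta_global, cluster_deltas, gate_w):
    # Customize the globally shared initialization with a gated mixture of
    # cluster-specific parameter offsets, then adapt as usual (e.g. MAML).
    return theta_global + (gate_w[:, None] * cluster_deltas).sum(axis=0)

rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 4))       # 3 task clusters, 4-d task embeddings
task_emb = centers[1] + 0.01            # a task lying near cluster 1
w = cluster_gate(task_emb, centers)
theta = tailored_init(rng.normal(size=8), rng.normal(size=(3, 8)), w)
```

Similar tasks thus share a cluster's offset (generalization within the cluster) while dissimilar tasks receive different customizations.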
Recent Advances in Zero-shot Recognition
With the recent renaissance of deep convolutional neural networks, encouraging
breakthroughs have been achieved on supervised recognition tasks, where each
class has sufficient and fully annotated training data. However, scaling
recognition to a large number of classes with few or no training samples per
class remains an unsolved problem. One approach to
scaling up the recognition is to develop models capable of recognizing unseen
categories without any training instances, or zero-shot recognition/ learning.
This article provides a comprehensive review of existing zero-shot recognition
techniques, covering aspects ranging from representations and models to
datasets and evaluation settings. We also review related recognition tasks,
including one-shot and open-set recognition, which can be used as natural
extensions of zero-shot recognition when a limited number of class samples
becomes available or when zero-shot recognition is deployed in a real-world
setting. Importantly, we highlight the limitations of existing approaches and
point out future research directions in this exciting new research area.
Comment: accepted by IEEE Signal Processing Magazin
Adaptive Deep Kernel Learning
Deep kernel learning provides an elegant and principled framework for
combining the structural properties of deep learning algorithms with the
flexibility of kernel methods. By means of a deep neural network, we learn a
parametrized kernel operator that can be combined with a differentiable kernel
algorithm during inference. While previous work within this framework has
focused on learning a single kernel for large datasets, we learn a kernel
family for a variety of few-shot regression tasks. Compared to single deep
kernel learning, our algorithm enables the identification of the appropriate
kernel for each task during inference. As such, it is well adapted for complex
task distributions in a few-shot learning setting, which we demonstrate by
comparing against existing state-of-the-art algorithms using real-world,
few-shot regression tasks related to the field of drug discovery.
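The combination described can be sketched as a kernel whose feature map is a parameterized network, plugged into a differentiable kernel algorithm (here kernel ridge regression) on each few-shot support set. The tiny one-layer "network" and all names below are illustrative stand-ins:

```python
import numpy as np

def feature_map(X, W):
    """Tiny stand-in for the deep network: one linear layer plus tanh."""
    return np.tanh(X @ W)

def deep_kernel(Xa, Xb, W):
    # k(x, x') = <phi(x), phi(x')> with a learned, parameterized feature map.
    return feature_map(Xa, W) @ feature_map(Xb, W).T

def kernel_ridge_predict(X_sup, y_sup, X_qry, W, lam=1e-2):
    """Differentiable kernel algorithm at inference time: kernel ridge
    regression fitted on the few-shot support set."""
    K = deep_kernel(X_sup, X_sup, W)
    alpha = np.linalg.solve(K + lam * np.eye(len(X_sup)), y_sup)
    return deep_kernel(X_qry, X_sup, W) @ alpha

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 16)) * 0.5     # kernel parameters (a real net is deeper)
X_sup = rng.normal(size=(5, 3))        # few-shot support set
y_sup = rng.normal(size=5)
X_qry = rng.normal(size=(2, 3))
preds = kernel_ridge_predict(X_sup, y_sup, X_qry, W)
```

Learning a *family* of such kernels, as the abstract describes, would further condition W (or the choice among several W's) on the task at hand rather than fixing a single kernel for all tasks.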
Learning Robust Visual-Semantic Embeddings
Many existing methods for learning a joint embedding of images and text
use only supervised information from paired images and their textual attributes.
Taking advantage of the recent success of unsupervised learning in deep neural
networks, we propose an end-to-end learning framework that is able to extract
more robust multi-modal representations across domains. The proposed method
combines representation learning models (i.e., auto-encoders) together with
cross-domain learning criteria (i.e., Maximum Mean Discrepancy loss) to learn
joint embeddings for semantic and visual features. A novel technique of
unsupervised-data adaptation inference is introduced to construct more
comprehensive embeddings for both labeled and unlabeled data. We evaluate our
method on the Animals with Attributes and Caltech-UCSD Birds 200-2011 datasets with
a wide range of applications, including zero and few-shot image recognition and
retrieval, from inductive to transductive settings. Empirically, we show that
our framework improves over the current state of the art on many of the
considered tasks.
Comment: 12 page
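The cross-domain criterion named above, Maximum Mean Discrepancy, can be sketched directly; the RBF kernel and toy data below are illustrative (the paper's exact loss configuration may differ):

```python
import numpy as np

def rbf_mmd2(X, Y, gamma=1.0):
    """Biased estimate of squared Maximum Mean Discrepancy with an RBF
    kernel. Minimizing a criterion of this kind pulls two domains'
    feature distributions together."""
    def k(A, B):
        d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * d)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(50, 4))   # e.g. visual-feature embeddings
Y = rng.normal(2.0, 1.0, size=(50, 4))   # e.g. semantic-feature embeddings
# Identical samples give zero discrepancy; shifted distributions do not.
```

In the framework above this loss would be applied to the auto-encoder representations of the two modalities, so that minimizing it yields domain-robust joint embeddings.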