Hierarchical Meta Learning
Meta learning is a promising solution to few-shot learning problems. However,
existing meta learning methods are restricted to the scenarios where training
and application tasks share the same output structure. To obtain a meta model
applicable to the tasks with new structures, it is required to collect new
training data and repeat the time-consuming meta training procedure. This makes
them inefficient or even inapplicable in learning to solve heterogeneous
few-shot learning tasks. We thus develop a novel and principled
Hierarchical Meta Learning (HML) method. Unlike existing methods that
only focus on optimizing the adaptability of a meta model to similar tasks, HML
also explicitly optimizes its generalizability across heterogeneous tasks. To
this end, HML first factorizes a set of similar training tasks into
heterogeneous ones and trains the meta model over them at two levels to
maximize adaptation and generalization performance respectively. The resultant
model can then directly generalize to new tasks. Extensive experiments on
few-shot classification and regression problems clearly demonstrate the
superiority of HML over fine-tuning and state-of-the-art meta learning
approaches in terms of generalization across heterogeneous tasks.
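A minimal sketch of the kind of two-level meta-training loop the abstract describes, written against a toy 1-D regression problem. The grouping of tasks into heterogeneous sets, the model, and the learning rates are illustrative assumptions, not the authors' actual procedure:

    import torch

    def make_task(slope):
        # toy 1-D regression task: y = slope * x, with 10 sampled points
        x = torch.randn(10, 1)
        return x, slope * x

    model = torch.nn.Linear(1, 1)
    meta_opt = torch.optim.SGD(model.parameters(), lr=1e-2)
    inner_lr = 0.05

    for step in range(100):
        meta_opt.zero_grad()
        # "heterogeneous" task groups: two disjoint slope ranges (assumption)
        for group in ([0.5, 1.0], [3.0, 4.0]):
            for slope in group:
                x, y = make_task(slope)
                # level 1: one inner adaptation step (adaptation objective)
                loss = torch.nn.functional.mse_loss(model(x), y)
                grads = torch.autograd.grad(loss, list(model.parameters()), create_graph=True)
                adapted = [p - inner_lr * g for p, g in zip(model.parameters(), grads)]
                # level 2: generalization objective on fresh data from the same task
                xq, yq = make_task(slope)
                q_loss = torch.nn.functional.mse_loss(xq @ adapted[0].t() + adapted[1], yq)
                q_loss.backward()
        meta_opt.step()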
Deep Multiple Instance Learning for Zero-shot Image Tagging
In line with the success of deep learning on traditional recognition problems,
several end-to-end deep models for zero-shot recognition have been proposed in
the literature. These models successfully predict a single unseen label
given an input image, but do not scale to cases where multiple unseen objects
are present. In this paper, we model this problem within the framework of
Multiple Instance Learning (MIL). To the best of our knowledge, we propose the
first end-to-end trainable deep MIL framework for the multi-label zero-shot
tagging problem. Due to its novel design, the proposed framework has several
interesting features: (1) Unlike previous deep MIL models, it does not use any
off-line procedure (e.g., Selective Search or EdgeBoxes) for bag generation.
(2) During test time, it can process any number of unseen labels given their
semantic embedding vectors. (3) Using only seen labels per image as weak
annotation, it can produce a bounding box for each predicted label. We
experiment with the NUS-WIDE dataset and achieve superior performance across
conventional, zero-shot and generalized zero-shot tagging tasks.
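For readers unfamiliar with the MIL framing, a minimal numpy sketch of the scoring idea only (not the paper's network): an image is a bag of region features, each region is projected into the word-vector space and compared with label embeddings, and max-pooling over regions yields image-level tag scores. All dimensions and the projection are stand-ins:

    import numpy as np

    rng = np.random.default_rng(0)
    instances = rng.normal(size=(5, 512))         # 5 region features for one image (the "bag")
    W = rng.normal(size=(512, 300)) * 0.01        # learned projection into word-vector space (assumed)
    label_embeddings = rng.normal(size=(8, 300))  # semantic vectors for any labels, seen or unseen

    projected = instances @ W                           # (5, 300)
    instance_scores = projected @ label_embeddings.T    # (5, 8) instance-label affinities
    bag_scores = instance_scores.max(axis=0)            # MIL max-pooling -> image-level tag scores
    top_tags = np.argsort(-bag_scores)[:3]              # predicted tags for the image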
TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning
Learning good feature embeddings for images often requires substantial
training data. As a consequence, in settings where training data is limited
(e.g., few-shot and zero-shot learning), we are typically forced to use a
generic feature embedding across various tasks. Ideally, we want to construct
feature embeddings that are tuned for the given task. In this work, we propose
Task-Aware Feature Embedding Networks (TAFE-Nets) to learn how to adapt the
image representation to a new task in a meta learning fashion. Our network is
composed of a meta learner and a prediction network. Based on a task input, the
meta learner generates parameters for the feature layers in the prediction
network so that the feature embedding can be accurately adjusted for that task.
We show that TAFE-Net is highly effective in generalizing to new tasks or
concepts and evaluate the TAFE-Net on a range of benchmarks in zero-shot and
few-shot learning. Our model matches or exceeds the state-of-the-art on all
tasks. In particular, our approach improves the prediction accuracy of unseen
attribute-object pairs by 4 to 15 points on the challenging visual
attribute-object composition task.
Comment: Accepted at CVPR 2019.
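A small sketch, under stated assumptions, of the hypernetwork-style idea the abstract describes: a meta learner maps a task embedding to the parameters of a feature layer in the prediction network, so the image embedding is adjusted per task. The dimensions and module layout are illustrative, not TAFE-Net's exact architecture:

    import torch
    import torch.nn as nn

    class TaskAwareEmbedder(nn.Module):
        def __init__(self, img_dim=256, task_dim=64, out_dim=128):
            super().__init__()
            # meta learner: produces weight and bias of a task-specific feature layer
            self.weight_gen = nn.Linear(task_dim, img_dim * out_dim)
            self.bias_gen = nn.Linear(task_dim, out_dim)
            self.img_dim, self.out_dim = img_dim, out_dim

        def forward(self, image_feat, task_emb):
            W = self.weight_gen(task_emb).view(self.out_dim, self.img_dim)
            b = self.bias_gen(task_emb)
            # prediction network's feature layer, parameterized by the meta learner
            return torch.relu(image_feat @ W.t() + b)

    net = TaskAwareEmbedder()
    task_feat = net(torch.randn(4, 256), torch.randn(64))  # 4 images embedded for one task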
A Simple Neural Attentive Meta-Learner
Deep neural networks excel in regimes with large amounts of data, but tend to
struggle when data is scarce or when they need to adapt quickly to changes in
the task. In response, recent work in meta-learning proposes training a
meta-learner on a distribution of similar tasks, in the hopes of generalization
to novel but related tasks by learning a high-level strategy that captures the
essence of the problem it is asked to solve. However, many recent meta-learning
approaches are extensively hand-designed, either using architectures
specialized to a particular application, or hard-coding algorithmic components
that constrain how the meta-learner solves the task. We propose a class of
simple and generic meta-learner architectures that use a novel combination of
temporal convolutions and soft attention; the former to aggregate information
from past experience and the latter to pinpoint specific pieces of information.
In the most extensive set of meta-learning experiments to date, we evaluate the
resulting Simple Neural AttentIve Learner (or SNAIL) on several
heavily-benchmarked tasks. On all tasks, in both supervised and reinforcement
learning, SNAIL attains state-of-the-art performance by significant margins.
Comment: ICLR 2018 version.
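A rough sketch of the two building blocks the abstract names, not the SNAIL reference implementation: a dilated causal temporal convolution that aggregates past experience, and a causally masked soft-attention layer that retrieves specific past entries. The layers are created inline and left untrained, and all sizes are assumptions:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def causal_attention(x, d_key=16):
        # x: (T, D) sequence of per-timestep features
        T, D = x.shape
        q, k, v = nn.Linear(D, d_key)(x), nn.Linear(D, d_key)(x), nn.Linear(D, D)(x)
        logits = q @ k.t() / d_key ** 0.5
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
        logits = logits.masked_fill(mask, float('-inf'))   # no attending to future timesteps
        return F.softmax(logits, dim=-1) @ v

    def causal_tc(x, dilation=2):
        # x: (T, D); dilated causal 1-D convolution over time
        T, D = x.shape
        conv = nn.Conv1d(D, D, kernel_size=2, dilation=dilation)
        padded = F.pad(x.t().unsqueeze(0), (dilation, 0))  # left-pad so the output stays causal
        return conv(padded).squeeze(0).t()

    seq = torch.randn(20, 32)                 # 20 timesteps of (input, label) features
    out = causal_attention(seq + causal_tc(seq))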
Diverse Few-Shot Text Classification with Multiple Metrics
We study few-shot learning in natural language domains. Compared to many
existing works that apply either metric-based or optimization-based
meta-learning to the image domain with low inter-task variance, we consider a more
realistic setting, where tasks are diverse. However, this setting imposes tremendous
difficulties on existing state-of-the-art metric-based algorithms, since a
single metric is insufficient to capture complex task variations in the natural
language domain. To alleviate the problem, we propose an adaptive metric
learning approach that automatically determines the best weighted combination
from a set of metrics obtained from meta-training tasks for a newly seen
few-shot task. Extensive quantitative evaluations on real-world sentiment
analysis and dialog intent classification datasets demonstrate that the
proposed method performs favorably against state-of-the-art few-shot learning
algorithms in terms of predictive accuracy. We make our code and data available
for further study.
Comment: NAACL 2018; 11+5 pages. arXiv admin note: text overlap with arXiv:1708.0791
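A hedged sketch of just the combination idea: several fixed metrics score a query against class prototypes, and a per-task softmax weighting favors the metrics that classify the support set well. The metrics here are generic cosine/Euclidean scorers and the weighting rule is a simplification, not the paper's learned procedure:

    import numpy as np

    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

    def neg_euclid(a, b):
        return -np.linalg.norm(a - b)

    metrics = [cosine, neg_euclid]

    def combine_scores(query, prototypes, support, support_labels):
        # weight each metric by its accuracy when classifying the support set (a simplification)
        accs = []
        for m in metrics:
            preds = [max(range(len(prototypes)), key=lambda c: m(x, prototypes[c])) for x in support]
            accs.append(np.mean([p == y for p, y in zip(preds, support_labels)]))
        w = np.exp(accs) / np.exp(accs).sum()              # softmax weights over metrics
        return sum(wi * np.array([m(query, p) for p in prototypes]) for wi, m in zip(w, metrics))

    rng = np.random.default_rng(0)
    protos = rng.normal(size=(3, 50))                      # 3 class prototypes for a new task
    print(combine_scores(rng.normal(size=50), protos, rng.normal(size=(6, 50)), [0, 1, 2, 0, 1, 2]))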
Multi-Label Zero-Shot Learning with Transfer-Aware Label Embedding Projection
Zero-shot learning transfers knowledge from seen classes to novel unseen
classes to reduce human labor of labelling data for building new classifiers.
Much effort on zero-shot learning, however, has focused on the standard
multi-class setting, while the more challenging multi-label zero-shot problem has
received limited attention. In this paper we propose a transfer-aware embedding
projection approach to tackle multi-label zero-shot learning. The approach
projects the label embedding vectors into a low-dimensional space to induce
better inter-label relationships and explicitly facilitate information transfer
from seen labels to unseen labels, while simultaneously learning a max-margin
multi-label classifier with the projected label embeddings. Auxiliary
information can be conveniently incorporated to guide the label embedding
projection to further improve label relation structures for zero-shot knowledge
transfer. We conduct experiments for zero-shot multi-label image
classification. The results demonstrate the efficacy of the proposed approach.
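A minimal sketch, assuming placeholder dimensions: label word vectors and image features are projected into a shared low-dimensional space, compatibility scores are dot products, and a pairwise max-margin (ranking) loss pushes every positive label above every negative one. This illustrates the general recipe only, not the paper's exact objective or its auxiliary-information term:

    import torch

    def multilabel_margin_loss(img_feat, label_emb, pos_idx, neg_idx, P, V, margin=1.0):
        # P: (300, 50) projects label embeddings; V: (512, 50) projects image features
        scores = (img_feat @ V) @ (label_emb @ P).t()      # per-label compatibility scores
        losses = [torch.clamp(margin - scores[p] + scores[n], min=0)
                  for p in pos_idx for n in neg_idx]
        return torch.stack(losses).mean()

    P = torch.randn(300, 50, requires_grad=True)
    V = torch.randn(512, 50, requires_grad=True)
    loss = multilabel_margin_loss(torch.randn(512), torch.randn(20, 300), [2, 5], [0, 7, 9], P, V)
    loss.backward()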
Adaptive Cross-Modal Few-Shot Learning
Metric-based meta-learning techniques have successfully been applied to
few-shot classification problems. In this paper, we propose to leverage
cross-modal information to enhance metric-based few-shot learning methods.
Visual and semantic feature spaces have different structures by definition. For
certain concepts, visual features might be richer and more discriminative than
text ones, while for others the inverse might be true. Moreover, when the
support from visual information is limited in image classification, semantic
representations (learned from unsupervised text corpora) can provide strong
prior knowledge and context to help learning. Based on these two intuitions, we
propose a mechanism that can adaptively combine information from both
modalities according to new image categories to be learned. Through a series of
experiments, we show that by this adaptive combination of the two modalities,
our model outperforms current uni-modality few-shot learning methods and
modality-alignment methods by a large margin on all benchmarks and few-shot
scenarios tested. Experiments also show that our model can effectively adjust
its focus on the two modalities. The improvement in performance is particularly
large when the number of shots is very small.
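A small illustrative sketch (not the authors' model) of the adaptive combination: each novel class is represented by a convex combination of its visual prototype and its word vector, with the mixing coefficient produced by a tiny gating network conditioned on the semantic side. Dimensions are assumptions:

    import torch
    import torch.nn as nn

    class AdaptiveFusion(nn.Module):
        def __init__(self, dim=128):
            super().__init__()
            self.gate = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())  # per-class lambda in (0, 1)
            self.to_visual = nn.Linear(300, dim)                        # map word vectors to the visual dim

        def forward(self, visual_proto, word_vec):
            sem = self.to_visual(word_vec)
            lam = self.gate(sem)                   # lean on semantics when visual support is weak
            return lam * visual_proto + (1 - lam) * sem

    fusion = AdaptiveFusion()
    class_reps = fusion(torch.randn(5, 128), torch.randn(5, 300))       # 5 novel classes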
Semi-supervised Domain Adaptation via Minimax Entropy
Contemporary domain adaptation methods are very effective at aligning feature
distributions of source and target domains without any target supervision.
However, we show that these techniques perform poorly when even a few labeled
examples are available in the target. To address this semi-supervised domain
adaptation (SSDA) setting, we propose a novel Minimax Entropy (MME) approach
that adversarially optimizes an adaptive few-shot model. Our base model
consists of a feature encoding network, followed by a classification layer that
computes the features' similarity to estimated prototypes (representatives of
each class). Adaptation is achieved by alternately maximizing the conditional
entropy of unlabeled target data with respect to the classifier and minimizing
it with respect to the feature encoder. We empirically demonstrate the
superiority of our method over many baselines, including conventional feature
alignment and few-shot methods, setting a new state of the art for SSDA.
Comment: Accepted to ICCV 2019; ICCV paper version.
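A schematic sketch of the minimax entropy step only, with toy networks and without the supervised loss on labeled data: the classifier is updated to increase the entropy of its predictions on unlabeled target features (moving its prototypes toward the target data), while the encoder is updated to decrease it (clustering target features around the prototypes):

    import torch
    import torch.nn.functional as F

    encoder = torch.nn.Linear(64, 32)
    classifier = torch.nn.Linear(32, 5)           # rows act as class prototypes
    opt_cls = torch.optim.SGD(classifier.parameters(), lr=0.1)
    opt_enc = torch.optim.SGD(encoder.parameters(), lr=0.1)

    def entropy(logits):
        p = F.softmax(logits, dim=1)
        return -(p * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

    unlabeled_target = torch.randn(16, 64)
    for step in range(10):
        # step A: maximize entropy w.r.t. the classifier
        opt_cls.zero_grad()
        (-entropy(classifier(encoder(unlabeled_target).detach()))).backward()
        opt_cls.step()
        # step B: minimize entropy w.r.t. the feature encoder
        opt_enc.zero_grad()
        entropy(classifier(encoder(unlabeled_target))).backward()
        opt_enc.step()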
AMS-SFE: Towards an Alignment of Manifold Structures via Semantic Feature Expansion for Zero-shot Learning
Zero-shot learning (ZSL) aims at recognizing unseen classes with knowledge
transferred from seen classes. This is typically achieved by exploiting a
semantic feature space (FS) shared by both seen and unseen classes, i.e.,
attributes or word vectors, as the bridge. However, because the training (seen)
and testing (unseen) data are mutually disjoint, existing ZSL methods
commonly suffer from the domain shift problem. To address this
issue, we propose a novel model called AMS-SFE. It considers the Alignment of
Manifold Structures via Semantic Feature Expansion. Specifically, we build an
autoencoder-based model to expand the semantic features and jointly align them
with an embedded manifold extracted from the visual feature space of the data. This
is the first attempt to align these two feature spaces by way of expanding semantic features.
Extensive experiments show the remarkable performance improvement of our model
compared with other existing methods.
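A loose sketch of the two stated ingredients only, with stand-in data: an autoencoder expands per-class attribute vectors into a larger semantic representation, and an extra alignment term pulls the expanded features toward an embedding taken from the visual feature space. The dimensions, the stand-in visual embedding, and the equal loss weighting are all assumptions:

    import torch
    import torch.nn as nn

    attrs = torch.randn(10, 85)              # per-class attribute vectors
    visual_embed = torch.randn(10, 128)      # per-class embedding from the visual space (stand-in)

    encoder = nn.Linear(85, 128)             # expands the semantic features
    decoder = nn.Linear(128, 85)
    opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

    for step in range(200):
        opt.zero_grad()
        expanded = encoder(attrs)
        recon_loss = nn.functional.mse_loss(decoder(expanded), attrs)   # autoencoder term
        align_loss = nn.functional.mse_loss(expanded, visual_embed)     # manifold-alignment term
        (recon_loss + align_loss).backward()
        opt.step()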
When Autonomous Systems Meet Accuracy and Transferability through AI: A Survey
With widespread applications of artificial intelligence (AI), the
capabilities of the perception, understanding, decision-making and control for
autonomous systems have improved significantly in the past years. When
autonomous systems consider the performance of accuracy and transferability,
several AI methods, like adversarial learning, reinforcement learning (RL) and
meta-learning, show their powerful performance. Here, we review the
learning-based approaches in autonomous systems from the perspectives of
accuracy and transferability. Accuracy means that a well-trained model shows
good results during the testing phase, in which the testing set shares the same
task or data distribution as the training set. Transferability means that
when a well-trained model is transferred to other testing domains, the accuracy
is still good. Firstly, we introduce some basic concepts of transfer learning
and then present some preliminaries of adversarial learning, RL and
meta-learning. Secondly, we review accuracy, transferability, or both to show
the advantages of adversarial learning, like generative
adversarial networks (GANs), in typical computer vision tasks in autonomous
systems, including image style transfer, image super-resolution, image
deblurring/dehazing/rain removal, semantic segmentation, depth estimation,
pedestrian detection and person re-identification (re-ID). Then, we further
review the performance of RL and meta-learning in autonomous systems, again from
the aspects of accuracy, transferability, or both, covering pedestrian
tracking, robot navigation and robotic manipulation. Finally, we discuss
several challenges and future topics for using adversarial learning, RL and
meta-learning in autonomous systems.