714 research outputs found
f-VAEGAN-D2: A Feature Generating Framework for Any-Shot Learning
When labeled training data is scarce, a promising data augmentation approach
is to generate visual features of unknown classes using their attributes. To
learn the class conditional distribution of CNN features, these models rely on
pairs of image features and class attributes. Hence, they can not make use of
the abundance of unlabeled data samples. In this paper, we tackle any-shot
learning problems i.e. zero-shot and few-shot, in a unified feature generating
framework that operates in both inductive and transductive learning settings.
We develop a conditional generative model that combines the strength of VAE and
GANs and in addition, via an unconditional discriminator, learns the marginal
feature distribution of unlabeled images. We empirically show that our model
learns highly discriminative CNN features for five datasets, i.e. CUB, SUN, AWA
and ImageNet, and establish a new state-of-the-art in any-shot learning, i.e.
inductive and transductive (generalized) zero- and few-shot learning settings.
We also demonstrate that our learned features are interpretable: we visualize
them by inverting them back to the pixel space and we explain them by
generating textual arguments of why they are associated with a certain label.Comment: Accepted at CVPR 201
Transductive Multi-View Zero-Shot Learning
(c) 2012. The copyright of this document resides with its authors.
It may be distributed unchanged freely in print or electronic forms
Transductive Multi-label Zero-shot Learning
Zero-shot learning has received increasing interest as a means to alleviate
the often prohibitive expense of annotating training data for large scale
recognition problems. These methods have achieved great success via learning
intermediate semantic representations in the form of attributes and more
recently, semantic word vectors. However, they have thus far been constrained
to the single-label case, in contrast to the growing popularity and importance
of more realistic multi-label data. In this paper, for the first time, we
investigate and formalise a general framework for multi-label zero-shot
learning, addressing the unique challenge therein: how to exploit multi-label
correlation at test time with no training data for those classes? In
particular, we propose (1) a multi-output deep regression model to project an
image into a semantic word space, which explicitly exploits the correlations in
the intermediate semantic layer of word vectors; (2) a novel zero-shot learning
algorithm for multi-label data that exploits the unique compositionality
property of semantic word vector representations; and (3) a transductive
learning strategy to enable the regression model learned from seen classes to
generalise well to unseen classes. Our zero-shot learning experiments on a
number of standard multi-label datasets demonstrate that our method outperforms
a variety of baselines.Comment: 12 pages, 6 figures, Accepted to BMVC 2014 (oral
- …