Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. The comprehensive problem-oriented review
of the advances in transfer learning with respect to the problem has not only
revealed the challenges in transfer learning for visual recognition, but also
the problems (e.g. eight of the seventeen problems) that have been scarcely
studied. This survey not only presents an up-to-date technical review for
researchers, but also a systematic approach and a reference for a machine
learning practitioner to categorise a real problem and to look up a
possible solution accordingly.
Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks
We propose a novel framework called Semantics-Preserving Adversarial
Embedding Network (SP-AEN) for zero-shot visual recognition (ZSL), where test
images and their classes are both unseen during training. SP-AEN aims to tackle
the inherent problem of semantic loss in the prevailing family of
embedding-based ZSL, where some semantics would be discarded during training if
they are non-discriminative for training classes, but could become critical for
recognizing test classes. Specifically, SP-AEN prevents the semantic loss by
introducing an independent visual-to-semantic space embedder which disentangles
the semantic space into two subspaces for the two arguably conflicting
objectives: classification and reconstruction. Through adversarial learning of
the two subspaces, SP-AEN can transfer the semantics from the reconstructive
subspace to the discriminative one, accomplishing the improved zero-shot
recognition of unseen classes. Compared with prior work, SP-AEN not only
improves classification but also generates photo-realistic images, demonstrating
the effectiveness of semantic preservation. On four popular benchmarks (CUB,
AWA, SUN and aPY), SP-AEN considerably outperforms other state-of-the-art
methods by absolute differences of 12.2%, 9.3%, 4.0%, and
3.6%, respectively, in harmonic mean.
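The abstract does not define the harmonic mean it reports. In generalized zero-shot evaluation it is commonly taken as H = 2su / (s + u), where s and u are the accuracies on seen and unseen classes. The sketch below assumes that convention; the function name and the sample accuracies are hypothetical and are not taken from the paper.

```python
def harmonic_mean(seen_acc: float, unseen_acc: float) -> float:
    """Harmonic mean of seen- and unseen-class accuracy.

    Assumes the common generalized zero-shot convention H = 2*s*u / (s + u);
    the abstract above does not spell out its exact definition.
    """
    total = seen_acc + unseen_acc
    if total == 0.0:
        # Both accuracies are zero, so define H as 0 to avoid dividing by zero.
        return 0.0
    return 2.0 * seen_acc * unseen_acc / total


# Hypothetical accuracies, chosen only to illustrate the calculation.
h = harmonic_mean(0.70, 0.30)
print(f"H = {h:.2f}")
```

Because the harmonic mean is pulled towards the smaller of its two inputs, a model that scores well on seen classes but poorly on unseen ones gets a low H, which is why the measure is favoured over a plain average in this setting.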
f-VAEGAN-D2: A Feature Generating Framework for Any-Shot Learning
When labeled training data is scarce, a promising data augmentation approach
is to generate visual features of unknown classes using their attributes. To
learn the class conditional distribution of CNN features, these models rely on
pairs of image features and class attributes. Hence, they cannot make use of
the abundance of unlabeled data samples. In this paper, we tackle any-shot
learning problems, i.e. zero-shot and few-shot, in a unified feature generating
framework that operates in both inductive and transductive learning settings.
We develop a conditional generative model that combines the strengths of VAEs and
GANs and, in addition, via an unconditional discriminator, learns the marginal
feature distribution of unlabeled images. We empirically show that our model
learns highly discriminative CNN features for five datasets, i.e. CUB, SUN, AWA
and ImageNet, and establish a new state-of-the-art in any-shot learning, i.e.
inductive and transductive (generalized) zero- and few-shot learning settings.
We also demonstrate that our learned features are interpretable: we visualize
them by inverting them back to the pixel space and we explain them by
generating textual arguments of why they are associated with a certain label.
Comment: Accepted at CVPR 201