Discriminative Learning of Latent Features for Zero-Shot Recognition
Zero-shot learning (ZSL) aims to recognize unseen image categories by
learning an embedding space between image and semantic representations. For
years, existing works have centred on learning proper mapping matrices
aligning the visual and semantic spaces, whilst the importance of learning
discriminative representations for ZSL has been ignored. In this work, we
revisit existing methods and demonstrate the necessity of learning
discriminative representations for both visual and semantic instances of ZSL.
We propose an end-to-end network that is capable of 1) automatically
discovering discriminative regions by a zoom network; and 2) learning
discriminative semantic representations in an augmented space introduced for
both user-defined and latent attributes. Our proposed method is tested
extensively on two challenging ZSL datasets, and the experiment results show
that the proposed method significantly outperforms state-of-the-art methods.
Comment: CVPR 2018 (Oral)
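The embedding-space formulation this line of work builds on can be illustrated with a deliberately tiny sketch. Nothing below comes from the paper: the data is synthetic, the map is a plain least-squares fit standing in for a learned deep embedding, and all names are illustrative. The idea shown is only the shared baseline: project visual features into attribute space on seen classes, then label an unseen image by its nearest unseen-class attribute vector.

```python
import numpy as np

rng = np.random.default_rng(0)
n_attr, n_feat = 5, 10
attrs_seen = rng.random((8, n_attr))     # attribute vectors of seen classes
attrs_unseen = rng.random((2, n_attr))   # attribute vectors of unseen classes

# Hidden linear relation used only to simulate visual features.
W_true = rng.standard_normal((n_attr, n_feat))

def sample(class_attrs, n_per_class):
    """Draw noisy visual features for each class from its attribute vector."""
    X = np.vstack([a @ W_true + 0.05 * rng.standard_normal((n_per_class, n_feat))
                   for a in class_attrs])
    y = np.repeat(np.arange(len(class_attrs)), n_per_class)
    return X, y

# Learn a visual-to-semantic map V by least squares on seen-class data.
X_seen, y_seen = sample(attrs_seen, 30)
V, *_ = np.linalg.lstsq(X_seen, attrs_seen[y_seen], rcond=None)  # (n_feat, n_attr)

# Zero-shot prediction: embed a test image, pick the nearest unseen attribute.
X_test, y_test = sample(attrs_unseen, 25)
emb = X_test @ V
dists = ((emb[:, None, :] - attrs_unseen[None, :, :]) ** 2).sum(-1)
acc = (dists.argmin(axis=1) == y_test).mean()
print(f"zero-shot accuracy on unseen classes: {acc:.2f}")
```

The paper's contribution sits on top of this baseline: replacing the fixed map with discriminative visual regions (the zoom network) and augmenting the user-defined attribute space with learned latent attributes.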
Two-Level Adversarial Visual-Semantic Coupling for Generalized Zero-shot Learning
The performance of generative zero-shot methods mainly depends on the quality
of generated features and how well the model facilitates knowledge transfer
between visual and semantic domains. The quality of generated features is a
direct consequence of the ability of the model to capture the several modes of
the underlying data distribution. To address these issues, we propose a new
two-level joint maximization idea to augment the generative network with an
inference network during training which helps our model capture the several
modes of the data and generate features that better represent the underlying
data distribution. This provides strong cross-modal interaction for effective
transfer of knowledge between visual and semantic domains. Furthermore,
existing methods train the zero-shot classifier either on generated synthetic
image features or latent embeddings produced by leveraging representation
learning. In this work, we unify these paradigms into a single model which, in
addition to synthesizing image features, also utilizes the representation
learning capabilities of the inference network to provide discriminative
features for the final zero-shot recognition task. We evaluate our approach on
four benchmark datasets, i.e. CUB, FLO, AWA1, and AWA2, against several
state-of-the-art methods, and demonstrate its performance. We also perform ablation
studies to analyze and understand our method more carefully for the Generalized
Zero-shot Learning task. Comment: Under Submission
Generating Visual Representations for Zero-Shot Classification
This paper addresses the task of learning an image classifier when some
categories are defined by semantic descriptions only (e.g. visual attributes)
while the others are defined by exemplar images as well. This task is often
referred to as the Zero-Shot classification task (ZSC). Most of the previous
methods rely on learning a common embedding space in which visual features of
unknown categories can be compared with semantic descriptions. This paper argues
that these approaches are limited as i) efficient discriminative classifiers
cannot be used and ii) classification tasks with seen and unseen categories
(Generalized Zero-Shot Classification or GZSC) cannot be addressed efficiently.
In contrast, this paper proposes to address ZSC and GZSC by i) learning a
conditional generator using seen classes and ii) generating artificial training
examples for the categories without exemplars. ZSC is then turned into a
standard supervised learning problem. Experiments with four generative models
and five datasets validate the approach, giving state-of-the-art results on
both ZSC and GZSC.
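The generate-then-classify recipe the abstract describes can be sketched on toy data. Everything below is an assumption for illustration, not the paper's model: the "conditional generator" is a least-squares map from attributes to features with Gaussian noise, and a nearest-class-mean classifier stands in for any off-the-shelf supervised learner.

```python
import numpy as np

rng = np.random.default_rng(1)
n_attr, n_feat = 5, 10
attrs_seen = rng.random((8, n_attr))     # classes with exemplar images
attrs_unseen = rng.random((3, n_attr))   # classes described by attributes only

# Hidden mapping used only to simulate "real" images.
W_true = rng.standard_normal((n_attr, n_feat))

def real_samples(class_attrs, n_per_class):
    X = np.vstack([a @ W_true + 0.05 * rng.standard_normal((n_per_class, n_feat))
                   for a in class_attrs])
    y = np.repeat(np.arange(len(class_attrs)), n_per_class)
    return X, y

# 1) Fit a conditional generator on seen classes: here, a least-squares map
#    from attributes to visual features, with noise injected at sampling time.
X_seen, y_seen = real_samples(attrs_seen, 40)
G, *_ = np.linalg.lstsq(attrs_seen[y_seen], X_seen, rcond=None)  # (n_attr, n_feat)

# 2) Generate artificial training examples for the exemplar-free classes.
X_synth = np.vstack([a @ G + 0.05 * rng.standard_normal((50, n_feat))
                     for a in attrs_unseen])
y_synth = np.repeat(np.arange(len(attrs_unseen)), 50)

# 3) ZSC is now standard supervised learning on the synthetic set;
#    a nearest-class-mean classifier keeps the sketch short.
centroids = np.vstack([X_synth[y_synth == c].mean(0)
                       for c in range(len(attrs_unseen))])
X_test, y_test = real_samples(attrs_unseen, 30)
pred = ((X_test[:, None, :] - centroids[None]) ** 2).sum(-1).argmin(1)
acc = (pred == y_test).mean()
print(f"accuracy on real images of unseen classes: {acc:.2f}")
```

The paper's experiments swap step 1 for four different learned generative models; the point of the sketch is only the pipeline shape: condition on attributes, synthesize, then train a classifier as usual.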
Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classification
This paper addresses the task of zero-shot image classification. The key
contribution of the proposed approach is to control the semantic embedding of
images -- one of the main ingredients of zero-shot learning -- by formulating
it as a metric learning problem. The optimized empirical criterion associates
two types of sub-task constraints: metric discriminating capacity and accurate
attribute prediction. This results in a novel expression of zero-shot learning
not requiring the notion of class in the training phase: only pairs of
image/attributes, augmented with a consistency indicator, are given as ground
truth. At test time, the learned model can predict the consistency of a test
image with a given set of attributes, allowing flexible ways to produce
recognition inferences. Despite its simplicity, the proposed approach gives
state-of-the-art results on four challenging datasets used for zero-shot
recognition evaluation. Comment: in ECCV 2016, Oct 2016, Amsterdam, Netherlands.
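A minimal sketch of the pair-plus-consistency-indicator formulation, on synthetic data. The scoring rule (a logistic unit on a squared embedding distance) and the plain gradient-descent optimizer are my simplifications, not the paper's empirical criterion; the point shown is only that training needs no class notion, just (image, attributes, consistent?) triples.

```python
import numpy as np

rng = np.random.default_rng(2)
n_attr, n_feat, n = 4, 8, 400

# Hidden mapping used only to simulate image/attribute pairs.
W_true = rng.standard_normal((n_feat, n_attr))
X = rng.standard_normal((n, n_feat))
A = X @ W_true + 0.05 * rng.standard_normal((n, n_attr))

# Consistency indicator: half the pairs keep their matching attributes
# (y = 1); the other half get unrelated attribute vectors (y = 0).
y = (rng.random(n) < 0.5).astype(float)
neg = y == 0
A[neg] = np.sqrt(n_feat) * rng.standard_normal((int(neg.sum()), n_attr))

# Metric learning: fit an embedding W and bias b so that the squared
# distance between an embedded image and its attributes, through a
# logistic unit, predicts the consistency label.
W, b, lr = np.zeros((n_feat, n_attr)), 1.0, 0.01
for _ in range(1000):
    D = X @ W - A                       # residual in attribute space
    s = b - (D ** 2).sum(axis=1)        # high score = consistent pair
    p = 1.0 / (1.0 + np.exp(-s))
    g = p - y                           # dLoss/dscore for each pair
    W += 2.0 * lr * X.T @ (g[:, None] * D) / n   # logistic-loss gradient step
    b -= lr * g.mean()

# Evaluate consistency prediction with the final parameters.
D = X @ W - A
p = 1.0 / (1.0 + np.exp(-(b - (D ** 2).sum(axis=1))))
acc = ((p > 0.5) == (y == 1)).mean()
print(f"consistency-prediction accuracy: {acc:.2f}")
```

At test time the learned score plays exactly the role the abstract describes: given an image and any candidate attribute set, it returns a calibrated consistency estimate from which recognition decisions can be derived flexibly.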
Recent Advances in Transfer Learning for Cross-Dataset Visual Recognition: A Problem-Oriented Perspective
This paper takes a problem-oriented perspective and presents a comprehensive
review of transfer learning methods, both shallow and deep, for cross-dataset
visual recognition. Specifically, it categorises the cross-dataset recognition
into seventeen problems based on a set of carefully chosen data and label
attributes. Such a problem-oriented taxonomy has allowed us to examine how
different transfer learning approaches tackle each problem and how well each
problem has been researched to date. This comprehensive problem-oriented review
of the advances in transfer learning has not only revealed the challenges in
transfer learning for visual recognition, but also identified the problems
(eight of the seventeen) that have scarcely been studied. The survey thus
presents not only an up-to-date technical review for researchers, but also a
systematic approach and a reference for machine learning practitioners to
categorise a real problem and look up a possible solution accordingly.