115 research outputs found
Generalized Zero-Shot Learning via Synthesized Examples
We present a generative framework for generalized zero-shot learning where
the training and test classes are not necessarily disjoint. Built upon a
variational autoencoder based architecture, consisting of a probabilistic
encoder and a probabilistic conditional decoder, our model can generate novel
exemplars from seen/unseen classes, given their respective class attributes.
These exemplars can subsequently be used to train any off-the-shelf
classification model. One of the key aspects of our encoder-decoder
architecture is a feedback-driven mechanism in which a discriminator (a
multivariate regressor) learns to map the generated exemplars to the
corresponding class attribute vectors, leading to an improved generator. Our
model's ability to generate and leverage examples from unseen classes to train
the classification model naturally helps to mitigate the bias towards
predicting seen classes in generalized zero-shot learning settings. Through a
comprehensive set of experiments, we show that our model outperforms several
state-of-the-art methods, on several benchmark datasets, for both standard as
well as generalized zero-shot learning.Comment: Accepted in CVPR'1
Bidirectional mapping coupled GAN for generalized zero-shot learning
Bidirectional mapping-based generalized zero-shot learning (GZSL) methods rely on the quality of synthesized features to recognize seen and unseen data. Therefore, learning a joint distribution of seen-unseen classes and preserving the distinction between seen-unseen classes is crucial for GZSL methods. However, existing methods only learn the underlying distribution of seen data, although unseen class semantics are available in the GZSL problem setting. Most methods neglect retaining seen-unseen classes distinction and use the learned distribution to recognize seen and unseen data. Consequently, they do not perform well. In this work, we utilize the available unseen class semantics alongside seen class semantics and learn joint distribution through a strong visual-semantic coupling. We propose a bidirectional mapping coupled generative adversarial network (BMCoGAN) by extending the concept of the coupled generative adversarial network into a bidirectional mapping model. We further integrate a Wasserstein generative adversarial optimization to supervise the joint distribution learning. We design a loss optimization for retaining distinctive information of seen-unseen classes in the synthesized features and reducing bias towards seen classes, which pushes synthesized seen features towards real seen features and pulls synthesized unseen features away from real seen features. We evaluate BMCoGAN on benchmark datasets and demonstrate its superior performance against contemporary methods. © 1992-2012 IEEE
Resolving Semantic Confusions for Improved Zero-Shot Detection
Zero-shot detection (ZSD) is a challenging task where we aim to recognize and
localize objects simultaneously, even when our model has not been trained with
visual samples of a few target ("unseen") classes. Recently, methods employing
generative models like GANs have shown some of the best results, where
unseen-class samples are generated based on their semantics by a GAN trained on
seen-class data, enabling vanilla object detectors to recognize unseen objects.
However, the problem of semantic confusion still remains, where the model is
sometimes unable to distinguish between semantically-similar classes. In this
work, we propose to train a generative model incorporating a triplet loss that
acknowledges the degree of dissimilarity between classes and reflects them in
the generated samples. Moreover, a cyclic-consistency loss is also enforced to
ensure that generated visual samples of a class highly correspond to their own
semantics. Extensive experiments on two benchmark ZSD datasets - MSCOCO and
PASCAL-VOC - demonstrate significant gains over the current ZSD methods,
reducing semantic confusion and improving detection for the unseen classes.Comment: Accepted to BMVC 2022 (Oral). 15 pages, 5 figures. Project page:
https://github.com/sandipan211/ZSD-SC-Resolve
- …