Generative Multi-Label Zero-Shot Learning
Multi-label zero-shot learning strives to classify images into multiple
unseen categories for which no data is available during training. The test
samples can additionally contain seen categories in the generalized variant.
Existing approaches rely on learning either shared or label-specific attention
from the seen classes. Nevertheless, computing reliable attention maps for
unseen classes during inference in a multi-label setting is still a challenge.
In contrast, state-of-the-art single-label generative adversarial network (GAN)
based approaches learn to directly synthesize the class-specific visual
features from the corresponding class attribute embeddings. However,
synthesizing multi-label features with GANs remains unexplored in the context
of the zero-shot setting. In this work, we introduce different fusion approaches at
the attribute-level, feature-level and cross-level (across attribute and
feature-levels) for synthesizing multi-label features from their corresponding
multi-label class embedding. To the best of our knowledge, our work is the
first to tackle the problem of multi-label feature synthesis in the
(generalized) zero-shot setting. Comprehensive experiments are performed on
three zero-shot image classification benchmarks: NUS-WIDE, Open Images and MS
COCO. Our cross-level fusion-based generative approach outperforms the
state-of-the-art on all three datasets. Furthermore, we show the generalization
capabilities of our fusion approach in the zero-shot detection task on MS COCO,
achieving favorable performance against existing methods. The source code is
available at https://github.com/akshitac8/Generative_MLZSL.
Comment: 10 pages; source code is available at https://github.com/akshitac8/Generative_MLZSL
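The three fusion levels described in the abstract can be illustrated with a toy sketch. All names, dimensions, and the linear "generator" below are illustrative stand-ins, not the paper's actual GAN architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions and a toy "generator" (noise input omitted for clarity).
EMB_DIM, FEAT_DIM = 8, 16
W = rng.standard_normal((EMB_DIM, FEAT_DIM))

def generate(embedding):
    """Toy stand-in for a conditional generator mapping class embeddings to features."""
    return np.tanh(embedding @ W)

def attribute_level_fusion(label_embs):
    """Fuse the attribute embeddings first, then synthesize one multi-label feature."""
    fused_emb = label_embs.mean(axis=0)
    return generate(fused_emb)

def feature_level_fusion(label_embs):
    """Synthesize one feature per label, then fuse in feature space."""
    per_label_feats = np.stack([generate(e) for e in label_embs])
    return per_label_feats.mean(axis=0)

def cross_level_fusion(label_embs, alpha=0.5):
    """Combine both levels (a simple convex mix, purely for illustration)."""
    return (alpha * attribute_level_fusion(label_embs)
            + (1 - alpha) * feature_level_fusion(label_embs))

# Example: an image tagged with three classes, each with an attribute embedding.
embs = rng.standard_normal((3, EMB_DIM))
feat = cross_level_fusion(embs)
print(feat.shape)  # (16,)
```

Because the toy generator is nonlinear, fusing before generation (attribute-level) and fusing after generation (feature-level) genuinely produce different features, which is what makes combining the two levels meaningful.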
Generalized zero-shot learning using generated proxy unseen samples and entropy separation
Recent generative model-driven Generalized Zero-Shot Learning (GZSL) techniques overcome the prevailing bias of the model towards the seen classes by synthesizing visual samples of the unseen classes from the corresponding semantic prototypes. Although such approaches significantly improve GZSL performance through data augmentation, they violate the principal assumption of GZSL that semantic information about the unseen classes is unavailable during training. In this work, we propose to use a generative model (GAN) to synthesize visual proxy samples while strictly adhering to the standard GZSL assumptions. These proxy samples are generated by exploiting the early training regime of the GAN. We hypothesize that such proxy samples can effectively characterize the average entropy of the label distribution of samples from the unseen classes. We then train a classifier on the visual samples from the seen classes and the proxy samples using an entropy separation criterion, such that the average entropy of the label distribution is low for the seen-class samples and high for the proxy samples. This criterion generalizes well at test time, where samples from the unseen classes exhibit higher entropy than samples from the seen classes. Subsequently, low- and high-entropy samples are classified using supervised learning and ZSL, respectively, rather than GZSL. We show the superiority of the proposed method through experiments on the AWA1, CUB, HMDB51, and UCF101 datasets.
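The entropy separation idea above can be sketched as follows. The loss and the test-time routing rule are minimal illustrations under assumed names (`margin`, `threshold` are hypothetical hyperparameters, not taken from the paper):

```python
import numpy as np

def label_entropy(probs, eps=1e-12):
    """Shannon entropy of a predicted label distribution."""
    p = np.clip(probs, eps, 1.0)
    return -np.sum(p * np.log(p), axis=-1)

def entropy_separation_loss(seen_probs, proxy_probs, margin=1.0):
    """Push seen-class predictions toward low entropy and proxy (unseen-like)
    predictions toward high entropy (a hinge-style illustration)."""
    h_seen = label_entropy(seen_probs).mean()
    h_proxy = label_entropy(proxy_probs).mean()
    return h_seen + np.maximum(0.0, margin - h_proxy)

def route(sample_probs, threshold):
    """At test time, send high-entropy samples to the ZSL branch and
    low-entropy samples to the supervised seen-class branch."""
    return "zsl" if label_entropy(sample_probs) > threshold else "supervised"

conf = np.array([0.97, 0.01, 0.01, 0.01])   # confident prediction -> seen branch
flat = np.array([0.25, 0.25, 0.25, 0.25])   # near-uniform prediction -> ZSL branch
print(route(conf, threshold=0.7), route(flat, threshold=0.7))  # supervised zsl
```

The routing works because a uniform distribution over 4 classes has entropy ln 4 ≈ 1.39, well above a confident prediction's entropy (≈ 0.17 here), so a single threshold cleanly separates the two regimes.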