Fine-Grained Zero-Shot Learning: Advances, Challenges, and Prospects
Recent zero-shot learning (ZSL) approaches have integrated fine-grained
analysis, i.e., fine-grained ZSL, to mitigate the commonly known seen/unseen
domain bias and misaligned visual-semantics mapping problems, and have made
substantial progress. Notably, this paradigm differs from existing closed-set
fine-grained methods and, therefore, can pose unique and nontrivial challenges.
However, to the best of our knowledge, there remains a lack of systematic
summaries of this topic. To enrich the literature of this domain and provide a
sound basis for its future development, in this paper, we present a broad
review of recent advances for fine-grained analysis in ZSL. Concretely, we
first provide a taxonomy of existing methods and techniques with a thorough
analysis of each category. Then, we summarize the benchmarks, covering publicly
available datasets, models, implementations, and further details, collected as a
library. Last, we sketch out some related applications. In addition, we discuss
vital challenges and suggest potential future directions.
Comment: 9 pages, 1 figure, 4 tables
Generative Multi-Label Zero-Shot Learning
Multi-label zero-shot learning strives to classify images into multiple
unseen categories for which no data is available during training. The test
samples can additionally contain seen categories in the generalized variant.
Existing approaches rely on learning either shared or label-specific attention
from the seen classes. Nevertheless, computing reliable attention maps for
unseen classes during inference in a multi-label setting is still a challenge.
In contrast, state-of-the-art single-label generative adversarial network (GAN)
based approaches learn to directly synthesize the class-specific visual
features from the corresponding class attribute embeddings. However,
synthesizing multi-label features with GANs is still unexplored in the context
of the zero-shot setting. In this work, we introduce different fusion approaches at
the attribute level, the feature level, and the cross level (across the attribute and
feature levels) for synthesizing multi-label features from their corresponding
multi-label class embedding. To the best of our knowledge, our work is the
first to tackle the problem of multi-label feature synthesis in the
(generalized) zero-shot setting. Comprehensive experiments are performed on
three zero-shot image classification benchmarks: NUS-WIDE, Open Images and MS
COCO. Our cross-level fusion-based generative approach outperforms the
state-of-the-art on all three datasets. Furthermore, we show the generalization
capabilities of our fusion approach in the zero-shot detection task on MS COCO,
achieving favorable performance against existing methods. The source code is
available at https://github.com/akshitac8/Generative_MLZSL.
Comment: 10 pages, source code is available at
https://github.com/akshitac8/Generative_MLZS
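
The three fusion levels named in the abstract above can be illustrated with a toy sketch. This is not the authors' implementation: the generator here is a hypothetical stand-in (a fixed random linear map followed by a ReLU, where the paper uses a trained GAN), simple averaging is assumed as the fusion operator, and all function names are invented for illustration. The sketch only shows where fusion happens relative to feature synthesis.

```python
import random


def make_generator(attr_dim, feat_dim, seed=0):
    """Hypothetical stand-in for a trained generator: random linear map + ReLU."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(attr_dim)]
               for _ in range(feat_dim)]

    def generate(embedding):
        # Map one class embedding to one non-negative visual feature vector.
        return [max(0.0, sum(w * e for w, e in zip(row, embedding)))
                for row in weights]

    return generate


def mean_vectors(vectors):
    """Element-wise mean of equally sized vectors (assumed fusion operator)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def attribute_level_fusion(generate, class_embeddings):
    # Fuse the per-class embeddings first, then synthesize a single feature.
    return generate(mean_vectors(class_embeddings))


def feature_level_fusion(generate, class_embeddings):
    # Synthesize one feature per class, then fuse the synthesized features.
    return mean_vectors([generate(e) for e in class_embeddings])


def cross_level_fusion(generate, class_embeddings):
    # Combine the outputs of the two levels (here, by simple averaging).
    attribute_feature = attribute_level_fusion(generate, class_embeddings)
    feature_feature = feature_level_fusion(generate, class_embeddings)
    return mean_vectors([attribute_feature, feature_feature])


if __name__ == "__main__":
    generator = make_generator(attr_dim=3, feat_dim=4)
    # Two labels present in one image, each with a toy 3-d class embedding.
    labels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    print(attribute_level_fusion(generator, labels))
    print(feature_level_fusion(generator, labels))
    print(cross_level_fusion(generator, labels))
```

Because the stand-in generator is nonlinear, fusing before synthesis and fusing after synthesis generally give different features; for an image with a single label the three variants coincide.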