Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning
Generalized Zero-Shot Learning (GZSL) identifies unseen categories by
knowledge transferred from the seen domain, relying on the intrinsic
interactions between visual and semantic information. Prior works mainly
localize regions corresponding to shared attributes. However, when various
visual appearances correspond to the same attribute, these shared attributes
inevitably introduce semantic ambiguity, hampering the exploration of accurate
semantic-visual interactions. In this paper, we propose a dual semantic-visual
transformer module (DSVTM) to progressively model the correspondences between
attribute prototypes and visual features, constituting a progressive
semantic-visual mutual adaption (PSVMA) network for semantic disambiguation and
knowledge transferability improvement. Specifically, DSVTM devises an
instance-motivated semantic encoder that learns instance-centric prototypes
adapted to each image, recasting unmatched semantic-visual pairs into matched
ones. Then, a semantic-motivated instance decoder
strengthens accurate cross-domain interactions between the matched pair for
semantic-related instance adaption, encouraging the generation of unambiguous
visual representations. Moreover, to mitigate the bias towards seen classes in
GZSL, a debiasing loss is proposed to pursue response consistency between seen
and unseen predictions. PSVMA consistently yields superior performance
against other state-of-the-art methods. Code will be available at:
https://github.com/ManLiuCoder/PSVMA
Comment: Accepted by CVPR 2023
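As an illustration of the instance-motivated adaption described above, the following PyTorch sketch shows shared attribute prototypes attending to per-image visual features to produce instance-centric prototypes. The layer sizes, attention layout, and all names are assumptions for illustration, not the paper's exact DSVTM architecture.

```python
import torch
import torch.nn as nn

class InstanceMotivatedSemanticEncoder(nn.Module):
    """Sketch: shared attribute prototypes attend to the visual features
    of each image, producing instance-centric prototypes (hypothetical
    layout, not the paper's exact DSVTM design)."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, prototypes: torch.Tensor, visual_feats: torch.Tensor):
        # prototypes:   (batch, num_attributes, dim) shared attribute prototypes
        # visual_feats: (batch, num_regions, dim)    flattened patch features
        adapted, _ = self.attn(prototypes, visual_feats, visual_feats)
        return self.norm(prototypes + adapted)  # instance-centric prototypes

# Hypothetical usage:
encoder = InstanceMotivatedSemanticEncoder(dim=256)
prototypes = torch.randn(2, 85, 256)    # e.g. 85 attributes, as in AwA2
visual_feats = torch.randn(2, 49, 256)  # a 7x7 feature map, flattened
instance_protos = encoder(prototypes, visual_feats)  # (2, 85, 256)
```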
Bidirectional mapping coupled GAN for generalized zero-shot learning
Bidirectional mapping-based generalized zero-shot learning (GZSL) methods rely on the quality of synthesized features to recognize seen and unseen data. Therefore, learning a joint distribution of seen and unseen classes, and preserving the distinction between them, is crucial for GZSL methods. However, existing methods only learn the underlying distribution of seen data, even though unseen class semantics are available in the GZSL problem setting. Most methods also neglect to retain the seen-unseen class distinction and use the learned distribution to recognize seen and unseen data; consequently, they do not perform well. In this work, we utilize the available unseen class semantics alongside the seen class semantics and learn a joint distribution through strong visual-semantic coupling. We propose a bidirectional mapping coupled generative adversarial network (BMCoGAN) by extending the concept of the coupled generative adversarial network into a bidirectional mapping model. We further integrate Wasserstein generative adversarial optimization to supervise the joint distribution learning. We design a loss that retains the distinctive information of seen and unseen classes in the synthesized features and reduces the bias towards seen classes: it pushes synthesized seen features towards real seen features and pulls synthesized unseen features away from them. We evaluate BMCoGAN on benchmark datasets and demonstrate its superior performance against contemporary methods.
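The push-pull behaviour described at the end of the abstract can be sketched as a simple feature-space objective. The squared-distance-to-centroid formulation and the hinge margin below are assumptions for illustration; the paper's actual loss terms may differ.

```python
import torch
import torch.nn.functional as F

def seen_unseen_distinction_loss(fake_seen: torch.Tensor,
                                 real_seen: torch.Tensor,
                                 fake_unseen: torch.Tensor,
                                 margin: float = 1.0) -> torch.Tensor:
    """Pull synthesized seen features towards the real seen centroid and
    push synthesized unseen features at least `margin` away from it."""
    center = real_seen.mean(dim=0)
    pull = ((fake_seen - center) ** 2).sum(dim=1).mean()
    dist_unseen = ((fake_unseen - center) ** 2).sum(dim=1)
    push = F.relu(margin - dist_unseen).mean()  # hinge on the unseen distance
    return pull + push

# Hypothetical usage with random stand-in features:
loss = seen_unseen_distinction_loss(
    fake_seen=torch.randn(16, 128),
    real_seen=torch.randn(16, 128),
    fake_unseen=torch.randn(16, 128),
)
```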
Dual Feature Augmentation Network for Generalized Zero-shot Learning
Zero-shot learning (ZSL) aims to infer novel classes without training samples
by transferring knowledge from seen classes. Existing embedding-based
approaches for ZSL typically employ attention mechanisms to locate attributes
on an image. However, these methods often ignore the complex entanglement among
different attributes' visual features in the embedding space. Additionally,
these methods employ a direct attribute prediction scheme for classification,
which does not account for the diversity of attributes in images of the same
category. To address these issues, we propose a novel Dual Feature Augmentation
Network (DFAN), which comprises two feature augmentation modules, one for
visual features and the other for semantic features. The visual feature
augmentation module explicitly learns attribute features and employs cosine
distance to separate them, thus enhancing attribute representation. In the
semantic feature augmentation module, we propose a bias learner to capture the
offset that bridges the gap between actual and predicted attribute values from
a dataset's perspective. Furthermore, we introduce two predictors to reconcile
the conflicts between local and global features. Experimental results on three
benchmarks demonstrate the marked advancement of our method compared to
state-of-the-art approaches. Our code is available at
https://github.com/Sion1/DFAN
Comment: Accepted to BMVC 2023
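A minimal sketch of the cosine-distance separation used in the visual feature augmentation module, under the assumption that it penalises pairwise cosine similarity between the learned attribute features; the exact objective in DFAN may differ.

```python
import torch
import torch.nn.functional as F

def attribute_separation_loss(attr_feats: torch.Tensor) -> torch.Tensor:
    """Penalise pairwise cosine similarity between learned attribute
    features so they stay separated in the embedding space.

    attr_feats: (num_attributes, dim) learned attribute features
    """
    normed = F.normalize(attr_feats, dim=-1)
    sim = normed @ normed.t()                # pairwise cosine similarity
    off_diag = sim - torch.eye(sim.size(0))  # drop self-similarity
    return off_diag.pow(2).mean()

# Hypothetical usage:
attr_feats = torch.randn(85, 256, requires_grad=True)
loss = attribute_separation_loss(attr_feats)
loss.backward()  # gradient step drives attribute features apart
```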
Generalized Zero Shot Learning For Medical Image Classification
In many real-world medical image classification settings we do not have
access to samples of all possible disease classes, while a robust system is
expected to give high performance in recognizing novel test data. We propose a
generalized zero-shot learning (GZSL) method that uses self-supervised learning
(SSL) for: 1) selecting anchor vectors of different disease classes; and 2)
training a feature generator. Our approach does not require class attribute
vectors which are available for natural images but not for medical images. SSL
ensures that the anchor vectors are representative of each class. SSL is also
used to generate synthetic features of unseen classes. Using a simpler
architecture, our method matches a state-of-the-art SSL-based GZSL method for
natural images and outperforms all methods for medical images. Our method is
adaptable enough to accommodate class attribute vectors when they are available
for natural images.
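A minimal sketch of anchor-vector selection under the assumption that each class anchor is the mean of that class's self-supervised embeddings; the paper's actual SSL-driven selection may be more involved.

```python
import torch

def class_anchor_vectors(features: torch.Tensor, labels: torch.Tensor,
                         num_classes: int) -> torch.Tensor:
    """One anchor per class: the mean of that class's SSL embeddings.

    features: (n, dim) self-supervised embeddings of training images
    labels:   (n,)     integer class labels in [0, num_classes)
    """
    anchors = torch.zeros(num_classes, features.size(1))
    for c in range(num_classes):
        anchors[c] = features[labels == c].mean(dim=0)  # assumes every class occurs
    return anchors

# Hypothetical usage:
feats = torch.randn(100, 512)         # e.g. embeddings from a SimCLR-style model
labels = torch.randint(0, 5, (100,))  # 5 seen disease classes
anchors = class_anchor_vectors(feats, labels, num_classes=5)  # (5, 512)
```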
Image-free Classifier Injection for Zero-Shot Classification
Zero-shot learning models achieve remarkable results on image classification
for samples from classes that were not seen during training. However, such
models must be trained from scratch with specialised methods: therefore, access
to a training dataset is required when the need for zero-shot classification
arises. In this paper, we aim to equip pre-trained models with zero-shot
classification capabilities without the use of image data. We achieve this with
our proposed Image-free Classifier Injection with Semantics (ICIS) that injects
classifiers for new, unseen classes into pre-trained classification models in a
post-hoc fashion without relying on image data. Instead, the existing
classifier weights and simple class-wise descriptors, such as class names or
attributes, are used. ICIS has two encoder-decoder networks that learn to
reconstruct classifier weights from descriptors (and vice versa), exploiting
(cross-)reconstruction and cosine losses to regularise the decoding process.
Notably, ICIS can be cheaply trained and applied directly on top of pre-trained
classification models. Experiments on benchmark ZSL datasets show that ICIS
produces unseen classifier weights that achieve strong (generalised) zero-shot
classification performance. Code is available at
https://github.com/ExplainableML/ImageFreeZSL
Comment: Accepted at ICCV 2023
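One direction of ICIS's descriptor-to-weight mapping can be sketched as follows. The two-layer MLP, the embedding dimensions, and the training snippet are assumptions for illustration; only the cosine loss mirrors the regularisation mentioned in the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DescriptorToWeightDecoder(nn.Module):
    """Map a class descriptor (e.g. an attribute vector or class-name
    embedding) to classifier weights; hypothetical two-layer MLP."""

    def __init__(self, descr_dim: int, weight_dim: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(descr_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, weight_dim),
        )

    def forward(self, descriptors: torch.Tensor) -> torch.Tensor:
        return self.net(descriptors)

def cosine_regression_loss(pred_w: torch.Tensor, true_w: torch.Tensor) -> torch.Tensor:
    # Cosine loss regularising the decoding, as mentioned in the abstract.
    return (1.0 - F.cosine_similarity(pred_w, true_w, dim=-1)).mean()

# Hypothetical usage: fit on seen-class weights, then inject unseen classes.
decoder = DescriptorToWeightDecoder(descr_dim=300, weight_dim=2048)
seen_descr, seen_w = torch.randn(40, 300), torch.randn(40, 2048)
loss = cosine_regression_loss(decoder(seen_descr), seen_w)  # training signal
unseen_w = decoder(torch.randn(10, 300))  # weights injected for unseen classes
```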