Self-Supervised Gait Encoding with Locality-Aware Attention for Person Re-Identification
Gait-based person re-identification (Re-ID) is valuable for safety-critical
applications, and using only 3D skeleton data to extract discriminative gait
features for person Re-ID is an emerging open topic. Existing methods either
adopt hand-crafted features or learn gait features by traditional supervised
learning paradigms. Unlike previous methods, we propose, for the first time, a
generic gait encoding approach that can utilize unlabeled skeleton data to
learn gait representations in a self-supervised manner. Specifically, we first
propose to introduce self-supervision by learning to reconstruct input skeleton
sequences in reverse order, which facilitates learning richer high-level
semantics and better gait representations. Second, inspired by the fact that
motion's continuity endows temporally adjacent skeletons with higher
correlations ("locality"), we propose a locality-aware attention mechanism that
encourages learning larger attention weights for temporally adjacent skeletons
when reconstructing the current skeleton, so as to learn locality when encoding
gait. Finally, we propose Attention-based Gait Encodings (AGEs), which are
built using context vectors learned by locality-aware attention, as final gait
representations. AGEs are directly utilized to realize effective person Re-ID.
Our approach typically improves on existing skeleton-based methods by 10-20%
in Rank-1 accuracy, and it achieves comparable or even superior performance to
multi-modal methods with extra RGB or depth information. Our code is
available at https://github.com/Kali-Hac/SGE-LA.
Comment: Accepted at IJCAI 2020 Main Track. Sole copyright holder is IJCAI.
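The two ideas in this abstract, reverse-order reconstruction targets and attention that favors temporally adjacent skeletons, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Gaussian distance penalty, and the `sigma` parameter are all assumptions chosen for illustration, and the "encoded" frames are just toy 2-D vectors.

```python
import math

def reverse_targets(sequence):
    """Self-supervised targets: the input skeleton sequence in reverse order."""
    return list(reversed(sequence))

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def locality_biased_attention(scores, current, sigma=1.0):
    """Attention weights over time steps, biased toward steps near `current`.

    The Gaussian distance penalty is an illustrative assumption; the paper
    describes its own locality-aware mechanism.
    """
    biased = [
        s - ((t - current) ** 2) / (2.0 * sigma ** 2)
        for t, s in enumerate(scores)
    ]
    return softmax(biased)

def context_vector(weights, encoded):
    """Weighted sum of encoded frames, one value per feature dimension."""
    dim = len(encoded[0])
    return [sum(w * h[d] for w, h in zip(weights, encoded)) for d in range(dim)]

# Toy sequence of four "skeleton" feature vectors (hypothetical data).
frames = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7]]
targets = reverse_targets(frames)

# With uniform raw scores, the bias alone concentrates weight around step 2.
weights = locality_biased_attention([0.0, 0.0, 0.0, 0.0], current=2)
context = context_vector(weights, frames)
```

In this sketch the context vector plays the role that the abstract assigns to the building blocks of AGEs; how the actual model aggregates context vectors into a final representation is not specified here.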
Attribute Prototype Network for Zero-Shot Learning
From the beginning of zero-shot learning research, visual attributes have
been shown to play an important role. In order to better transfer
attribute-based knowledge from known to unknown classes, we argue that an image
representation with integrated attribute localization ability would be
beneficial for zero-shot learning. To this end, we propose a novel zero-shot
representation learning framework that jointly learns discriminative global and
local features using only class-level attributes. While a visual-semantic
embedding layer learns global features, local features are learned through an
attribute prototype network that simultaneously regresses and decorrelates
attributes from intermediate features. We show that our locality-augmented
image representations achieve a new state-of-the-art on three zero-shot
learning benchmarks. As an additional benefit, our model points to the visual
evidence of the attributes in an image, e.g. for the CUB dataset, confirming
the improved attribute localization ability of our image representation.
Comment: NeurIPS 2020. The code is publicly available at https://wenjiaxu.github.io/APN-ZSL
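As a rough illustration of regressing attributes from intermediate features with prototypes, the sketch below scores each spatial location of a feature map against one prototype per attribute, takes the best-matching location as the attribute prediction, and compares the predictions with class-level attribute values. The dot-product similarity, max pooling, squared-error loss, and all names are assumptions made for this example rather than the paper's stated design, and the decorrelation term mentioned in the abstract is not shown.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def attribute_scores(feature_map, prototypes):
    """For each prototype, return (best similarity, (row, col) of that location)."""
    results = []
    for proto in prototypes:
        best_score, best_loc = float("-inf"), None
        for i, row in enumerate(feature_map):
            for j, feat in enumerate(row):
                score = dot(proto, feat)
                if score > best_score:
                    best_score, best_loc = score, (i, j)
        results.append((best_score, best_loc))
    return results

def regression_loss(predicted, target):
    """Mean squared error between predicted and class-level attribute values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

# Hypothetical 2x2 feature map with 2-D features and two attribute prototypes.
feature_map = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
prototypes = [[1.0, 0.0], [0.0, 1.0]]

scores = attribute_scores(feature_map, prototypes)
predicted = [s for s, _ in scores]
loss = regression_loss(predicted, [1.0, 1.0])
```

The location returned alongside each score is what would let such a model point to visual evidence for an attribute: here the first attribute is found at position (0, 0) and the second at (1, 1).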