Polarimetric Thermal to Visible Face Verification via Self-Attention Guided Synthesis
Polarimetric thermal to visible face verification entails matching two images
that contain significant domain differences. Several recent approaches have
attempted to synthesize visible faces from thermal images for cross-modal
matching. In this paper, we take a different approach: rather than
synthesizing only visible faces from thermal faces, we also propose to
synthesize thermal faces from visible faces. Our intuition is based on the
fact that thermal images also contain some discriminative information about the
person for verification. Deep features from a pre-trained Convolutional Neural
Network (CNN) are extracted from the original as well as the synthesized
images. These features are then fused to generate a template which is then used
for verification. The proposed synthesis network is based on the self-attention
generative adversarial network (SAGAN) which essentially allows efficient
attention-guided image synthesis. Extensive experiments on the ARL polarimetric
thermal face dataset demonstrate that the proposed method achieves
state-of-the-art performance.
Comment: This work is accepted at the 12th IAPR International Conference on Biometrics (ICB 2019).
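The abstract does not specify how the original and synthesized features are fused into a template. As one minimal sketch (an assumption, not the authors' implementation), the template could be the average of L2-normalized deep features, with verification by cosine similarity; the function names, threshold, and toy 4-D vectors below are all illustrative:

```python
import numpy as np

def l2_normalize(v):
    """Scale a feature vector to unit length."""
    return v / (np.linalg.norm(v) + 1e-8)

def fuse_template(feat_original, feat_synthesized):
    """Fuse deep features from the original and the synthesized image
    into a single template by summing their normalized embeddings and
    re-normalizing (equivalent, up to scale, to averaging)."""
    return l2_normalize(l2_normalize(feat_original) + l2_normalize(feat_synthesized))

def verify(template_a, template_b, threshold=0.5):
    """Cosine similarity between two fused templates decides the match."""
    score = float(np.dot(template_a, template_b))
    return score, score >= threshold

# Toy 4-D "CNN features" standing in for real embeddings.
probe = fuse_template(np.array([1.0, 0.0, 1.0, 0.0]),
                      np.array([0.9, 0.1, 1.1, 0.0]))
gallery = fuse_template(np.array([1.0, 0.1, 0.9, 0.0]),
                        np.array([1.0, 0.0, 1.0, 0.1]))
score, is_match = verify(probe, gallery)
```

In practice the features would come from the paper's pre-trained CNN rather than hand-written vectors, but the fuse-then-compare flow is the same.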
Hierarchy Composition GAN for High-fidelity Image Synthesis
Despite the rapid progress of generative adversarial networks (GANs) in image
synthesis in recent years, existing image synthesis approaches work in
either the geometry domain or the appearance domain alone, which often
introduces various synthesis artifacts. This paper presents an innovative Hierarchical
Composition GAN (HIC-GAN) that incorporates image synthesis in geometry and
appearance domains into an end-to-end trainable network and achieves superior
synthesis realism in both domains simultaneously. We design an innovative
hierarchical composition mechanism that is capable of learning realistic
composition geometry and handling occlusions while multiple foreground objects
are involved in image composition. In addition, we introduce a novel
attention-mask mechanism that guides the adaptation of foreground object
appearance and also provides a better training reference for learning in the
geometry domain. Extensive experiments on scene text image synthesis, portrait editing
and indoor rendering tasks show that the proposed HIC-GAN achieves superior
synthesis performance qualitatively and quantitatively.Comment: 11 pages, 8 figure
Manipulating Attributes of Natural Scenes via Hallucination
In this study, we explore building a two-stage framework for enabling users
to directly manipulate high-level attributes of a natural scene. The key to our
approach is a deep generative network which can hallucinate images of a scene
as if they were taken during a different season (e.g. winter), weather
condition (e.g. on a cloudy day), or time of day (e.g. at sunset). Once the
scene is hallucinated with the given attributes, the corresponding look is then
transferred to the input image while keeping the semantic details intact,
giving a photo-realistic manipulation result. As the proposed framework
hallucinates what the scene will look like, it does not require any reference
style image as commonly utilized in most of the appearance or style transfer
approaches. Moreover, it allows a given scene to be manipulated simultaneously
according to a diverse set of transient attributes within a single model,
eliminating the need to train a separate network for each translation task.
Our comprehensive set of qualitative and quantitative results demonstrates the
effectiveness of our approach against the competing methods.
Comment: Accepted for publication in ACM Transactions on Graphics.
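The second stage, transferring the hallucinated look onto the input while keeping semantics, can be illustrated with a deliberately simplified global statistic-matching step (an AdaIN-style re-normalization assumed here for illustration only; the paper's transfer operates on learned features, not raw pixels):

```python
import numpy as np

def transfer_look(content, hallucinated, eps=1e-5):
    """Shift each channel of the content image so its mean and std match
    those of the hallucinated target look, preserving spatial structure."""
    axes = (0, 1)
    c_mean, c_std = content.mean(axes), content.std(axes) + eps
    t_mean, t_std = hallucinated.mean(axes), hallucinated.std(axes) + eps
    return (content - c_mean) / c_std * t_std + t_mean

rng = np.random.default_rng(0)
content = rng.random((8, 8, 3))                    # toy input scene
hallucinated = rng.random((8, 8, 3)) * 0.5 + 0.25  # toy "sunset" hallucination
stylized = transfer_look(content, hallucinated)
```

After the transfer, the per-channel statistics of the result match the hallucinated target, while the relative spatial variation of the content (its structure) is preserved.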
CATFace: Cross-Attribute-Guided Transformer with Self-Attention Distillation for Low-Quality Face Recognition
Although face recognition (FR) has achieved great success in recent years, it
is still challenging to accurately recognize faces in low-quality images due to
the obscured facial details. Nevertheless, it is often feasible to predict
specific soft biometric (SB) attributes, such as gender and baldness, even
when dealing with low-quality images. In this paper, we propose a
novel multi-branch neural network that leverages SB attribute information to
boost the performance of FR. To this end, we propose a cross-attribute-guided
transformer fusion (CATF) module that effectively captures the long-range
dependencies and relationships between FR and SB feature representations. The
synergy created by the reciprocal flow of information in the dual
cross-attention operations of the proposed CATF module enhances the performance
of FR. Furthermore, we introduce a novel self-attention distillation framework
that effectively highlights crucial facial regions, such as landmarks, by
aligning low-quality images with their high-quality counterparts in
the feature space. The proposed self-attention distillation regularizes our
network to learn a unified quality-invariant feature representation in
unconstrained environments. We conduct extensive experiments on various FR
benchmarks varying in quality. Experimental results demonstrate the superiority
of our FR method compared to state-of-the-art FR approaches.
Comment: Accepted in IEEE Transactions on Biometrics, Behavior, and Identity
Science (T-BIOM), 202
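The distillation idea, aligning where a low-quality student attends with where a high-quality teacher attends, can be sketched as an L2 loss between spatial attention maps derived from feature tensors. The map construction (channel-averaged activation magnitude) and the loss form are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def attention_map(features):
    """Collapse a C x H x W feature tensor into a normalized spatial
    attention map by averaging activation magnitudes over channels."""
    a = np.abs(features).mean(axis=0)
    return a / (a.sum() + 1e-8)

def distillation_loss(student_feats_lq, teacher_feats_hq):
    """Squared L2 distance between the low-quality student's attention
    map and the high-quality teacher's map; minimizing it pushes the
    student to focus on the same facial regions (e.g. landmarks)."""
    diff = attention_map(student_feats_lq) - attention_map(teacher_feats_hq)
    return float((diff ** 2).sum())

rng = np.random.default_rng(1)
hq = rng.random((16, 7, 7))           # teacher features from a high-quality face
lq_aligned = hq.copy()                # perfectly aligned student features
lq_degraded = rng.random((16, 7, 7))  # mismatched student features
```

A perfectly aligned student incurs zero loss, while mismatched attention yields a positive penalty, which is the regularizing pressure toward a quality-invariant representation.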