Infrared face recognition: a comprehensive review of methodologies and databases
Automatic face recognition is an area with immense practical potential which
includes a wide range of commercial and law enforcement applications. Hence it
is unsurprising that it continues to be one of the most active research areas
of computer vision. Even after over three decades of intense research, the
state-of-the-art in face recognition continues to improve, benefitting from
advances in a range of different research fields such as image processing,
pattern recognition, computer graphics, and physiology. Systems based on
visible spectrum images, the most researched face recognition modality, have
reached a significant level of maturity with some practical success. However,
they continue to face challenges in the presence of illumination, pose and
expression changes, as well as facial disguises, all of which can significantly
decrease recognition accuracy. Amongst various approaches which have been
proposed in an attempt to overcome these limitations, the use of infrared (IR)
imaging has emerged as a particularly promising research direction. This paper
presents a comprehensive and timely review of the literature on this subject.
Our key contributions are: (i) a summary of the inherent properties of infrared
imaging which make this modality promising in the context of face recognition,
(ii) a systematic review of the most influential approaches, with a focus on
emerging common trends as well as key differences between alternative
methodologies, (iii) a description of the main databases of infrared facial
images available to the researcher, and lastly (iv) a discussion of the most
promising avenues for future research.
Comment: Pattern Recognition, 2014. arXiv admin note: substantial text overlap
with arXiv:1306.160
Automatic Image Registration in Infrared-Visible Videos using Polygon Vertices
In this paper, an automatic method is proposed to perform image registration
for pairs of visible and infrared video sequences containing multiple targets. In
multimodal image analysis such as image fusion systems, color and IR sensors are
placed close to each other and capture the same scene simultaneously, but the
videos are not properly aligned by default because of different fields of view,
image capturing information, working principles and other camera specifications.
Because the scenes are usually not planar, alignment needs to be performed
continuously by extracting relevant common information. In this paper, we
approximate the shape of the targets by polygons and use affine transformation
for aligning the two video sequences. After background subtraction, keypoints
on the contour of the foreground blobs are detected using the Discrete Curve
Evolution (DCE) technique. These keypoints are then described by the local shape at
each point of the obtained polygon. The keypoints are matched based on the
convexity of the polygon's vertices and the Euclidean distance between them. Only good
matches for each local shape polygon in a frame are kept. To achieve a global
affine transformation that maximises the overlap of infrared and visible
foreground pixels, the matched keypoints of each local shape polygon are stored
temporally in a buffer for a small number of frames. The transformation matrix is
evaluated at each frame using the temporal buffer, and the best matrix is selected based on
an overlapping ratio criterion. Our experimental results demonstrate that this
method can provide highly accurate registered images and that it outperforms a
previous related method.
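As a rough illustration of the registration step described in the abstract above, the following Python sketch fits an affine transformation to matched keypoints by least squares and scores an alignment with a foreground overlap ratio. The function names (`estimate_affine`, `overlap_ratio`), the use of NumPy, and the intersection-over-union definition of the overlap ratio are assumptions made for illustration; the abstract does not specify the paper's exact estimator or criterion.

```python
import numpy as np


def estimate_affine(src, dst):
    """Least-squares 2x3 affine matrix mapping src points onto dst points.

    src, dst: sequences of matched (x, y) keypoints, at least three pairs.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Homogeneous coordinates: each row is [x, y, 1].
    A = np.hstack([src, np.ones((len(src), 1))])
    # Solve A @ X = dst for X (shape 3x2); the affine matrix is its transpose.
    X, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return X.T


def overlap_ratio(mask_a, mask_b):
    """Intersection over union of two foreground masks (assumed criterion)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 0.0


# Hypothetical matched keypoints: dst is src scaled by 2 and shifted by (2, 3).
src = [(0, 0), (1, 0), (0, 1), (1, 1)]
dst = [(2, 3), (4, 3), (2, 5), (4, 5)]
M = estimate_affine(src, dst)  # approximately [[2, 0, 2], [0, 2, 3]]
```

In a pipeline of the kind the abstract outlines, one such matrix could be estimated per frame from the buffered matches, and the candidate with the highest overlap ratio between the warped infrared foreground and the visible foreground would be retained.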
CHITNet: A Complementary to Harmonious Information Transfer Network for Infrared and Visible Image Fusion
Current infrared and visible image fusion (IVIF) methods go to great lengths
to excavate complementary features and design complex fusion strategies, which
is extremely challenging. To this end, we rethink IVIF outside the box,
proposing a complementary to harmonious information transfer network (CHITNet).
It transfers complementary information into harmonious information, which
integrates both the shared and complementary features from the two modalities.
Specifically, to sidestep aggregating complementary information in
IVIF, we design a mutual information transfer (MIT) module to mutually
represent features from the two modalities, roughly transferring complementary
information into harmonious information. Then, a harmonious information acquisition
supervised by source image (HIASSI) module is devised to further ensure the
complementary to harmonious information transfer after MIT. Meanwhile, we also
propose a structure information preservation (SIP) module to guarantee that the
edge structure information of the source images can be transferred to the
fusion results. Moreover, a mutual promotion training paradigm (MPTP) with
interaction loss is adopted to facilitate better collaboration among MIT,
HIASSI and SIP. In this way, the proposed method is able to generate fused
images of higher quality. Extensive experimental results demonstrate the
superiority of our CHITNet over state-of-the-art algorithms in terms of visual
quality and quantitative evaluations.
Improving Misaligned Multi-modality Image Fusion with One-stage Progressive Dense Registration
Misalignments between multi-modality images pose challenges in image fusion,
manifesting as structural distortions and edge ghosts. Existing efforts
commonly resort to registering first and fusing later, typically employing two
cascaded stages for registration, i.e., coarse registration and fine
registration. Both stages directly estimate the respective target deformation
fields. In this paper, we argue that the separated two-stage registration is
not compact, and the direct estimation of the target deformation fields is not
accurate enough. To address these challenges, we propose a Cross-modality
Multi-scale Progressive Dense Registration (C-MPDR) scheme, which accomplishes
the coarse-to-fine registration exclusively using a one-stage optimization,
thus improving the fusion performance of misaligned multi-modality images.
Specifically, two pivotal components are involved, a dense Deformation Field
Fusion (DFF) module and a Progressive Feature Fine (PFF) module. The DFF
aggregates the predicted multi-scale deformation sub-fields at the current
scale, while the PFF progressively refines the remaining misaligned features.
Both work together to accurately estimate the final deformation fields. In
addition, we develop a Transformer-Conv-based Fusion (TCF) subnetwork that
considers local and long-range feature dependencies, allowing us to capture
more informative features from the registered infrared and visible images for
the generation of high-quality fused images. Extensive experimental analysis
demonstrates the superiority of the proposed method in the fusion of misaligned
cross-modality images.
IAIFNet: An Illumination-Aware Infrared and Visible Image Fusion Network
Infrared and visible image fusion (IVIF) is used to generate fusion images
with comprehensive features of both images, which is beneficial for downstream
vision tasks. However, current methods rarely consider the illumination
condition in low-light environments, and the targets in the fused images are
often not prominent. To address the above issues, we propose an
Illumination-Aware Infrared and Visible Image Fusion Network, named IAIFNet.
In our framework, an illumination enhancement network first estimates the
incident illumination maps of the input images. Afterwards, with the help of the
proposed adaptive differential fusion module (ADFM) and salient target aware
module (STAM), an image fusion network effectively integrates the salient
features of the illumination-enhanced infrared and visible images into a fusion
image of high visual quality. Extensive experimental results verify that our
method outperforms five state-of-the-art methods of fusing infrared and visible
images.
Comment: Submitted to IEE
On Person Authentication by Fusing Visual and Thermal Face Biometrics
Recognition algorithms that use data obtained by imaging faces in the thermal spectrum are promising in achieving invariance to extreme illumination changes that are often present in practice. In this paper we analyze the performance of a recently proposed face recognition algorithm that combines visual and thermal modalities by decision-level fusion. We examine (i) the effects of the proposed data preprocessing in each domain, (ii) the contribution of different types of features to improved recognition, and (iii) the importance of prescription glasses detection, in the context of both 1-to-N and 1-to-1 matching (recognition vs. verification performance). Finally, we discuss the significance of our results and, in particular, identify a number of limitations of the current state-of-the-art and propose promising directions for future research.