70 research outputs found
A novel multispectral and 2.5D/3D image fusion camera system for enhanced face recognition
The fusion of images from the visible and long-wave infrared (thermal) portions of the spectrum
produces images that have improved face recognition performance under varying lighting conditions.
This is because long-wave infrared images are the result of emitted, rather than reflected,
light and are therefore less sensitive to changes in ambient light. Similarly, 3D and 2.5D images
have also improved face recognition under varying pose and lighting. The opacity of glass to
long-wave infrared light, however, means that the presence of eyeglasses in a face image reduces
the recognition performance.
This thesis presents the design and performance evaluation of a novel camera system which is
capable of capturing spatially registered visible, near-infrared, long-wave infrared and 2.5D depth
video images via a common optical path requiring no spatial registration between sensors beyond
scaling for differences in sensor sizes. Experiments using a range of established face recognition
methods and multi-class SVM classifiers show that the fused output from our camera system not
only outperforms the single-modality images for face recognition, but also that the adaptive fusion
methods used produce consistent increases in recognition accuracy under varying pose and lighting,
and in the presence of eyeglasses.
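The adaptive fusion idea can be illustrated with a minimal per-pixel weighted fusion sketch; the contrast-based weighting below is an illustrative assumption, not the scheme used in the thesis:

```python
import numpy as np

def adaptive_fuse(visible, thermal, eps=1e-6):
    """Fuse registered visible and thermal images with per-pixel weights
    driven by local contrast, so each modality contributes more where it
    carries more information (illustrative weighting only)."""
    w_vis = np.abs(visible - visible.mean())   # contrast proxy, visible band
    w_thr = np.abs(thermal - thermal.mean())   # contrast proxy, thermal band
    total = w_vis + w_thr + eps                # eps avoids division by zero
    return (w_vis * visible + w_thr * thermal) / total

# Toy 4x4 grayscale frames in [0, 1] standing in for registered sensors
rng = np.random.default_rng(0)
vis = rng.random((4, 4))
thr = rng.random((4, 4))
fused = adaptive_fuse(vis, thr)
print(fused.shape)  # (4, 4)
```

Because the weights are non-negative and normalised per pixel, the fused image stays within the input value range, which matters when feeding it to a recogniser trained on ordinary intensity images.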
Deep Learning for Face Anti-Spoofing: A Survey
Face anti-spoofing (FAS) has lately attracted increasing attention due to its
vital role in securing face recognition systems from presentation attacks
(PAs). As more and more realistic PAs with novel types spring up, traditional
FAS methods based on handcrafted features become unreliable due to their
limited representation capacity. With the emergence of large-scale academic
datasets in the recent decade, deep learning based FAS achieves remarkable
performance and dominates this area. However, existing reviews in this field
mainly focus on the handcrafted features, which are outdated and uninspiring
for the progress of the FAS community. In this paper, to stimulate future research,
we present the first comprehensive review of recent advances in deep learning
based FAS. It covers several novel and insightful components: 1) besides
supervision with binary labels (e.g., '0' for bonafide vs. '1' for PAs), we also
investigate recent methods with pixel-wise supervision (e.g., pseudo depth
map); 2) in addition to traditional intra-dataset evaluation, we collect and
analyze the latest methods specially designed for domain generalization and
open-set FAS; and 3) besides commercial RGB camera, we summarize the deep
learning applications under multi-modal (e.g., depth and infrared) or
specialized (e.g., light field and flash) sensors. We conclude this survey by
emphasizing current open issues and highlighting potential prospects.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
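The pixel-wise supervision described in point 1 can be sketched as a combined objective; the loss shape, the weighting `lam` and the pseudo-depth target below are illustrative assumptions, not a specific published formulation:

```python
import numpy as np

def fas_loss(pred_depth, pred_score, true_depth, is_attack, lam=0.5):
    """Combined FAS loss: pixel-wise MSE against a pseudo-depth target
    plus binary cross-entropy on the spoof score (illustrative sketch)."""
    # Attacks are supervised with an all-zero depth map; bonafide faces
    # with their pseudo-depth (e.g. from a 3D face reconstruction).
    target = np.zeros_like(pred_depth) if is_attack else true_depth
    depth_loss = np.mean((pred_depth - target) ** 2)
    y = 1.0 if is_attack else 0.0
    p = np.clip(pred_score, 1e-7, 1 - 1e-7)
    bce = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return depth_loss + lam * bce

depth = np.full((32, 32), 0.4)    # model's predicted pseudo-depth map
pseudo = np.full((32, 32), 0.5)   # pseudo-depth target for a live face
loss_live = fas_loss(depth, pred_score=0.1, true_depth=pseudo, is_attack=False)
loss_attack = fas_loss(depth, pred_score=0.1, true_depth=pseudo, is_attack=True)
print(loss_attack > loss_live)  # mislabelled attack is penalised more
```

The appeal of pixel-wise targets over a single binary label is that every spatial location carries a supervision signal, which the survey credits with better-localised spoof cues.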
Bruise Detection in Apples Using 3D Infrared Imaging and Machine Learning Technologies
Bruise detection plays an important role in fruit grading. A bruise detection system capable of finding and removing damaged products on the production line will distinctly improve the quality of fruit for sale, and consequently the fruit economy. This dissertation presents a novel automatic detection system, based on surface information obtained from a 3D near-infrared imaging technique, for identifying bruised apples. The proposed 3D bruise detection system is expected to provide better bruise detection performance than existing 2D systems.
We first propose a mesh denoising filter to reduce noise while preserving the geometric features of the meshes. Compared with several existing mesh denoising filters, the proposed filter performs better at reducing noise while preserving bruised regions in 3D meshes of bruised apples. Next, we investigate two different machine learning techniques for identifying bruised apples. The first extracts hand-crafted features from 3D meshes and trains a predictive classifier on those features. It is shown that the predictive model trained on the proposed hand-crafted features outperforms the same models trained on several other local shape descriptors. The second applies deep learning to learn the feature representation automatically from the mesh data, and then uses the deep learning model or a new predictive model for classification. The optimized deep learning model achieves very high classification accuracy, outperforming the detection system based on the proposed hand-crafted features. Finally, we investigate GPU techniques for accelerating the proposed apple bruise detection system. Specifically, the dissertation proposes a GPU framework, implemented in CUDA, for accelerating the algorithm that extracts vertex-based local binary patterns. Experimental results show that the proposed GPU program speeds up the extraction of local binary patterns by a factor of 5 compared to a single-core CPU program.
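A vertex-based local binary pattern of the kind the GPU framework accelerates can be sketched as follows; the scalar surface values and the neighbourhood encoding are simplified assumptions, not the dissertation's exact descriptor:

```python
import numpy as np

def vertex_lbp(values, neighbors):
    """Vertex-based local binary pattern: for each vertex, compare a
    scalar surface value (e.g. curvature) at its ring neighbours against
    the centre vertex, then pack the comparisons into a binary code.
    `neighbors` maps vertex index -> ordered list of neighbour indices.
    (Simplified sketch; each vertex is independent, which is what makes
    the computation amenable to a CUDA kernel.)"""
    codes = []
    for v, nbrs in neighbors.items():
        bits = [1 if values[n] >= values[v] else 0 for n in nbrs]
        codes.append(sum(b << i for i, b in enumerate(bits)))
    return np.array(codes)

# Toy mesh: 4 vertices with scalar values and their 1-ring neighbourhoods
vals = np.array([0.2, 0.8, 0.5, 0.1])
ring = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
codes = vertex_lbp(vals, ring)
print(codes)  # [3 0 2 3]
```

Since each vertex's code depends only on its own neighbourhood, one GPU thread per vertex is a natural parallelisation, consistent with the reported 5x speedup over a single-core CPU.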
Novel pattern recognition methods for classification and detection in remote sensing and power generation applications
Driver Face Verification with Depth Maps
Face verification is the task of checking whether two provided images contain the face of the same person. In this work, we propose a fully-convolutional Siamese architecture to tackle this task, achieving state-of-the-art results on three publicly-released datasets, namely Pandora, the High-Resolution Range-based Face Database (HRRFaceD), and CurtinFaces. The proposed method takes depth maps as input, since depth cameras have proven more reliable under different illumination conditions. Thus, the system is able to work even with the total or partial absence of external light sources, which is a key feature for automotive applications. From the algorithmic point of view, we propose a fully-convolutional architecture with a limited number of parameters, capable of dealing with the small amount of depth data available for training and able to run in real time even on a CPU and on embedded boards. The experimental results show accuracy sufficient for exploitation in real-world applications with in-board cameras. Finally, exploiting the faces occluded by various head garments and the extreme head poses available in the Pandora dataset, we also successfully test the proposed system under strong visual occlusions. The excellent results obtained confirm the efficacy of the proposed method.
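The Siamese verification pipeline can be sketched at a high level; the random-projection `embed` below merely stands in for the trained fully-convolutional branch and is purely an illustrative assumption:

```python
import numpy as np

def embed(depth_map):
    """Stand-in for the convolutional embedding branch: a fixed random
    projection of the mean-centred depth map (illustrative only; the
    paper trains a fully-convolutional Siamese network instead)."""
    rng = np.random.default_rng(42)           # fixed weights, shared by both branches
    w = rng.standard_normal((depth_map.size, 16))
    x = depth_map.ravel() - depth_map.mean()  # crude normalisation
    v = x @ w
    return v / np.linalg.norm(v)

def distance(depth_a, depth_b):
    """Embedding distance between two depth maps; verification accepts a
    pair when this falls below a learned threshold."""
    return np.linalg.norm(embed(depth_a) - embed(depth_b))

rng = np.random.default_rng(0)
face = rng.random((16, 16))                         # enrolled depth map
same = face + 0.05 * rng.standard_normal((16, 16))  # noisy re-capture, same person
other = np.random.default_rng(1).random((16, 16))   # a different person
d_same, d_diff = distance(face, same), distance(face, other)
print(d_same < d_diff)  # genuine pair should be closer in embedding space
```

The key Siamese property is that both inputs pass through the same weights, so the learned metric generalises to identities never seen during training.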
Fingers micro-gesture recognition based on holoscopic 3D imaging system
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.
Micro-gesture recognition has been widely researched in recent years, with a particular focus on 3D
micro-gesture recognition, which consists of classifying the micro-gesture movements of the fingers
for touch-less control applications. The holoscopic 3D imaging system mimics the fly's-eye technique
to capture a true 3D scene that is rich in both texture and motion information. As a result,
holoscopic 3D imaging should be a suitable approach for robust recognition applications. This PhD
research focuses on innovative 3D micro-gesture recognition based on the holoscopic 3D system,
delivering robust, reliable and precise performance for 3D micro-gestures. This can also be applied
to a wide range of other applications such as the Internet of Things (IoT), AR/VR, robotics and
other touch-less interaction.
Due to the lack of a holoscopic 3D dataset, a comprehensive 3D micro-gesture dataset (HoMG),
including both holoscopic 3D images and videos, was prepared. It is a reasonably sized holoscopic
3D dataset, captured under different camera settings and conditions from 40 participants. An
initial 3D micro-gesture recognition approach is proposed based on 2D feature extraction methods
with basic classification methods, reaching a recognition accuracy of around 50.9%. For video-based
data, 3D feature extraction methods achieve 66.7% recognition accuracy, compared with 50.9% for
micro-gesture images in this initial investigation. The HoMG database hosted a challenge at the
IEEE International Conference on Automatic Face and Gesture Recognition 2018, where four groups
from international research institutes joined the challenge and contributed many new methods as
further development, and the proposed method was published.
An innovative micro-gesture 3D recognition system, further enriched by the holoscopic 3D dataset,
is proposed, and its performance is evaluated through like-for-like comparison with state-of-the-art
methods. In addition, a fast and efficient pre-processing algorithm for H3D images is presented to
extract the elemental images, together with a simplified viewpoint image extraction method. A
pre-trained CNN model with an attention mechanism is implemented on viewpoint (VP) images to
predict gesture probabilities, and the proposed approach is further improved using a voting
strategy. The proposed approach achieves 87% accuracy, outperforming all existing state-of-the-art
methods on the image-based database.
Advanced 3D micro-gesture recognition is then investigated on the sequence (video) database, where
an end-to-end model is used for effective H3D-based micro-gesture recognition. For the front-end
network, two methods have been used and evaluated: traditional viewpoint image extraction and a
novel pseudo viewpoint image extraction. The pseudo viewpoint (PVP) front-end is designed to help
deep learning networks understand the implied 3D information of the H3D imaging system, while the
viewpoint (VP) front-end follows the traditional H3D method of extracting and reconstructing
multi-viewpoint images. Both front-ends have been fed into four popular advanced deep networks for
learning and classification. These experiments evaluate the performance of 2D convolutional, 3D
convolutional, mixed 2D/3D convolutional and LSTM networks on the HoMG video database, which is
beneficial for applying deep learning to the H3D imaging system. Finally, to obtain higher
accuracies, majority voting is applied for further improvement. The final results show that the
performance is not only better than traditional methods, but also superior to existing deep
learning based approaches, which clearly demonstrates the effectiveness of the proposed approach.
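The majority-voting step used to fuse per-viewpoint predictions can be sketched directly; the gesture class labels below are hypothetical:

```python
from collections import Counter

def majority_vote(predictions):
    """Fuse per-viewpoint gesture predictions by majority vote
    (Counter breaks ties by first occurrence; illustrative sketch)."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical per-viewpoint class predictions for one gesture clip
preds = ["swipe_left", "swipe_left", "pinch", "swipe_left", "tap"]
print(majority_vote(preds))  # swipe_left
```

Voting across the many viewpoint images a holoscopic capture yields lets occasional per-view misclassifications be outvoted, which is consistent with the accuracy gains reported above.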
RGB-D And Thermal Sensor Fusion: A Systematic Literature Review
In the last decade, the computer vision field has seen significant progress
in multimodal data fusion and learning, where multiple sensors, including
depth, infrared, and visual, are used to capture the environment across diverse
spectral ranges. Despite these advancements, there has been no systematic and
comprehensive evaluation of fusing RGB-D and thermal modalities to date. While
autonomous driving using LiDAR, radar, RGB, and other sensors has garnered
substantial research interest, along with the fusion of RGB and depth
modalities, the integration of thermal cameras and, specifically, the fusion of
RGB-D and thermal data, has received comparatively less attention. This might
be partly due to the limited number of publicly available datasets for such
applications. This paper provides a comprehensive review of both
state-of-the-art and traditional methods used in fusing RGB-D and thermal
camera data for various applications, such as site inspection, human tracking,
fault detection, and others. The reviewed literature has been categorised into
technical areas, such as 3D reconstruction, segmentation, object detection,
available datasets, and other related topics. Following a brief introduction
and an overview of the methodology, the study delves into calibration and
registration techniques, then examines thermal visualisation and 3D
reconstruction, before discussing the application of classic feature-based
techniques as well as modern deep learning approaches. The paper concludes with
a discourse on current limitations and potential future research directions. It
is hoped that this survey will serve as a valuable reference for researchers
looking to familiarise themselves with the latest advancements and contribute
to the RGB-DT research field.
Comment: 33 pages, 20 figures
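The calibration and registration step the survey discusses typically reduces to back-projecting depth pixels to 3D and re-projecting them into the thermal view; the pinhole intrinsics and extrinsics below are illustrative values, not from any reviewed dataset:

```python
import numpy as np

def register_depth_to_thermal(u, v, depth, K_d, K_t, R, t):
    """Map a pixel (u, v) with metric depth from the depth camera into
    the thermal image: back-project with the depth intrinsics, apply the
    rigid depth-to-thermal transform, re-project with the thermal
    intrinsics (standard pinhole model)."""
    # Back-project to a 3D point in the depth camera frame.
    p_d = depth * np.linalg.inv(K_d) @ np.array([u, v, 1.0])
    # Rigid transform into the thermal camera frame.
    p_t = R @ p_d + t
    # Perspective projection with the thermal intrinsics.
    uvw = K_t @ p_t
    return uvw[:2] / uvw[2]

K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])  # shared toy intrinsics
R = np.eye(3)                   # assume aligned orientations
t = np.array([0.05, 0.0, 0.0])  # 5 cm horizontal baseline
uv_t = register_depth_to_thermal(320, 240, depth=2.0, K_d=K, K_t=K, R=R, t=t)
print(uv_t)  # [332.5 240. ] -- shifted by the baseline-induced disparity
```

The 12.5-pixel shift here is the classic disparity f*b/z (500 x 0.05 / 2.0), which is why registration quality degrades for near-field objects when the baseline between the thermal and depth sensors is large.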