Restrictive Voting Technique for Faces Spoofing Attack
Face anti-spoofing is a critical problem for biometric authentication systems that rely on facial recognition, as its goal is to prevent unauthorized access. In this paper, we propose a modified majority-voting scheme that ensembles the votes of six classifiers over multiple video chunks to improve the accuracy of face anti-spoofing. Our approach samples sub-videos of 2 seconds each with a one-second overlap and classifies each sub-video using multiple classifiers. We then ensemble the classifications for each sub-video across all classifiers to decide the classification of the complete video. We focus on the False Acceptance Rate (FAR) metric to highlight the importance of preventing unauthorized access. We evaluated our method on the Replay Attack dataset and achieved a zero FAR. We also report the Half Total Error Rate (HTER) and Equal Error Rate (EER), obtaining better results than most state-of-the-art methods. Our experimental results show that the proposed method significantly reduces the FAR, which is crucial for real-world face anti-spoofing applications.
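The chunking and voting procedure described above can be sketched as follows. The window/overlap parameters match the abstract (2 s windows, 1 s overlap); the specific "restrictive" decision rule shown here, accepting a video as genuine only if no classifier flags any chunk as an attack, is an assumption chosen to illustrate how such a rule drives FAR toward zero, not the paper's exact formulation.

```python
# Sketch of overlapping sub-video sampling plus a restrictive voting rule.
# Assumption: accept a video as genuine only if every classifier vote on
# every 2 s chunk is "genuine" -- a rule that minimizes false acceptances.

def chunk_indices(n_frames, fps, win_s=2, hop_s=1):
    """Start/end frame indices of overlapping sub-videos."""
    win, hop = win_s * fps, hop_s * fps
    return [(s, s + win) for s in range(0, n_frames - win + 1, hop)]

def restrictive_vote(votes):
    """votes: list over chunks of per-classifier votes, 1=genuine, 0=attack.
    Accept (return 1) only if no classifier flags any chunk as an attack."""
    return int(all(v == 1 for chunk in votes for v in chunk))

# usage: a 5 s video at 30 fps yields four overlapping 2 s chunks
idx = chunk_indices(n_frames=150, fps=30)
# idx -> [(0, 60), (30, 90), (60, 120), (90, 150)]
decision = restrictive_vote([[1, 1, 1], [1, 1, 1], [1, 0, 1], [1, 1, 1]])
# a single attack vote on the third chunk -> decision = 0 (reject)
```

A less strict variant could require only a supermajority of genuine votes, trading some FAR for a lower false rejection rate.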
Multi-hierarchical Convolutional Network for Efficient Remote Photoplethysmograph Signal and Heart Rate Estimation from Face Video Clips
Heart beat rhythm and heart rate (HR) are important physiological parameters
of the human body. This study presents an efficient multi-hierarchical
spatio-temporal convolutional network that can quickly estimate the remote
photoplethysmography (rPPG) signal and HR from face video clips. First, the facial
color distribution characteristics are extracted using a low-level face feature
Generation (LFFG) module. Then, the three-dimensional (3D) spatio-temporal
stack convolution module (STSC) and multi-hierarchical feature fusion module
(MHFF) are used to strengthen the spatio-temporal correlation of multi-channel
features. In the MHFF, sparse optical flow is used to capture the tiny motion
information of faces between frames and generate a self-adaptive region of
interest (ROI) skin mask. Finally, the signal prediction module (SP) is used to
extract the estimated rPPG signal. The experimental results on the three
datasets show that the proposed network outperforms the state-of-the-art
methods. Comment: 33 pages, 9 figures.
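The low-level face feature the pipeline starts from is the facial color distribution over time. A minimal sketch of that kind of input, a per-frame mean-color trace over a face region, is shown below; the learned modules (STSC, MHFF, SP) and the optical-flow ROI mask are not reproduced, and the fixed face box is a hypothetical stand-in for a detected face.

```python
import numpy as np

# Minimal sketch of the kind of low-level color feature an rPPG network
# consumes: per-frame mean color of a (hypothetical, fixed) face ROI.

def mean_color_signal(frames, roi):
    """frames: (T, H, W, 3) uint8 video; roi: (y0, y1, x0, x1) face box.
    Returns a (T, 3) trace of per-frame mean RGB values inside the ROI."""
    y0, y1, x0, x1 = roi
    patch = frames[:, y0:y1, x0:x1, :].astype(np.float64)
    return patch.mean(axis=(1, 2))

# usage on a synthetic constant-color clip
frames = np.full((5, 10, 10, 3), 128, dtype=np.uint8)
sig = mean_color_signal(frames, (2, 8, 2, 8))
# sig has shape (5, 3), every entry 128.0
```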
rPPG-Toolbox: Deep Remote PPG Toolbox
Camera-based physiological measurement is a fast growing field of computer
vision. Remote photoplethysmography (rPPG) utilizes imaging devices (e.g.,
cameras) to measure the peripheral blood volume pulse (BVP) via
photoplethysmography, and enables cardiac measurement via webcams and
smartphones. However, the task is non-trivial with important pre-processing,
modeling, and post-processing steps required to obtain state-of-the-art
results. Replication of results and benchmarking of new models is critical for
scientific progress; however, as with many other applications of deep learning,
reliable codebases are not easy to find or use. We present a comprehensive
toolbox, rPPG-Toolbox, that contains unsupervised and supervised rPPG models
with support for public benchmark datasets, data augmentation, and systematic
evaluation: \url{https://github.com/ubicomplab/rPPG-Toolbox}.
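To give a sense of what the unsupervised side of such a toolbox computes, here is an illustrative baseline in the spirit of classic green-channel rPPG methods: estimate HR from the dominant spectral peak of a color trace inside the plausible heart-rate band. This is a self-contained sketch, not the toolbox's API.

```python
import numpy as np

# Illustrative unsupervised HR estimate from a per-frame green-channel mean:
# zero-mean the trace, take its FFT, and pick the strongest peak inside a
# plausible heart-rate band. Band limits here are conventional assumptions.

def green_hr_estimate(green_trace, fps, lo=0.7, hi=3.0):
    """Return an HR estimate in bpm from a 1-D green-channel trace."""
    x = np.asarray(green_trace, dtype=np.float64)
    x = x - x.mean()                                  # remove DC component
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)      # Hz per FFT bin
    power = np.abs(np.fft.rfft(x)) ** 2
    band = (freqs >= lo) & (freqs <= hi)              # 42-180 bpm by default
    return 60.0 * freqs[band][np.argmax(power[band])]

# usage: a synthetic 1.2 Hz (72 bpm) pulse sampled at 30 fps for 10 s
t = np.arange(300) / 30.0
hr = green_hr_estimate(np.sin(2 * np.pi * 1.2 * t), fps=30)
# hr -> 72.0 bpm (1.2 Hz falls exactly on an FFT bin here)
```

Real pipelines add the pre- and post-processing steps the abstract mentions (face detection, detrending, bandpass filtering), which is precisely why a shared benchmark codebase matters.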
Video-based remote physiological measurement via cross-verified feature disentangling
Remote physiological measurements, e.g., remote photoplethysmography (rPPG) based heart rate (HR), heart rate variability (HRV), and respiration frequency (RF) measurement, are playing increasingly important roles in application scenarios where contact measurement is inconvenient or impossible. Since the amplitude of the physiological signals is very small, they can be easily affected by head movements, lighting conditions, and sensor diversities. To address these challenges, we propose a cross-verified feature disentangling strategy to disentangle the physiological features from non-physiological representations, and then use the distilled physiological features for robust multi-task physiological measurement. We first transform the input face videos into a multi-scale spatial-temporal map (MSTmap), which can suppress the irrelevant background and noise features while retaining most of the temporal characteristics of the periodic physiological signals. Then we take pairwise MSTmaps as inputs to an autoencoder architecture with two encoders (one for physiological signals and the other for non-physiological information) and use a cross-verified scheme to obtain physiological features disentangled from the non-physiological features. The disentangled features are finally used for the joint prediction of multiple physiological signals, such as average HR values and rPPG signals. Comprehensive experiments on different large-scale public datasets covering multiple physiological measurement tasks, as well as cross-database testing, demonstrate the robustness of our approach.
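The core of the MSTmap construction, stacking per-region mean colors over time so that background and noise are suppressed while periodic temporal structure is kept, can be sketched as below. The fixed region boxes are hypothetical placeholders; the paper derives multi-scale ROI combinations from facial landmarks.

```python
import numpy as np

# Hedged sketch of a spatial-temporal map: per-frame mean colors of several
# face regions stacked into a (regions, time, channels) array. Region boxes
# are hypothetical; the actual method builds them from facial landmarks at
# multiple scales.

def spatial_temporal_map(frames, rois):
    """frames: (T, H, W, 3) video; rois: list of (y0, y1, x0, x1) boxes.
    Returns an (len(rois), T, 3) map of per-region mean colors over time."""
    rows = []
    for y0, y1, x0, x1 in rois:
        patch = frames[:, y0:y1, x0:x1, :].astype(np.float64)
        rows.append(patch.mean(axis=(1, 2)))   # one (T, 3) row per region
    return np.stack(rows, axis=0)

# usage: two regions with different constant intensities
frames = np.zeros((4, 8, 8, 3), dtype=np.uint8)
frames[:, :4] = 10                             # top half brighter
mst = spatial_temporal_map(frames, [(0, 4, 0, 8), (4, 8, 0, 8)])
# mst has shape (2, 4, 3); row 0 is all 10.0, row 1 all 0.0
```

Pairs of such maps from the same video are then what the two-encoder autoencoder consumes for the cross-verified disentangling described above.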