181 research outputs found
Deep Ensemble Learning with Frame Skipping for Face Anti-Spoofing
Face presentation attacks, also known as spoofing attacks, pose a significant
threat to biometric systems that rely on facial recognition, such as access
control, mobile payments, and identity verification. To prevent
spoofing, several video-based methods have been presented in the
literature that analyze facial motion in successive video frames. However,
estimating the motion between adjacent frames is a challenging task and
requires high computational cost. In this paper, we reformulate the face
anti-spoofing task as a motion prediction problem and introduce a deep ensemble
learning model with a frame skipping mechanism. The proposed frame skipping is
based on a uniform sampling approach where the original video is divided into
fixed size video clips. In this way, every nth frame of the clip is selected to
ensure that the temporal patterns can easily be perceived during the training
of three different recurrent neural networks (RNNs). Motivated by the
performance of each RNN, a meta-model is developed to improve the overall
recognition performance by combining the predictions of the individual RNNs.
Extensive experiments were conducted on four datasets, and state-of-the-art
performance is reported for MSU-MFSD (3.12%), Replay-Attack (11.19%), and
OULU-NPU (12.23%) using half total error rate (HTER) in the most challenging
cross-dataset test scenario.
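The frame skipping step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `split_into_clips` and `skip_frames`, the choice to drop a trailing partial clip, and the example values of `clip_len` and `n` are all hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_clips(frames: Sequence[T], clip_len: int) -> List[Sequence[T]]:
    """Divide a video (a sequence of frames) into fixed-size clips.

    Assumption: clips do not overlap and a trailing partial clip is dropped.
    """
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    n_full = len(frames) // clip_len
    return [frames[i * clip_len:(i + 1) * clip_len] for i in range(n_full)]


def skip_frames(clip: Sequence[T], n: int) -> List[T]:
    """Uniformly sample a clip by keeping every nth frame, starting at the first."""
    if n <= 0:
        raise ValueError("n must be positive")
    return list(clip[::n])


# Frame indices stand in for real image data in this toy example.
video = list(range(20))
clips = split_into_clips(video, clip_len=8)
sampled = [skip_frames(clip, n=4) for clip in clips]
```

Each sampled clip would then be passed to the recurrent models; how the clips are encoded before that step is not specified in the abstract.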
Domain Generalization via Ensemble Stacking for Face Presentation Attack Detection
Face Presentation Attack Detection (PAD) plays a pivotal role in securing
face recognition systems against spoofing attacks. Although great progress has
been made in designing face PAD methods, developing a model that can generalize
well to unseen test domains remains a significant challenge. Moreover, due to
different types of spoofing attacks, creating a dataset with a sufficient
number of samples for training deep neural networks is a laborious task. This
work proposes a comprehensive solution that combines synthetic data generation
and deep ensemble learning to enhance the generalization capabilities of face
PAD. Specifically, synthetic data is generated by blending a static image with
spatiotemporal encoded images using alpha composition and video distillation.
This way, we simulate motion blur with varying alpha values, thereby generating
diverse subsets of synthetic data that contribute to a more enriched training
set. Furthermore, multiple base models are trained on each subset of synthetic
data using stacked ensemble learning. This allows the models to learn
complementary features and representations from different synthetic subsets.
The meta-features generated by the base models are used as input to a new model
called the meta-model. The latter combines the predictions from the base
models, leveraging their complementary information to better handle unseen
target domains and enhance the overall performance. Experimental results on
four datasets demonstrate low half total error rates (HTERs) on three benchmark
datasets: CASIA-MFSD (8.92%), MSU-MFSD (4.81%), and OULU-NPU (6.70%). The
approach shows potential for advancing presentation attack detection by
utilizing large-scale synthetic data and the meta-model.
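The alpha composition step mentioned in this abstract can be illustrated with standard linear blending. The sketch below is an assumption-laden example rather than the paper's pipeline: the function name `alpha_blend`, the convention that `alpha` weights the static image, and the specific alpha values are hypothetical, and the video distillation step that produces the spatiotemporal encoded image is not modelled.

```python
from typing import Dict, List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def alpha_blend(static_img: Image, encoded_img: Image, alpha: float) -> Image:
    """Blend two equally sized images: alpha * static + (1 - alpha) * encoded."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(static_img) != len(encoded_img):
        raise ValueError("images must have the same number of rows")
    blended = []
    for row_s, row_e in zip(static_img, encoded_img):
        if len(row_s) != len(row_e):
            raise ValueError("images must have the same number of columns")
        blended.append([alpha * s + (1.0 - alpha) * e for s, e in zip(row_s, row_e)])
    return blended


# One synthetic variant per alpha value (values chosen only for illustration).
static = [[0.0, 1.0]]
encoded = [[1.0, 1.0]]
subsets: Dict[float, Image] = {a: alpha_blend(static, encoded, a) for a in (0.25, 0.5, 0.75)}
```

Varying `alpha` in this way yields several blended versions of the same sample, which is one plausible reading of how diverse synthetic subsets are produced.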
Semi-Supervised learning for Face Anti-Spoofing using Apex frame
Conventional feature extraction techniques in the face anti-spoofing domain
either analyze the entire video sequence or focus on a specific segment to
improve model performance. However, identifying the optimal frames that provide
the most valuable input for face anti-spoofing remains a challenging task.
In this paper, we address this challenge by employing Gaussian weighting to
create apex frames for videos. Specifically, an apex frame is derived from a
video by computing a weighted sum of its frames, where the weights are
determined using a Gaussian distribution centered around the video's central
frame. Furthermore, we explore various temporal lengths to produce multiple
unlabeled apex frames using a Gaussian function, without the need for
convolution. By doing so, we leverage the benefits of semi-supervised learning,
which considers both labeled and unlabeled apex frames to effectively
discriminate between live and spoof classes. Our key contribution emphasizes
the apex frame's capacity to represent the most significant moments in the
video, while unlabeled apex frames facilitate efficient semi-supervised
learning, as they enable the model to learn from videos of varying temporal
lengths. Experimental results on four face anti-spoofing databases (CASIA,
REPLAY-ATTACK, OULU-NPU, and MSU-MFSD) demonstrate the apex frame's efficacy in
advancing face anti-spoofing techniques.
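The Gaussian-weighted apex frame described in this abstract can be sketched as a weighted sum of frames. The code below is a minimal illustration, not the authors' code: the names `gaussian_weights` and `apex_frame`, the `sigma` parameter, and the normalisation of the weights to sum to one are assumptions.

```python
import math
from typing import List

Frame = List[List[float]]  # grayscale frame as rows of pixel intensities


def gaussian_weights(num_frames: int, sigma: float) -> List[float]:
    """Gaussian weights centred on the middle frame, normalised to sum to 1."""
    if num_frames <= 0 or sigma <= 0:
        raise ValueError("num_frames and sigma must be positive")
    center = (num_frames - 1) / 2.0
    raw = [math.exp(-((t - center) ** 2) / (2.0 * sigma ** 2)) for t in range(num_frames)]
    total = sum(raw)
    return [w / total for w in raw]


def apex_frame(frames: List[Frame], sigma: float) -> Frame:
    """Compute the per-pixel weighted sum of equally sized frames."""
    weights = gaussian_weights(len(frames), sigma)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [sum(w * f[r][c] for w, f in zip(weights, frames)) for c in range(cols)]
        for r in range(rows)
    ]


# Five constant 1x2 frames: the apex frame should reproduce the constant value.
frames = [[[2.0, 2.0]] for _ in range(5)]
weights = gaussian_weights(5, sigma=1.0)
apex = apex_frame(frames, sigma=1.0)
```

Applying `apex_frame` to sub-sequences of different lengths would give several apex frames per video, which matches the abstract's description of exploring various temporal lengths.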
Face liveness detection by rPPG features and contextual patch-based CNN
Face anti-spoofing plays a vital role in security systems, including face payment systems and face recognition systems. Previous studies showed that live faces and presentation attacks differ significantly in both remote photoplethysmography (rPPG) and texture information. We propose a generalized method that exploits both rPPG and texture features for the face anti-spoofing task. First, we design multi-scale long-term statistical spectral (MS-LTSS) features with variant granularities to represent rPPG information. Second, a contextual patch-based convolutional neural network (CP-CNN) is used to extract global-local and multi-level deep texture features simultaneously. Finally, a weighted summation strategy is employed for decision-level fusion of the two types of features, which allows the proposed system to generalize to detecting not only print and replay attacks but also mask attacks. Comprehensive experiments were conducted on five databases, namely 3DMAD, HKBU-Mars V1, MSU-MFSD, CASIA-FASD, and OULU-NPU, to show the superior results of the proposed method compared with state-of-the-art methods.
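Decision-level fusion by weighted summation, as named in this abstract, can be sketched as a convex combination of two classifier scores. This is an illustrative example only: the function names, the assumption that both scores are liveness probabilities in [0, 1], the weight `w_rppg`, and the decision threshold are hypothetical and are not taken from the paper.

```python
def fuse_scores(rppg_score: float, texture_score: float, w_rppg: float = 0.5) -> float:
    """Weighted sum of two liveness scores; the weights sum to 1."""
    if not 0.0 <= w_rppg <= 1.0:
        raise ValueError("w_rppg must lie in [0, 1]")
    return w_rppg * rppg_score + (1.0 - w_rppg) * texture_score


def is_live(fused_score: float, threshold: float = 0.5) -> bool:
    """Classify as live when the fused score reaches the threshold."""
    return fused_score >= threshold


# Example: the rPPG branch is confident, the texture branch less so.
fused = fuse_scores(rppg_score=0.8, texture_score=0.4, w_rppg=0.25)
decision = is_live(fused, threshold=0.45)
```

In practice the weight and threshold would be tuned on a validation set; the abstract does not state how they were chosen.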
Replay detection in voice biometrics: an investigation of adaptive and non-adaptive front-ends
Among various physiological and behavioural traits, speech has gained popularity as an effective mode of biometric authentication. Even though they are gaining popularity, automatic speaker verification systems are vulnerable to malicious attacks, known as spoofing attacks. Among various types of spoofing attacks, replay attack poses the biggest threat due to its simplicity and effectiveness. This thesis investigates the importance of 1) improving front-end feature extraction via novel feature extraction techniques and 2) enhancing spectral components via adaptive front-end frameworks to improve replay attack detection.
This thesis initially focuses on AM-FM modelling techniques and their use in replay attack detection. A novel method to extract the sub-band frequency modulation (FM) component using the spectral centroid of a signal is proposed, and its use as a potential acoustic feature is also discussed. Frequency Domain Linear Prediction (FDLP) is explored as a method to obtain the temporal envelope of a speech signal. The temporal envelope carries amplitude modulation (AM) information of speech resonances. Several features are extracted from the temporal envelope and the FDLP residual signal. These features are then evaluated for replay attack detection and shown to have significant capability in discriminating genuine and spoofed signals. Fusion of AM and FM-based features has shown that AM and FM carry complementary information that helps distinguish replayed signals from genuine ones. The importance of frequency band allocation when creating filter banks is studied as well to further advance the understanding of front-ends for replay attack detection.
Mechanisms inspired by the human auditory system that make the human ear an excellent spectrum analyser have been investigated and integrated into front-ends. One of them is spatial differentiation, a mechanism that provides additional sharpening to auditory filters, which is used in this work to improve the selectivity of the sub-band decomposition filters. Two features are extracted using the improved filter bank front-end: spectral envelope centroid magnitude (SECM) and spectral envelope centroid frequency (SECF). These are used to establish the positive effect of spatial differentiation on discriminating spoofed signals. Level-dependent filter tuning, which allows the ear to handle a large dynamic range, is integrated into the filter bank to further improve the front-end. This mechanism converts the filter bank into an adaptive one in which the selectivity of the filters is varied based on the input signal energy. Experimental results show that this leads to improved spoofing detection performance.
Finally, deep neural network (DNN) mechanisms are integrated into sub-band feature extraction to develop an adaptive front-end that adjusts its characteristics based on the sub-band signals. A DNN-based controller that takes sub-band FM components as input is developed to adaptively control the selectivity and sensitivity of a parallel filter bank to enhance the artifacts that differentiate a replayed signal from a genuine signal. This work illustrates gradient-based optimization of a DNN-based controller using the feedback from a spoofing detection back-end classifier, thus training it to reduce spoofing detection error. The proposed framework has displayed a superior ability in identifying high-quality replayed signals compared to conventional non-adaptive frameworks.
All techniques proposed in this thesis have been evaluated on well-established databases on replay attack detection and compared with state-of-the-art baseline systems.
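The spectral centroid that several of the thesis's features build on can be illustrated with the generic textbook definition: the magnitude-weighted mean frequency of a signal's spectrum. The sketch below is not the thesis's sub-band FM, SECM or SECF extraction; the function names, the use of magnitude (rather than power) weighting, and the full-band rather than per-sub-band computation are assumptions made for illustration.

```python
import cmath
import math
from typing import List


def magnitude_spectrum(signal: List[float]) -> List[float]:
    """Magnitudes of the non-negative-frequency DFT bins (naive O(n^2) DFT)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(signal: List[float], sample_rate: float) -> float:
    """Magnitude-weighted mean frequency in Hz; 0.0 for an all-zero signal."""
    mags = magnitude_spectrum(signal)
    n = len(signal)
    freqs = [k * sample_rate / n for k in range(len(mags))]
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(f * m for f, m in zip(freqs, mags)) / total


# A pure 100 Hz tone sampled at 800 Hz falls exactly on one DFT bin,
# so its centroid should sit at (approximately) 100 Hz.
sample_rate = 800.0
tone = [math.sin(2 * math.pi * 100.0 * t / sample_rate) for t in range(64)]
centroid_hz = spectral_centroid(tone, sample_rate)
```

A sub-band variant would apply the same computation to the output of each filter in a filter bank instead of to the full-band signal.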
Intelligent Phishing Detection Scheme Using Deep Learning Algorithms
Purpose:
Phishing attacks have evolved in recent years due to high-tech-enabled economic growth worldwide. The rise in all types of fraud loss in 2019 has been attributed to the increase in deception scams and impersonation, as well as to sophisticated online attacks such as phishing. The global impact of phishing attacks will continue to intensify, and thus a more efficient phishing detection method is required to protect online user activities. To address this need, this study focussed on the design and development of a deep learning-based phishing detection solution that leveraged the uniform resource locator (URL) and website content such as images, text and frames.
Design/methodology/approach:
Deep learning techniques are efficient for natural language and image classification. In this study, the convolutional neural network (CNN) and the long short-term memory (LSTM) algorithm were used to build a hybrid classification model named the intelligent phishing detection system (IPDS). To build the proposed model, the CNN and LSTM classifiers were trained using 1 million URLs and over 10,000 images. Then, the sensitivity of the proposed model was determined by considering various factors such as the type of feature, number of misclassifications and split issues.
Findings:
An extensive experimental analysis was conducted to evaluate and compare the effectiveness of the IPDS in detecting phishing web pages and phishing attacks when applied to large data sets. The results showed that the model achieved an accuracy rate of 93.28% and an average detection time of 25 s.
Originality/value:
This research used a hybrid deep learning approach that combines the CNN and LSTM methods. The combination was used to address the problem of handling a large data set while achieving higher classifier prediction performance. Combining the two methods leads to a better result with less training time for the LSTM and CNN architecture, while using image, frame and text features as hybrid input for detection. To the best of the authors' knowledge, the hybrid features and the IPDS classifier for phishing detection are the novelty of this study.