    Spoof detection using time-delay shallow neural network and feature switching

    Detecting spoofed utterances is a fundamental problem in voice-based biometrics. Spoofing can be performed either through logical access, such as speech synthesis and voice conversion, or through physical access, such as replaying a pre-recorded utterance. Inspired by the state-of-the-art x-vector based speaker verification approach, this paper proposes a time-delay shallow neural network (TD-SNN) for spoof detection for both logical and physical access. The novelty of the proposed TD-SNN system vis-a-vis conventional DNN systems is that it can handle variable-length utterances during testing. The performance of the proposed TD-SNN systems and the baseline Gaussian mixture models (GMMs) is analyzed on the ASVspoof 2019 dataset and measured in terms of the minimum normalized tandem detection cost function (min-t-DCF). When studied with individual features, the TD-SNN system consistently outperforms the GMM system for physical access. For logical access, the GMM surpasses the TD-SNN systems for certain individual features. When combined with the decision-level feature switching (DLFS) paradigm, the best TD-SNN system outperforms the best baseline GMM system on evaluation data with relative improvements of 48.03% and 49.47% for logical and physical access, respectively.
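
    A minimal sketch (not the authors' code) of the idea behind such a time-delay network: dilated 1-D convolutions over frame-level features followed by statistics pooling, which is what lets one model score utterances of any length. All layer sizes below are illustrative assumptions.

    import torch
    import torch.nn as nn

    class TDSNNSketch(nn.Module):
        def __init__(self, feat_dim=40, hidden=256):
            super().__init__()
            # Time-delay layers = 1-D convolutions with growing dilation,
            # so each frame sees an increasingly wide temporal context.
            self.frame_layers = nn.Sequential(
                nn.Conv1d(feat_dim, hidden, kernel_size=5, dilation=1), nn.ReLU(),
                nn.Conv1d(hidden, hidden, kernel_size=3, dilation=2), nn.ReLU(),
                nn.Conv1d(hidden, hidden, kernel_size=3, dilation=3), nn.ReLU(),
            )
            self.classifier = nn.Linear(2 * hidden, 2)  # bonafide vs. spoofed

        def forward(self, x):              # x: (batch, feat_dim, n_frames)
            h = self.frame_layers(x)       # (batch, hidden, fewer frames)
            # Statistics pooling: mean and std over time yield a fixed-size
            # vector regardless of utterance length.
            stats = torch.cat([h.mean(dim=2), h.std(dim=2)], dim=1)
            return self.classifier(stats)

    model = TDSNNSketch()
    print(model(torch.randn(1, 40, 200)).shape)  # torch.Size([1, 2])
    print(model(torch.randn(1, 40, 537)).shape)  # same shape for a longer input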

    High-Performance Fake Voice Detection on Automatic Speaker Verification Systems for the Prevention of Cyber Fraud with Convolutional Neural Networks

    This study proposes a highly effective data analytics approach to prevent cyber fraud on automatic speaker verification systems by classifying histograms of genuine and spoofed voice recordings. Our deep learning-based lightweight architecture advances the application of fake voice detection on embedded systems. It sets a new benchmark with a balanced accuracy of 95.64% and an equal error rate of 4.43%, contributing to the adoption of artificial intelligence technologies in organizational systems. As fake voice-related fraud causes monetary damage and serious privacy concerns across applications, our approach improves the security of such services and is of high practical relevance. Furthermore, the post-hoc analysis of our results shows that our model confirms image texture analysis-related findings of prior studies and uncovers further voice signal features (i.e., textural and contextual) that can advance future work in this field.
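
    The abstract does not spell out how the histograms are built, so the sketch below assumes one plausible reading: a fixed-bin amplitude histogram per recording, classified by a small 1-D CNN. Bin count and layer sizes are illustrative, not the paper's configuration.

    import numpy as np
    import torch
    import torch.nn as nn

    def amplitude_histogram(waveform, bins=128):
        """Normalized histogram of sample amplitudes in [-1, 1]."""
        hist, _ = np.histogram(waveform, bins=bins, range=(-1.0, 1.0), density=True)
        return hist.astype(np.float32)

    # A lightweight CNN in the spirit of an embedded-friendly detector:
    lightweight_cnn = nn.Sequential(
        nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
        nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(2),
        nn.Flatten(),
        nn.Linear(32 * 32, 2),   # 128 bins -> 32 positions after two poolings
    )

    wave = np.random.uniform(-1, 1, 16000)      # stand-in for one recording
    x = torch.from_numpy(amplitude_histogram(wave)).view(1, 1, -1)
    print(lightweight_cnn(x).shape)             # torch.Size([1, 2])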

    Bridging the Spoof Gap: A Unified Parallel Aggregation Network for Voice Presentation Attacks

    Automatic Speaker Verification (ASV) systems are increasingly used in voice biometrics for user authentication but are susceptible to logical and physical spoofing attacks, posing security risks. Existing research mainly tackles logical or physical attacks separately, leaving a gap in unified spoofing detection. Moreover, when existing systems attempt to handle both types of attacks, they often exhibit significant disparities in Equal Error Rate (EER). To bridge this gap, we present a Parallel Stacked Aggregation Network that processes raw audio. Our approach employs a split-transform-aggregate technique, dividing utterances into convolved representations, applying transformations, and aggregating the results to identify logical access (LA) and physical access (PA) spoofing attacks. Evaluation on the ASVspoof 2019 and VSDC datasets shows the effectiveness of the proposed system: it outperforms state-of-the-art solutions, with reduced EER disparities and superior performance in detecting spoofing attacks, highlighting the method's generalizability. In a world increasingly reliant on voice-based security, our unified spoofing detection system provides a robust defense against a spectrum of voice spoofing attacks, safeguarding ASV systems and user data effectively.
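
    A minimal sketch of the split-transform-aggregate idea on raw audio, in the ResNeXt style the name suggests: channels are split into groups, each group is transformed independently, and the branch outputs are aggregated. Grouped 1-D convolutions express this compactly; the branch count and widths below are assumptions, not the paper's configuration.

    import torch
    import torch.nn as nn

    class SplitTransformAggregate1d(nn.Module):
        def __init__(self, channels=64, cardinality=8):
            super().__init__()
            self.block = nn.Sequential(
                nn.Conv1d(channels, channels, kernel_size=1),   # split / mix channels
                nn.Conv1d(channels, channels, kernel_size=3, padding=1,
                          groups=cardinality),                  # transform each branch
                nn.Conv1d(channels, channels, kernel_size=1),   # aggregate branches
            )

        def forward(self, x):
            return torch.relu(x + self.block(x))   # residual aggregation

    # A convolutional front end turns the raw waveform into convolved
    # representations before the aggregation blocks:
    front_end = nn.Conv1d(1, 64, kernel_size=251, stride=160)
    raw = torch.randn(2, 1, 16000)                  # 1 s of 16 kHz audio
    h = SplitTransformAggregate1d()(front_end(raw))
    print(h.shape)                                  # torch.Size([2, 64, 99])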

    Presentation Attack Detection in Facial Biometric Authentication

    Biometric systems are systems that enable recognizing an individual, or specifically a characteristic, using biometric data and mathematical algorithms. They are widely employed in various organizations and companies, mostly as authentication systems. Biometric authentication systems are usually much more secure than classic ones; however, they also have loopholes. Presentation attacks are attacks that spoof biometric systems or sensors. The presentation attacks covered in this project are photo attacks and deepfake attacks. In the case of photo attacks, an interactive action check such as eye blinking proves efficient in detecting liveness; the Convolutional Neural Network (CNN) model trained on the dataset gave 95% accuracy. In the case of deepfake attacks, deepfake videos and photos are generated by complex Generative Adversarial Networks (GANs) and are difficult for the human eye to identify. However, experiments showed that a comprehensive analysis in the frequency domain reveals many telltale artifacts in GAN-generated images, which makes it easier to separate these fake face images from real live faces. The project documents that, with frequency analysis, simple linear models as well as complex models give highly accurate results. The models are trained on StyleGAN-generated fake images, the Flickr-Faces-HQ dataset, and a Reface app-generated video dataset. Logistic Regression turns out to be the best classifier, with test accuracies of 99.67% and 97.96% on two different datasets. Future research can be conducted on other types of presentation attacks, such as video, 3-D rendered face masks, or advanced GAN-generated deepfakes.
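
    The frequency-domain check described above has a common minimal form: reduce each image to its azimuthally averaged log power spectrum, where GAN upsampling artifacts tend to concentrate, and fit a linear classifier on those 1-D profiles. The sketch below is a generic version of that recipe, not necessarily the project's exact pipeline; image size and the random stand-in data are illustrative.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def radial_spectrum(img):
        """Azimuthally averaged log power spectrum of a grayscale image."""
        f = np.fft.fftshift(np.fft.fft2(img))
        power = np.log(np.abs(f) ** 2 + 1e-8)
        h, w = img.shape
        y, x = np.indices((h, w))
        r = np.hypot(y - h // 2, x - w // 2).astype(int)
        # Mean power inside each integer-radius ring:
        sums = np.bincount(r.ravel(), weights=power.ravel())
        counts = np.bincount(r.ravel())
        return sums / np.maximum(counts, 1)

    # With real data: rows of X are radial_spectrum(face), y is {0 real, 1 fake}.
    rng = np.random.default_rng(0)
    X = np.stack([radial_spectrum(rng.random((64, 64))) for _ in range(20)])
    y = rng.integers(0, 2, size=20)
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    print(clf.predict(X[:3]))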

    Learning Domain Invariant Information to Enhance Presentation Attack Detection in Visible Face Recognition Systems

    Face signatures, including size, shape, texture, skin tone, eye color, appearance, and scars/marks, are widely used as discriminative, biometric information for access control. Despite recent advancements in facial recognition systems, presentation attacks on facial recognition systems have become increasingly sophisticated. The ability to detect presentation attacks or spoofing attempts is a pressing concern for the integrity, security, and trust of facial recognition systems. Multi-spectral imaging has been previously introduced as a way to improve presentation attack detection by utilizing sensors that are sensitive to different regions of the electromagnetic spectrum (e.g., visible, near infrared, long-wave infrared). Although multi-spectral presentation attack detection systems may be discriminative, the need for additional sensors and computational resources substantially increases complexity and costs. Instead, we propose a method that exploits information from infrared imagery during training to increase the discriminability of visible-based presentation attack detection systems. We introduce (1) a new cross-domain presentation attack detection framework that increases the separability of bonafide and presentation attacks using only visible spectrum imagery, (2) an inverse domain regularization technique for added training stability when optimizing our cross-domain presentation attack detection framework, and (3) a dense domain adaptation subnetwork to transform representations between visible and non-visible domains. Adviser: Benjamin Rigga
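
    A loose sketch of the training-time idea, under stated assumptions: infrared imagery is available only during training, a small adaptation subnetwork pulls visible-spectrum features toward their infrared counterparts, and only the visible branch is kept at test time. The encoders, losses, and the 0.1 weighting below are placeholders for illustration, not the paper's actual framework or its inverse domain regularization.

    import torch
    import torch.nn as nn

    embed_vis = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 128))  # visible encoder
    embed_ir  = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 128))  # infrared encoder
    adapt     = nn.Linear(128, 128)      # domain adaptation subnetwork (training only)
    pad_head  = nn.Linear(128, 2)        # bonafide vs. presentation attack

    vis = torch.randn(8, 1, 32, 32)      # paired visible / infrared crops
    ir  = torch.randn(8, 1, 32, 32)
    labels = torch.randint(0, 2, (8,))

    f_vis, f_ir = embed_vis(vis), embed_ir(ir)
    loss_pad   = nn.functional.cross_entropy(pad_head(f_vis), labels)
    loss_align = nn.functional.mse_loss(adapt(f_vis), f_ir.detach())  # cross-domain pull
    loss = loss_pad + 0.1 * loss_align   # weighting is an arbitrary illustrative choice
    loss.backward()
    print(float(loss))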

    An Efficient CNN-Based Deep Learning Model to Detect Malware Attacks (CNN-DMA) in 5G-IoT Healthcare Applications

    The role of 5G-IoT has become indispensable in smart applications, and it plays a crucial part in e-health applications. E-health applications require intelligent schemes and architectures to overcome the security threats against the sensitive data of patients. The information in e-healthcare applications is stored in the cloud, which is vulnerable to security attacks; with deep learning techniques these attacks can be detected, though this requires hybrid models. In this article, a new deep learning model (CNN-DMA) is proposed to detect malware attacks based on a Convolutional Neural Network (CNN) classifier. The model uses three layers, i.e., Dense, Dropout, and Flatten. A batch size of 64, 20 epochs, and 25 classes are used to train the network. An input image of 32 × 32 × 1 is used for the initial convolutional layer. Results are reported on the Malimg dataset, where 25 families of malware are fed as input, and the model detects the Alueron.gen!J malware. The proposed CNN-DMA model is 99% accurate and is validated against state-of-the-art techniques.
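
    A sketch assembling the layers the abstract names (a convolutional stem on a 32 × 32 × 1 input, then Flatten, Dense, and Dropout, ending in a 25-class softmax) with the stated batch size of 64. Filter counts and the depth of the convolutional stem are not given in the abstract and are assumed here.

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(32, 32, 1)),           # malware binaries rendered as grayscale images
        layers.Conv2D(32, 3, activation="relu"),  # assumed convolutional stem
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(25, activation="softmax"),   # 25 Malimg malware families
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    x = np.random.rand(256, 32, 32, 1).astype("float32")  # stand-in data
    y = np.random.randint(0, 25, size=256)                # stand-in family labels
    model.fit(x, y, batch_size=64, epochs=1, verbose=0)   # the paper trains for 20 epochs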

    Voice Spoofing Countermeasures: Taxonomy, State-of-the-Art, Experimental Analysis of Generalizability, Open Challenges, and the Way Forward

    Malicious actors may seek to use different voice-spoofing attacks to fool ASV systems and even use them for spreading misinformation. Various countermeasures have been proposed to detect these spoofing attacks. Given the extensive work on spoofing detection in automatic speaker verification (ASV) systems in the last 6-7 years, there is a need to classify the research and perform qualitative and quantitative comparisons of state-of-the-art countermeasures. Additionally, no existing survey has reviewed integrated solutions for spoofing detection and speaker verification, adversarial/anti-forensics attacks on spoofing countermeasures and on ASV itself, or unified solutions that detect multiple attacks with a single model. Further, no work has provided an apples-to-apples comparison of published countermeasures in order to assess their generalizability by evaluating them across corpora. In this work, we review the literature on spoofing detection using hand-crafted features, deep learning, end-to-end, and universal spoofing countermeasure solutions to detect speech synthesis (SS), voice conversion (VC), and replay attacks. Additionally, we review integrated solutions for spoofing detection and speaker verification, and adversarial and anti-forensics attacks on voice countermeasures and ASV. The limitations and challenges of the existing spoofing countermeasures are also presented. We report the performance of these countermeasures on several datasets and evaluate them across corpora. For the experiments, we employ the ASVspoof2019 and VSDC datasets along with GMM, SVM, CNN, and CNN-GRU classifiers. (For reproducibility of the results, the code of the test bed can be found in our GitHub repository.)
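
    As one concrete reference point, the classical GMM countermeasure used as a baseline in this kind of cross-corpus evaluation can be sketched as two generative models and a log-likelihood-ratio score. Feature extraction (e.g., frame-level cepstral features) is assumed to happen upstream, and the component counts below are illustrative.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    bona_frames  = rng.normal(0.0, 1.0, (2000, 20))   # stand-in bonafide feature frames
    spoof_frames = rng.normal(0.5, 1.2, (2000, 20))   # stand-in spoofed feature frames

    gmm_bona  = GaussianMixture(n_components=16, covariance_type="diag").fit(bona_frames)
    gmm_spoof = GaussianMixture(n_components=16, covariance_type="diag").fit(spoof_frames)

    def cm_score(utterance_frames):
        """Average per-frame log-likelihood ratio; > 0 leans bonafide."""
        return gmm_bona.score(utterance_frames) - gmm_spoof.score(utterance_frames)

    test = rng.normal(0.0, 1.0, (300, 20))            # frames of one test utterance
    print(cm_score(test))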