Search CORE

273 research outputs found

Audio Deepfake Detection: A Survey

Author: Tao Jianhua
Wang Chenglong
Yi Jiangyan
Zhang Chu Yuan
Zhang Xiaohui
Zhao Yan
Publication venue
Publication date: 28/08/2023
Field of study

Audio deepfake detection is an emerging active topic. A growing number of literatures have aimed to study deepfake detection algorithms and achieved effective performance, the problem of which is far from being solved. Although there are some review literatures, there has been no comprehensive survey that provides researchers with a systematic overview of these developments with a unified evaluation. Accordingly, in this survey paper, we first highlight the key differences across various types of deepfake audio, then outline and analyse competitions, datasets, features, classifications, and evaluation of state-of-the-art approaches. For each aspect, the basic techniques, advanced developments and major challenges are discussed. In addition, we perform a unified comparison of representative features and classifiers on ASVspoof 2021, ADD 2023 and In-the-Wild datasets for audio deepfake detection, respectively. The survey shows that future research should address the lack of large scale datasets in the wild, poor generalization of existing detection methods to unknown fake attacks, as well as interpretability of detection results

arXiv.org e-Print Archive

Effects of Waveform PMF on Anti-Spoofing Detection

Author: Bonastre Jean-François
Lapidot Itshak
Publication venue: 'International Speech Communication Association'
Publication date: 15/09/2019
Field of study

International audienceIn the context of detection of speaker recognition identity impersonation , we observed that the waveform probability mass function (PMF) of genuine speech differs from significantly of of PMF from identity theft extracts. This is true for synthesized or converted speech as well as for replayed speech. In this work, we mainly ask whether this observation has a significant impact on spoofing detection performance. In a second step, we want to reduce the distribution gap of waveforms between authentic speech and spoofing speech. We propose a genuiniza-tion of the spoofing speech (by analogy with Gaussianisation), i.e. to obtain spoofing speech with a PMF close to the PMF of genuine speech. Our genuinization is evaluated on ASVspoof 2019 challenge datasets, using the baseline system provided by the challenge organization. In the case of constant Q cep-stral coefficients (CQCC) features, the genuinization leads to a degradation of the baseline system performance by a factor of 10, which shows a potentially large impact of the distribution os waveforms on spoofing detection performance. However, by ''playing" with all configurations, we also observed different behaviors, including performance improvements in specific cases. This leads us to conclude that waveform distribution plays an important role and must be taken into account by anti-spoofing systems

Crossref

Replay detection in voice biometrics: an investigation of adaptive and non-adaptive front-ends

Author: Wickramasinghe Buddhi
Publication venue: UNSW, Sydney
Publication date: 01/01/2021
Field of study

Among various physiological and behavioural traits, speech has gained popularity as an effective mode of biometric authentication. Even though they are gaining popularity, automatic speaker verification systems are vulnerable to malicious attacks, known as spoofing attacks. Among various types of spoofing attacks, replay attack poses the biggest threat due to its simplicity and effectiveness. This thesis investigates the importance of 1) improving front-end feature extraction via novel feature extraction techniques and 2) enhancing spectral components via adaptive front-end frameworks to improve replay attack detection. This thesis initially focuses on AM-FM modelling techniques and their use in replay attack detection. A novel method to extract the sub-band frequency modulation (FM) component using the spectral centroid of a signal is proposed, and its use as a potential acoustic feature is also discussed. Frequency Domain Linear Prediction (FDLP) is explored as a method to obtain the temporal envelope of a speech signal. The temporal envelope carries amplitude modulation (AM) information of speech resonances. Several features are extracted from the temporal envelope and the FDLP residual signal. These features are then evaluated for replay attack detection and shown to have significant capability in discriminating genuine and spoofed signals. Fusion of AM and FM-based features has shown that AM and FM carry complementary information that helps distinguish replayed signals from genuine ones. The importance of frequency band allocation when creating filter banks is studied as well to further advance the understanding of front-ends for replay attack detection. Mechanisms inspired by the human auditory system that makes the human ear an excellent spectrum analyser have been investigated and integrated into front-ends. Spatial differentiation, a mechanism that provides additional sharpening to auditory filters is one of them that is used in this work to improve the selectivity of the sub-band decomposition filters. Two features are extracted using the improved filter bank front-end: spectral envelope centroid magnitude (SECM) and spectral envelope centroid frequency (SECF). These are used to establish the positive effect of spatial differentiation on discriminating spoofed signals. Level-dependent filter tuning, which allows the ear to handle a large dynamic range, is integrated into the filter bank to further improve the front-end. This mechanism converts the filter bank into an adaptive one where the selectivity of the filters is varied based on the input signal energy. Experimental results show that this leads to improved spoofing detection performance. Finally, deep neural network (DNN) mechanisms are integrated into sub-band feature extraction to develop an adaptive front-end that adjusts its characteristics based on the sub-band signals. A DNN-based controller that takes sub-band FM components as input, is developed to adaptively control the selectivity and sensitivity of a parallel filter bank to enhance the artifacts that differentiate a replayed signal from a genuine signal. This work illustrates gradient-based optimization of a DNN-based controller using the feedback from a spoofing detection back-end classifier, thus training it to reduce spoofing detection error. The proposed framework has displayed a superior ability in identifying high-quality replayed signals compared to conventional non-adaptive frameworks. All techniques proposed in this thesis have been evaluated on well-established databases on replay attack detection and compared with state-of-the-art baseline systems

UNSWorks

Multibiometric security in wireless communication systems

Author: Sepasian Mojtaba
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2010
Field of study

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University, 05/08/2010.This thesis has aimed to explore an application of Multibiometrics to secured wireless communications. The medium of study for this purpose included Wi-Fi, 3G, and WiMAX, over which simulations and experimental studies were carried out to assess the performance. In specific, restriction of access to authorized users only is provided by a technique referred to hereafter as multibiometric cryptosystem. In brief, the system is built upon a complete challenge/response methodology in order to obtain a high level of security on the basis of user identification by fingerprint and further confirmation by verification of the user through text-dependent speaker recognition. First is the enrolment phase by which the database of watermarked fingerprints with memorable texts along with the voice features, based on the same texts, is created by sending them to the server through wireless channel. Later is the verification stage at which claimed users, ones who claim are genuine, are verified against the database, and it consists of five steps. Initially faced by the identification level, one is asked to first present one’s fingerprint and a memorable word, former is watermarked into latter, in order for system to authenticate the fingerprint and verify the validity of it by retrieving the challenge for accepted user. The following three steps then involve speaker recognition including the user responding to the challenge by text-dependent voice, server authenticating the response, and finally server accepting/rejecting the user. In order to implement fingerprint watermarking, i.e. incorporating the memorable word as a watermark message into the fingerprint image, an algorithm of five steps has been developed. The first three novel steps having to do with the fingerprint image enhancement (CLAHE with 'Clip Limit', standard deviation analysis and sliding neighborhood) have been followed with further two steps for embedding, and extracting the watermark into the enhanced fingerprint image utilising Discrete Wavelet Transform (DWT). In the speaker recognition stage, the limitations of this technique in wireless communication have been addressed by sending voice feature (cepstral coefficients) instead of raw sample. This scheme is to reap the advantages of reducing the transmission time and dependency of the data on communication channel, together with no loss of packet. Finally, the obtained results have verified the claims

CiteSeerX

Brunel University Research Archive

Unconditionally Secure Authentication and Integrity Protection for the Galileo Open Service Signal

Author
Publication venue
Publication date
Field of study

The operational GNSSs do not offer authentication and integrity protection for the Open Service (OS) signal/message. But it is urgently needed, since several attacks can threat the OS user. By this reason the Galileo GNSS is working on this issue. This thesis contributes at the problem by adopting an approach as generic as possible, which outlines a theoretical bound on the key size. Therefore, the focus is providing data and signal unconditionally secure authentication and integrity pro

Padua Thesis and Dissertation Archive

Advanced Biometrics with Deep Learning

Author
Publication venue: 'MDPI AG'
Publication date: 01/05/2021
Field of study

Biometrics, such as fingerprint, iris, face, hand print, hand vein, speech and gait recognition, etc., as a means of identity management have become commonplace nowadays for various applications. Biometric systems follow a typical pipeline, that is composed of separate preprocessing, feature extraction and classification. Deep learning as a data-driven representation learning approach has been shown to be a promising alternative to conventional data-agnostic and handcrafted pre-processing and feature extraction for biometric systems. Furthermore, deep learning offers an end-to-end learning paradigm to unify preprocessing, feature extraction, and recognition, based solely on biometric data. This Special Issue has collected 12 high-quality, state-of-the-art research papers that deal with challenging issues in advanced biometric systems based on deep learning. The 12 papers can be divided into 4 categories according to biometric modality; namely, face biometrics, medical electronic signals (EEG and ECG), voice print, and others

Directory of Open Access Books (DOAB)

Aeronautical engineering: A continuing bibliography with indexes, supplement 100

Author
Publication venue
Publication date
Field of study

This bibliography lists 295 reports, articles, and other documents introduced into the NASA Scientific and Technical Information System in August 1978

NASA Technical Reports Server

A Survey on Wireless Security: Technical Challenges, Recent Advances and Future Trends

Author: Hanzo Lajos
Wang Xianbin
Zhu Jia
Zou Yulong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 23/04/2016
Field of study

This paper examines the security vulnerabilities and threats imposed by the inherent open nature of wireless communications and to devise efficient defense mechanisms for improving the wireless network security. We first summarize the security requirements of wireless networks, including their authenticity, confidentiality, integrity and availability issues. Next, a comprehensive overview of security attacks encountered in wireless networks is presented in view of the network protocol architecture, where the potential security threats are discussed at each protocol layer. We also provide a survey of the existing security protocols and algorithms that are adopted in the existing wireless network standards, such as the Bluetooth, Wi-Fi, WiMAX, and the long-term evolution (LTE) systems. Then, we discuss the state-of-the-art in physical-layer security, which is an emerging technique of securing the open communications environment against eavesdropping attacks at the physical layer. We also introduce the family of various jamming attacks and their counter-measures, including the constant jammer, intermittent jammer, reactive jammer, adaptive jammer and intelligent jammer. Additionally, we discuss the integration of physical-layer security into existing authentication and cryptography mechanisms for further securing wireless networks. Finally, some technical challenges which remain unresolved at the time of writing are summarized and the future trends in wireless security are discussed.Comment: 36 pages. Accepted to Appear in Proceedings of the IEEE, 201

arXiv.org e-Print Archive

Southampton (e-Prints Soton)

Crossref

Investigation of Diode Pumped Alkali Laser Atmospheric Transmission Using Tunable Diode Laser Absorption Spectroscopy

Author: Rice Christopher A.
Publication venue: AFIT Scholar
Publication date: 15/09/2012
Field of study

A field deployable, tunable diode laser absorption spectroscopy (TDLAS) device fiber coupled to a pair of 12.5 Ritchey-Chretien telescopes was used to study atmospheric propagation for open path lengths of 100 m to 1,000 m to estimate atmospheric transmission at key High Energy Laser (HEL) wavelengths. The potassium (K) version of the Diode Pumped Alkali Laser (DPAL) operates in between two of the sharp oxygen rotational features in the PP and the PQ branches. Initial experiments were performed in the vicinity of molecular oxygen X3Σ-g to b1Σ+g electronic transition lines near the potassium emission line at 770 nm. More than 50 rotational lines in the molecular oxygen A-band X3Σ-g to b1Σ+g transition near 760 nm were observed. Temperatures were determined from the Boltzmann rotational distribution to within 1.3% (less than 2 K). Oxygen concentration was obtained from the integrated spectral area of the absorption features to within 1.6% (less than 0.04 x 1018 molecules / cm3). Pressure was determined independently from the pressure-broadened Voigt lineshapes to within 10%

AFTI Scholar (Air Force Institute of Technology)