Securing Voice-driven Interfaces against Fake (Cloned) Audio Attacks
Voice cloning technologies have found applications in areas ranging from
personalized speech interfaces to advertising and robotics. Existing voice
cloning systems can learn speaker characteristics and use the trained models
to synthesize a person's voice from only a few audio samples. Current cloned
speech generation technologies can produce speech that is perceptually
indistinguishable from bona-fide speech. These advances pose new security and
privacy threats to voice-driven
interfaces and speech-based access control systems. The state-of-the-art speech
synthesis technologies use trained or tuned generative models for cloned speech
generation. Trained generative models rely on linear operations, learned
weights, and an excitation source for cloned speech synthesis. These systems leave
characteristic artifacts in the synthesized speech. Higher-order spectral
analysis is used to capture differentiating attributes between bona-fide and
cloned audios. Specifically, quadrature phase coupling (QPC) in the estimated
bicoherence, Gaussianity test statistics, and linearity test statistics are
used to capture generative model artifacts. Performance of the proposed method
is evaluated on cloned audios generated using speaker adaptation- and speaker
encoding-based approaches. Experimental results on a dataset of 126 cloned
and 8 bona-fide speech samples indicate that the proposed method separates
bona-fide from cloned audios with a near-perfect detection rate.
Comment: 6 pages, The 2nd IEEE International Workshop on "Fake MultiMedia"
(FakeMM'19), March 28-30, 2019, San Jose, CA, US
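The bicoherence features named in the abstract can be estimated directly from framed FFTs. The following numpy sketch is an illustration of the general technique, not the authors' implementation; the segment length and Hann window are arbitrary choices. Its magnitude matrix peaks where quadrature phase coupling (QPC) is present:

```python
import numpy as np

def bicoherence(x, nfft=128):
    """Segment-averaged bicoherence estimate of a 1-D signal.

    Peaks in the returned (nfft//2, nfft//2) magnitude matrix indicate
    quadratic (quadrature) phase coupling between frequency pairs.
    """
    nseg = len(x) // nfft
    segs = x[: nseg * nfft].reshape(nseg, nfft) * np.hanning(nfft)
    X = np.fft.fft(segs, axis=1)
    n = nfft // 2
    f1, f2 = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    num = np.zeros((n, n), dtype=complex)  # E[X(f1) X(f2) X*(f1+f2)]
    den1 = np.zeros((n, n))                # E[|X(f1) X(f2)|^2]
    den2 = np.zeros((n, n))                # E[|X(f1+f2)|^2]
    for Xi in X:
        prod = Xi[f1] * Xi[f2]
        s = Xi[f1 + f2]
        num += prod * np.conj(s)
        den1 += np.abs(prod) ** 2
        den2 += np.abs(s) ** 2
    return np.abs(num) / np.sqrt(den1 * den2 + 1e-12)

if __name__ == "__main__":
    t = np.arange(8192, dtype=float)
    # Two tones plus their sum frequency: a quadratically phase-coupled triple
    x = np.cos(0.6 * t) + np.cos(0.9 * t) + np.cos(1.5 * t)
    print(bicoherence(x).max())  # pronounced peak at the coupled bin pair
```

Statistics derived from such a bicoherence estimate (QPC peaks, Gaussianity and linearity test statistics) are the kinds of generative-model artifacts the paper exploits.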
Faked Speech Detection with Zero Knowledge
Audio is one of the most widely used forms of human communication, but it can
also be easily misused to trick people. With the AI revolution, the relevant
technologies are now accessible to almost everyone, making it simple for
criminals to commit crimes and forgeries. In this work, we introduce a neural
network method to develop a classifier that will blindly classify input audio
as real or mimicked; the word 'blindly' refers to the ability to detect
mimicked audio without references or real sources. The proposed model was
trained on a set of important features extracted from a large dataset of
audios, yielding a classifier that was then tested on the same set of
features extracted from different audios. The data was drawn from two raw
datasets composed specifically for this work: an all-English dataset and a
mixed (Arabic plus English) dataset. These datasets have been made available,
in raw form, through GitHub for the use of the research community at
https://github.com/SaSs7/Dataset. For comparison, the audios were also
classified through human inspection, with native speakers as subjects. The
results were interesting and exhibited formidable accuracy.
Comment: 14 pages, 4 figures (6 counting subfigures), 2 tables
Securing voice communications using audio steganography
Although authentication of users of digital voice-based systems has been addressed by much research and many commercially available products, very few perform well in terms of both usability and security in the audio domain. In addition, voice biometrics has been shown to have limitations and relatively poor performance compared to other authentication methods. We propose using audio steganography to place authentication key material into sound, such that an authentication factor can be carried within an audio channel to supplement other methods, thus providing a multi-factor authentication opportunity that retains the usability associated with voice channels. In this research we outline the challenges and threats to audio and voice-based systems in the form of an original threat model focusing on audio and voice-based systems; we present a novel architectural model that utilises audio steganography to mitigate these threats in various authentication scenarios; and finally, we conduct experimentation into hiding authentication material in an audible sound. The experimentation focused on creating and testing a new steganographic technique which is robust to noise, resilient to steganalysis, and has sufficient capacity to hold cryptographic material such as a 2048-bit RSA key in a short audio music clip of just a few seconds, achieving a signal-to-noise ratio of over 70 dB in some scenarios. The method developed proved very robust over digital transmission, which has applications beyond this research. With acoustic transmission, despite the progress demonstrated in this research, some challenges remain before the approach achieves its full potential in noisy real-world applications; the required future research directions are therefore outlined and discussed
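The abstract does not spell out the paper's noise-robust scheme, so as a purely illustrative baseline the sketch below embeds key bits in the least significant bit of 16-bit PCM samples. It shows why capacity is the easy part (a 2048-bit key needs only 2048 of the roughly 144,000 samples in three seconds of 48 kHz audio), but unlike the authors' technique, plain LSB embedding survives neither noise nor steganalysis; the function names are ours:

```python
import numpy as np

def embed_bits(cover: np.ndarray, bits: str) -> np.ndarray:
    """Hide a bit string in the LSBs of 16-bit PCM samples (illustrative
    only: plain LSB embedding is neither noise-robust nor stego-resistant)."""
    if len(bits) > len(cover):
        raise ValueError("payload larger than cover audio")
    stego = cover.copy()
    for i, b in enumerate(bits):
        stego[i] = (stego[i] & ~1) | int(b)
    return stego

def extract_bits(stego: np.ndarray, nbits: int) -> str:
    """Read the payload back out of the first nbits sample LSBs."""
    return "".join(str(int(s) & 1) for s in stego[:nbits])

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    cover = rng.integers(-2**15, 2**15, 3 * 48_000, dtype=np.int16)
    key_bits = format(0x1F2E3D4C, "032b")  # stand-in for RSA key material
    stego = embed_bits(cover, key_bits)
    assert extract_bits(stego, 32) == key_bits
```

Because each sample changes by at most one quantization step, the embedding distortion is far below audibility, consistent in spirit with the high signal-to-noise ratios the paper reports for its (different) technique.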
TIMIT-TTS: a Text-to-Speech Dataset for Multimodal Synthetic Media Detection
With the rapid development of deep learning techniques, the generation and
counterfeiting of multimedia material are becoming increasingly straightforward
to perform. At the same time, sharing fake content on the web has become so
simple that malicious users can create unpleasant situations with minimal
effort. Forged media are also getting more and more complex, with manipulated
videos overtaking still images. The multimedia forensic
community has addressed the possible threats that this situation could imply by
developing detectors that verify the authenticity of multimedia objects.
However, the vast majority of these tools only analyze one modality at a time.
This was not a problem as long as still images were considered the most widely
edited media, but now, since manipulated videos are becoming customary,
performing monomodal analyses could be reductive. However, the literature
lacks multimodal detectors, mainly due to the scarcity of datasets containing
forged multimodal data to train and test the designed
algorithms. In this paper we focus on the generation of an audio-visual
deepfake dataset. First, we present a general pipeline for synthesizing speech
deepfake content from a given real or fake video, facilitating the creation of
counterfeit multimodal material. The proposed method uses Text-to-Speech (TTS)
and Dynamic Time Warping techniques to achieve realistic speech tracks. Then,
we use the pipeline to generate and release TIMIT-TTS, a synthetic speech
dataset generated with the most cutting-edge methods in the TTS field. It can be
used as a standalone audio dataset, or combined with other state-of-the-art
sets to perform multimodal research. Finally, we present numerous experiments
to benchmark the proposed dataset in both monomodal and multimodal conditions,
showing the need for multimodal forensic detectors and more suitable data
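The alignment step mentioned in the abstract, warping a TTS track onto the timing of the reference audio, rests on dynamic time warping. Below is a minimal sketch of classic DTW over 1-D feature sequences; the absolute-difference cost and the toy inputs are our simplifications, not the TIMIT-TTS pipeline's actual features:

```python
import numpy as np

def dtw_path(a, b):
    """Return the optimal DTW alignment between two 1-D sequences as a
    list of (i, j) index pairs, using absolute difference as the local
    cost and the standard symmetric step pattern."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    # Backtrack from (n, m) to (1, 1) to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        step = int(np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
        path.append((i - 1, j - 1))
    return path[::-1]

if __name__ == "__main__":
    synth = [1.0, 2.0, 3.0]   # e.g. per-frame energies of the TTS track
    ref = [1.0, 2.0, 2.0, 3.0]  # reference track holds the second frame longer
    print(dtw_path(synth, ref))  # [(0, 0), (1, 1), (1, 2), (2, 3)]
```

The path tells the pipeline which synthetic frames to stretch or repeat so that the generated speech follows the reference track's timing.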
The audio auditor: user-level membership inference in Internet of Things voice services
With the rapid development of deep learning techniques, the popularity of voice services implemented on various Internet of Things (IoT) devices is ever increasing. In this paper, we examine user-level membership inference in the problem space of voice services by designing an audio auditor to verify whether a specific user had unwillingly contributed audio used to train an automatic speech recognition (ASR) model under strict black-box access. Using user representations of the input audio data and their corresponding transcribed text, our trained auditor is effective in user-level auditing. We also observe that an auditor trained on specific data generalizes well regardless of the ASR model architecture. We validate the auditor on ASR models trained with LSTM, RNN, and GRU architectures on two state-of-the-art pipelines, the hybrid ASR system and the end-to-end ASR system. Finally, we conduct a real-world trial of our auditor on iPhone Siri, achieving an overall accuracy exceeding 80%. We hope the methodology developed in this paper and our findings will help privacy advocates to overhaul IoT privacy
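At its core, such an auditor is a binary classifier over features derived from black-box ASR queries (the paper trains it on user-level representations of audio and its transcription). The numpy-only sketch below fits a logistic-regression auditor on synthetic stand-in features; the feature design, hyperparameters, and toy data premise are illustrative assumptions, not the paper's method:

```python
import numpy as np

def train_auditor(feats, labels, lr=0.5, steps=2000):
    """Fit a logistic-regression auditor by gradient descent.
    feats: (n, d) query-derived features (e.g., transcription-similarity
    scores); labels: 1 = the user's audio was in the ASR training set."""
    X = np.hstack([feats, np.ones((len(feats), 1))])  # append bias column
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))      # predicted membership prob.
        w -= lr * X.T @ (p - labels) / len(labels)
    return w

def audit(w, feats):
    """Predict member (1) / non-member (0) for each feature row."""
    X = np.hstack([feats, np.ones((len(feats), 1))])
    return (X @ w > 0.0).astype(int)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy premise: a model transcribes audio it was trained on more
    # faithfully, so members' similarity features score higher.
    members = rng.normal(0.85, 0.05, (200, 2))
    outsiders = rng.normal(0.55, 0.05, (200, 2))
    X = np.vstack([members, outsiders])
    y = np.concatenate([np.ones(200), np.zeros(200)])
    w = train_auditor(X, y)
    print((audit(w, X) == y).mean())  # high accuracy on separable toy data
```

The real auditor faces far less separable features and strict query budgets, which is what makes the paper's 80%+ accuracy against a deployed system like Siri notable.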
Availability, Integrity, and Confidentiality for Content Centric Network internet architectures
The Internet as we know it today, despite being "the result of a series of accidents of choices" in Prof. Jon Crowcroft's words, has undoubtedly been an amazing success story. However, it has been constantly challenged by the demands of the overwhelming evolution of data traffic types, non-functional needs of applications and users, and device diversity. The phrase "future internet architecture" can be interpreted as referring to a revised set of design principles. As Dr David Clark rightfully suggested, we need to "allow for the future in the face of the present". Content Centric Networking (CCN) is one of the candidates for a future internet architecture. Security is one of the most significant considerations while designing a future internet architecture. Availability, Integrity, and Confidentiality (AIC) are considered the three most crucial components of security: 1) availability is the assurance of continuous, reliable, and uninterrupted access to the information by authorized people, 2) integrity is the preservation of information and prevention of any change in it caused via accident or malicious intent, and 3) confidentiality is the ability to keep the information secret from unintended audiences, intruders, and adversaries. This thesis discusses AIC-related security threats and corresponding remedies for Named Data Networking (NDN), which is a promising example of CCN. It also presents a system dynamics modelling approach to bridge the gap between the technical solutions and business strategy by quantifying some of the qualitative variables salient to technology architects, policymakers, lawmakers, regulators, and internet service providers for the design of a future-proof internet architecture
Cybersecurity: Past, Present and Future
The digital transformation has created a new digital space known as
cyberspace. This new cyberspace has improved the workings of businesses,
organizations, governments, society as a whole, and the day-to-day life of an
individual. With these improvements come new challenges, and one of the main
challenges is security. The security of the new cyberspace is called
cybersecurity. Cyberspace has created new technologies and environments such as
cloud computing, smart devices, the IoT, and several others. To keep pace with
these advancements in cyber technologies there is a need to expand research and
develop new cybersecurity methods and tools to secure these domains and
environments. This book is an effort to introduce the reader to the field of
cybersecurity, highlight current issues and challenges, and provide future
directions to mitigate or resolve them. The main specializations of
cybersecurity covered in this book are software security, hardware security,
the evolution of malware, biometrics, cyber intelligence, and cyber forensics.
We must learn from the past, evolve our present and improve the future. Based
on this objective, the book covers the past, present, and future of these main
specializations of cybersecurity. The book also examines the upcoming areas of
research in cyber intelligence, such as hybrid augmented and explainable
artificial intelligence (AI). Human and AI collaboration can significantly
increase the performance of a cybersecurity system. Interpreting and explaining
machine learning models, i.e., explainable AI, is an emerging field of study
and has great potential to improve the role of AI in cybersecurity.
Comment: Author's copy of the book published under ISBN: 978-620-4-74421-
Project BeARCAT: Baselining, Automation and Response for CAV Testbed Cyber Security: Connected Vehicle & Infrastructure Security Assessment
Connected, software-based systems are a driver in advancing the technology of transportation systems. Advanced automated and autonomous vehicles, together with electrification, will help reduce congestion, accidents and emissions. Meanwhile, vehicle manufacturers see advanced technology as enhancing their products in a competitive market. However, as many decades of using home and enterprise computer systems have shown, connectivity allows a system to become a target for criminal intentions. Cyber-based threats to any system are a problem; in transportation, there is the added safety implication of dealing with moving vehicles and the passengers within
The Proceedings of 15th Australian Information Security Management Conference, 5-6 December, 2017, Edith Cowan University, Perth, Australia
Conference Foreword
The annual Security Congress, run by the Security Research Institute at Edith Cowan University, includes the Australian Information Security and Management Conference. Now in its fifteenth year, the conference remains popular for its diverse content and mixture of technical research and discussion papers. The area of information security and management continues to be varied, as is reflected by the wide variety of subject matter covered by the papers this year. The papers cover topics from vulnerabilities in "Internet of Things" protocols through to improvements in biometric identification algorithms and surveillance camera weaknesses. The conference has drawn interest and papers from within Australia and internationally. All submitted papers were subject to a double-blind peer review process. Twenty-two papers were submitted from Australia and overseas, of which eighteen were accepted for final presentation and publication. We wish to thank the reviewers for kindly volunteering their time and expertise in support of this event. We would also like to thank the conference committee who have organised yet another successful congress. Events such as this are impossible without the tireless efforts of such people in reviewing and editing the conference papers, and assisting with the planning, organisation and execution of the conference. To our sponsors, also a vote of thanks for both the financial and moral support provided to the conference. Finally, thank you to the administrative and technical staff, and students of the ECU Security Research Institute for their contributions to the running of the conference