
    Securing Voice-driven Interfaces against Fake (Cloned) Audio Attacks

    Voice cloning technologies have found applications in a variety of areas, ranging from personalized speech interfaces to advertisement and robotics. Existing voice cloning systems can learn speaker characteristics from only a few audio samples and use the trained models to synthesize a person's voice. Advances in cloned speech generation now produce speech that is perceptually indistinguishable from bona-fide speech. These advances pose new security and privacy threats to voice-driven interfaces and speech-based access control systems. State-of-the-art speech synthesis technologies use trained or tuned generative models for cloned speech generation. Trained generative models rely on linear operations, learned weights, and an excitation source for cloned speech synthesis, and these systems leave characteristic artifacts in the synthesized speech. Higher-order spectral analysis is used to capture differentiating attributes between bona-fide and cloned audio. Specifically, quadrature phase coupling (QPC) in the estimated bicoherence, Gaussianity test statistics, and linearity test statistics are used to capture generative model artifacts. Performance of the proposed method is evaluated on cloned audio generated using speaker adaptation- and speaker encoding-based approaches. Experimental results for a dataset consisting of 126 cloned and 8 bona-fide speech samples indicate that the proposed method detects bona-fide and cloned audio with close to a perfect detection rate. Comment: 6 pages, The 2nd IEEE International Workshop on "Fake MultiMedia" (FakeMM'19), March 28-30, 2019, San Jose, CA, US
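
    A minimal illustrative sketch (not the authors' code) of the higher-order spectral analysis step: a direct bicoherence estimate over short frames of an audio signal, from which quadratic phase coupling peaks and Gaussianity/linearity statistics could be derived. Frame length, hop, and windowing are assumptions.

        import numpy as np

        def bicoherence(x, nfft=256, hop=128):
            """Estimate the bicoherence magnitude b(f1, f2) of a 1-D signal x."""
            win = np.hanning(nfft)
            half = nfft // 2
            num = np.zeros((half, half), dtype=complex)   # E[X(f1) X(f2) conj(X(f1+f2))]
            den1 = np.zeros((half, half))                 # E[|X(f1) X(f2)|^2]
            den2 = np.zeros((half, half))                 # E[|X(f1+f2)|^2]
            f = np.arange(half)
            for start in range(0, len(x) - nfft + 1, hop):
                frame = x[start:start + nfft]
                X = np.fft.fft(win * (frame - frame.mean()))
                X1 = X[f][:, None]                         # X(f1) as a column
                X2 = X[f][None, :]                         # X(f2) as a row
                X12 = X[(f[:, None] + f[None, :]) % nfft]  # X(f1 + f2), wrapped
                num += X1 * X2 * np.conj(X12)
                den1 += np.abs(X1 * X2) ** 2
                den2 += np.abs(X12) ** 2
            return np.abs(num) / np.sqrt(den1 * den2 + 1e-12)

    Strong, localized peaks in this bicoherence plane indicate quadratic phase coupling; summary statistics computed over the plane can then feed a bona-fide vs. cloned decision rule.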

    Faked Speech Detection with Zero Knowledge

    Audio is one of the most common modes of human communication, but it can also be easily misused to deceive people. With the AI revolution, the related technologies are now accessible to almost everyone, making it simple for criminals to commit crimes and forgeries. In this work, we introduce a neural network method to develop a classifier that blindly classifies an input audio as real or mimicked; the word 'blindly' refers to the ability to detect mimicked audio without references or real sources. The proposed model was trained on a set of important features extracted from a large dataset of audio clips, and the resulting classifier was tested on the same set of features from different clips. The data was drawn from two raw datasets composed especially for this work: an all-English dataset and a mixed (Arabic plus English) dataset. These datasets have been made available in raw form through GitHub for the use of the research community at https://github.com/SaSs7/Dataset. For comparison, the audio clips were also classified through human inspection, with the subjects being native speakers. The results were interesting and showed formidable accuracy. Comment: 14 pages, 4 figures (6 counting subfigures), 2 tables
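
    The abstract does not disclose the exact feature set or network, so the sketch below is a hedged stand-in rather than the paper's method: mean/std MFCC summaries per clip (via librosa) feeding a small scikit-learn MLP; the file names and labels are placeholders for the released GitHub data.

        import numpy as np
        import librosa
        from sklearn.neural_network import MLPClassifier

        def clip_features(path, sr=16000, n_mfcc=20):
            """Summarize a clip as the mean and standard deviation of its MFCCs."""
            y, _ = librosa.load(path, sr=sr)
            mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
            return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

        # Placeholder file names and labels (1 = real, 0 = mimicked).
        train_paths, train_labels = ["real_001.wav", "clone_001.wav"], [1, 0]
        test_paths = ["unknown_001.wav"]

        X = np.stack([clip_features(p) for p in train_paths])
        clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500).fit(X, np.array(train_labels))
        print(clf.predict(np.stack([clip_features(p) for p in test_paths])))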

    Securing voice communications using audio steganography

    Although authentication of users of digital voice-based systems has been addressed by much research and many commercially available products, very few perform well in terms of both usability and security in the audio domain. In addition, voice biometrics has been shown to have limitations and relatively poor performance compared with other authentication methods. We propose using audio steganography to place authentication key material into sound, so that an authentication factor can be delivered within an audio channel to supplement other methods, providing a multi-factor authentication opportunity that retains the usability associated with voice channels. In this research we outline the challenges and threats to audio and voice-based systems in the form of an original threat model focused on such systems; we present a novel architectural model that uses audio steganography to mitigate these threats in various authentication scenarios; and finally, we conduct experiments on hiding authentication material in audible sound. The experimentation focused on creating and testing a new steganographic technique that is robust to noise, resilient to steganalysis, and has sufficient capacity to hold cryptographic material such as a 2048-bit RSA key in a short audio music clip of just a few seconds, achieving a signal-to-noise ratio of over 70 dB in some scenarios. The method developed proved very robust over digital transmission, which has applications beyond this research. With acoustic transmission, despite the progress demonstrated in this research, some challenges remain before the approach achieves its full potential in noisy real-world applications; the required future research directions are therefore outlined and discussed.
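
    The robust steganographic technique developed in this research is not described in the abstract; as a hedged baseline only, the sketch below shows plain least-significant-bit embedding in 16-bit PCM samples, which already illustrates the capacity point: a 2048-bit key occupies just 2048 samples, far less than a second of audio.

        import numpy as np

        def embed_bits(samples, bits):
            """Overwrite the LSB of the first len(bits) samples with the payload bits."""
            out = samples.copy()
            out[:len(bits)] = (out[:len(bits)] & ~1) | np.asarray(bits, dtype=out.dtype)
            return out

        def extract_bits(samples, n_bits):
            return (samples[:n_bits] & 1).astype(np.uint8)

        rng = np.random.default_rng(0)
        audio = rng.integers(-32768, 32767, size=44100, dtype=np.int16)  # stand-in for a music clip
        key_bits = rng.integers(0, 2, size=2048).astype(np.int16)        # stand-in for a 2048-bit RSA key
        stego = embed_bits(audio, key_bits)
        assert np.array_equal(extract_bits(stego, 2048), key_bits.astype(np.uint8))

    Plain LSB embedding survives bit-exact digital transmission but not noisy acoustic playback, hence the need for the more robust technique developed in this research.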

    TIMIT-TTS: a Text-to-Speech Dataset for Multimodal Synthetic Media Detection

    With the rapid development of deep learning techniques, the generation and counterfeiting of multimedia material are becoming increasingly straightforward to perform. At the same time, sharing fake content on the web has become so simple that malicious users can create unpleasant situations with minimal effort. Forged media are also getting more complex, with manipulated videos taking over the scene from still images. The multimedia forensic community has addressed the possible threats this situation implies by developing detectors that verify the authenticity of multimedia objects. However, the vast majority of these tools analyze only one modality at a time. This was not a problem as long as still images were the most widely edited media, but now that manipulated videos are becoming customary, performing monomodal analyses can be reductive. Nonetheless, the literature lacks multimodal detectors, mainly due to the scarcity of datasets containing forged multimodal data on which to train and test the designed algorithms. In this paper we focus on the generation of an audio-visual deepfake dataset. First, we present a general pipeline for synthesizing speech deepfake content from a given real or fake video, facilitating the creation of counterfeit multimodal material. The proposed method uses Text-to-Speech (TTS) and Dynamic Time Warping techniques to achieve realistic speech tracks. Then, we use the pipeline to generate and release TIMIT-TTS, a synthetic speech dataset containing the most cutting-edge methods in the TTS field. This can be used as a standalone audio dataset, or combined with other state-of-the-art sets for multimodal research. Finally, we present numerous experiments that benchmark the proposed dataset in both mono- and multimodal conditions, showing the need for multimodal forensic detectors and more suitable data.
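
    A hedged sketch of the alignment idea (file names and parameters are placeholders, and the released pipeline may differ): compute MFCCs of the original speech track and of a TTS rendition of the same transcript, run Dynamic Time Warping with librosa, and warp the synthetic audio onto the original timing along the returned path.

        import numpy as np
        import librosa

        sr, hop = 16000, 256
        orig, _ = librosa.load("original_speech.wav", sr=sr)   # real track extracted from the video
        synth, _ = librosa.load("tts_output.wav", sr=sr)       # TTS rendition of the transcript

        mfcc_orig = librosa.feature.mfcc(y=orig, sr=sr, hop_length=hop)
        mfcc_synth = librosa.feature.mfcc(y=synth, sr=sr, hop_length=hop)

        # wp is a sequence of (original_frame, synthetic_frame) index pairs, returned end-to-start.
        _, wp = librosa.sequence.dtw(X=mfcc_orig, Y=mfcc_synth)
        wp = wp[::-1]

        # Crude frame-level warp: replay synthetic frames in the order dictated by the path.
        aligned = np.concatenate([synth[j * hop:(j + 1) * hop] for _, j in wp])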

    The audio auditor: user-level membership inference in Internet of Things voice services

    With the rapid development of deep learning techniques, the popularity of voice services implemented on various Internet of Things (IoT) devices is ever increasing. In this paper, we examine user-level membership inference in the problem space of voice services by designing an audio auditor that verifies whether a specific user unwillingly contributed audio used to train an automatic speech recognition (ASR) model, under strict black-box access. Using a user representation of the input audio data and its corresponding transcribed text, our trained auditor is effective for user-level auditing. We also observe that an auditor trained on specific data generalizes well regardless of the ASR model architecture. We validate the auditor on ASR models trained with LSTM, RNN, and GRU algorithms on two state-of-the-art pipelines: a hybrid ASR system and an end-to-end ASR system. Finally, we conduct a real-world trial of our auditor on iPhone Siri, achieving an overall accuracy exceeding 80%. We hope the methodology developed in this paper and its findings can help privacy advocates overhaul IoT privacy.
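
    As a hedged illustration of a user-level auditor (the paper's exact user representation is not reproduced here), the sketch below summarizes, per user, how closely the black-box ASR's transcriptions match the reference text of a few probe clips, then trains a binary membership classifier on those summaries. asr_transcribe and users_train are hypothetical stand-ins for the audited ASR API and the labeled shadow data.

        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        def word_edit_distance(a, b):
            """Word-level Levenshtein distance between two transcripts."""
            a, b = a.split(), b.split()
            d = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
            d[:, 0], d[0, :] = np.arange(len(a) + 1), np.arange(len(b) + 1)
            for i in range(1, len(a) + 1):
                for j in range(1, len(b) + 1):
                    d[i, j] = min(d[i - 1, j] + 1, d[i, j - 1] + 1,
                                  d[i - 1, j - 1] + (a[i - 1] != b[j - 1]))
            return d[-1, -1]

        def user_features(clips):
            """clips: list of (audio, reference_text) pairs belonging to one user."""
            errs = [word_edit_distance(asr_transcribe(audio), ref) / max(len(ref.split()), 1)
                    for audio, ref in clips]
            return [np.mean(errs), np.std(errs), np.min(errs)]

        # users_train: [(clips, 1 if the user's audio was in the ASR training set, else 0), ...]
        X = np.array([user_features(c) for c, _ in users_train])
        y = np.array([label for _, label in users_train])
        auditor = RandomForestClassifier(n_estimators=200).fit(X, y)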

    Cybersecurity: Past, Present and Future

    The digital transformation has created a new digital space known as cyberspace. This new cyberspace has improved the workings of businesses, organizations, governments, and society as a whole, as well as the day-to-day life of individuals. With these improvements come new challenges, and one of the main challenges is security. The security of this new cyberspace is called cybersecurity. Cyberspace has created new technologies and environments such as cloud computing, smart devices, and the IoT, among others. To keep pace with these advancements in cyber technologies, there is a need to expand research and develop new cybersecurity methods and tools to secure these domains and environments. This book is an effort to introduce the reader to the field of cybersecurity, highlight current issues and challenges, and provide future directions to mitigate or resolve them. The main specializations of cybersecurity covered in this book are software security, hardware security, the evolution of malware, biometrics, cyber intelligence, and cyber forensics. We must learn from the past, evolve our present, and improve the future. Based on this objective, the book covers the past, present, and future of these main specializations of cybersecurity. The book also examines upcoming areas of research in cyber intelligence, such as hybrid augmented and explainable artificial intelligence (AI). Human-AI collaboration can significantly increase the performance of a cybersecurity system. Interpreting and explaining machine learning models, i.e., explainable AI, is an emerging field of study with great potential to improve the role of AI in cybersecurity. Comment: Author's copy of the book published under ISBN: 978-620-4-74421-

    Project BeARCAT : Baselining, Automation and Response for CAV Testbed Cyber Security : Connected Vehicle & Infrastructure Security Assessment

    Connected, software-based systems are a driving force in advancing transportation technology. Advanced automated and autonomous vehicles, together with electrification, will help reduce congestion, accidents, and emissions. Meanwhile, vehicle manufacturers see advanced technology as enhancing their products in a competitive market. However, as many decades of using home and enterprise computer systems have shown, connectivity allows a system to become a target for criminal intentions. Cyber-based threats to any system are a problem; in transportation, there is the added safety implication of dealing with moving vehicles and the passengers within them.

    The Proceedings of 15th Australian Information Security Management Conference, 5-6 December, 2017, Edith Cowan University, Perth, Australia

    Conference Foreword: The annual Security Congress, run by the Security Research Institute at Edith Cowan University, includes the Australian Information Security and Management Conference. Now in its fifteenth year, the conference remains popular for its diverse content and mixture of technical research and discussion papers. The area of information security and management continues to be varied, as reflected by the wide variety of subject matter covered by this year's papers. The papers cover topics from vulnerabilities in "Internet of Things" protocols through to improvements in biometric identification algorithms and surveillance camera weaknesses. The conference has drawn interest and papers from within Australia and internationally. All submitted papers were subject to a double-blind peer review process. Twenty-two papers were submitted from Australia and overseas, of which eighteen were accepted for final presentation and publication. We wish to thank the reviewers for kindly volunteering their time and expertise in support of this event. We would also like to thank the conference committee, who have organised yet another successful congress. Events such as this are impossible without the tireless efforts of such people in reviewing and editing the conference papers, and in assisting with the planning, organisation, and execution of the conference. To our sponsors, also a vote of thanks for both the financial and moral support provided to the conference. Finally, thank you to the administrative and technical staff, and the students of the ECU Security Research Institute, for their contributions to the running of the conference.