Securing Voice-driven Interfaces against Fake (Cloned) Audio Attacks
Voice cloning technologies have found applications in areas ranging from
personalized speech interfaces to advertising and robotics. Existing voice
cloning systems can learn speaker characteristics and use the trained models
to synthesize a person's voice from only a few audio samples. Current cloned
speech generation technologies can produce speech that is perceptually
indistinguishable from bona-fide speech. These advances pose new security and
privacy threats to voice-driven
interfaces and speech-based access control systems. The state-of-the-art speech
synthesis technologies use trained or tuned generative models for cloned speech
generation. Trained generative models rely on linear operations, learned
weights, and an excitation source for cloned speech synthesis. These systems leave
characteristic artifacts in the synthesized speech. Higher-order spectral
analysis is used to capture differentiating attributes between bona-fide and
cloned audios. Specifically, quadrature phase coupling (QPC) in the estimated
bicoherence, Gaussianity test statistics, and linearity test statistics are
used to capture generative model artifacts. Performance of the proposed method
is evaluated on cloned audios generated using speaker adaptation- and speaker
encoding-based approaches. Experimental results on a dataset of 126 cloned
and 8 bona-fide speech samples indicate that the proposed method separates
bona-fide from cloned audios with a near-perfect detection rate.
Comment: 6 pages, The 2nd IEEE International Workshop on "Fake MultiMedia"
(FakeMM'19), March 28-30, 2019, San Jose, CA, US
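The bicoherence features named in the abstract can be estimated directly from framed FFTs. The following numpy sketch is an illustration of the general technique, not the authors' implementation; the segment length and Hann window are arbitrary choices. Its magnitude matrix peaks where quadrature phase coupling (QPC) is present:

```python
import numpy as np

def bicoherence(x, nfft=128):
    """Segment-averaged bicoherence estimate of a 1-D signal.

    Peaks in the returned (nfft//2, nfft//2) magnitude matrix indicate
    quadratic (quadrature) phase coupling between frequency pairs.
    """
    nseg = len(x) // nfft
    segs = x[: nseg * nfft].reshape(nseg, nfft) * np.hanning(nfft)
    X = np.fft.fft(segs, axis=1)
    n = nfft // 2
    f1, f2 = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    num = np.zeros((n, n), dtype=complex)  # E[X(f1) X(f2) X*(f1+f2)]
    den1 = np.zeros((n, n))                # E[|X(f1) X(f2)|^2]
    den2 = np.zeros((n, n))                # E[|X(f1+f2)|^2]
    for Xi in X:
        prod = Xi[f1] * Xi[f2]
        s = Xi[f1 + f2]
        num += prod * np.conj(s)
        den1 += np.abs(prod) ** 2
        den2 += np.abs(s) ** 2
    return np.abs(num) / np.sqrt(den1 * den2 + 1e-12)

if __name__ == "__main__":
    t = np.arange(8192, dtype=float)
    # Two tones plus their sum frequency: a quadratically phase-coupled triple
    x = np.cos(0.6 * t) + np.cos(0.9 * t) + np.cos(1.5 * t)
    print(bicoherence(x).max())  # pronounced peak at the coupled bin pair
```

Statistics derived from such a bicoherence estimate (QPC peaks, Gaussianity and linearity test statistics) are the kinds of generative-model artifacts the paper exploits.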
Faked Speech Detection with Zero Knowledge
Audio is one of the most widely used forms of human communication, but it can
also be easily misused to trick people. With the AI revolution, the relevant
technologies are now accessible to almost everyone, making it simple for
criminals to commit crimes and forgeries. In this work, we introduce a neural
network method to develop a classifier that will blindly classify input audio
as real or mimicked; the word 'blindly' refers to the ability to detect
mimicked audio without references or real sources. The proposed model was
trained on a set of important features extracted from a large dataset of
audios, yielding a classifier that was then tested on the same set of
features extracted from different audios. The data was drawn from two raw
datasets composed specifically for this work: an all-English dataset and a
mixed (Arabic plus English) dataset. These datasets have been made available,
in raw form, through GitHub for the use of the research community at
https://github.com/SaSs7/Dataset. For comparison, the audios were also
classified through human inspection, with native speakers as subjects. The
results were interesting and exhibited formidable accuracy.
Comment: 14 pages, 4 figures (6 counting subfigures), 2 tables
Securing voice communications using audio steganography
Although authentication of users of digital voice-based systems has been addressed by much research and many commercially available products, very few perform well in terms of both usability and security in the audio domain. In addition, voice biometrics has been shown to have limitations and relatively poor performance compared to other authentication methods. We propose using audio steganography to place authentication key material into sound, such that an authentication factor can be carried within an audio channel to supplement other methods, thus providing a multi-factor authentication opportunity that retains the usability associated with voice channels. In this research we outline the challenges and threats to audio and voice-based systems in the form of an original threat model focusing on audio and voice-based systems; we present a novel architectural model that utilises audio steganography to mitigate these threats in various authentication scenarios; and finally, we conduct experimentation into hiding authentication material in an audible sound. The experimentation focused on creating and testing a new steganographic technique which is robust to noise, resilient to steganalysis, and has sufficient capacity to hold cryptographic material such as a 2048-bit RSA key in a short audio music clip of just a few seconds, achieving a signal-to-noise ratio of over 70 dB in some scenarios. The method developed proved very robust over digital transmission, which has applications beyond this research. With acoustic transmission, despite the progress demonstrated in this research, some challenges remain before the approach achieves its full potential in noisy real-world applications; the required future research directions are therefore outlined and discussed
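The abstract does not spell out the paper's noise-robust scheme, so as a purely illustrative baseline the sketch below embeds key bits in the least significant bit of 16-bit PCM samples. It shows why capacity is the easy part (a 2048-bit key needs only 2048 of the roughly 144,000 samples in three seconds of 48 kHz audio), but unlike the authors' technique, plain LSB embedding survives neither noise nor steganalysis; the function names are ours:

```python
import numpy as np

def embed_bits(cover: np.ndarray, bits: str) -> np.ndarray:
    """Hide a bit string in the LSBs of 16-bit PCM samples (illustrative
    only: plain LSB embedding is neither noise-robust nor stego-resistant)."""
    if len(bits) > len(cover):
        raise ValueError("payload larger than cover audio")
    stego = cover.copy()
    for i, b in enumerate(bits):
        stego[i] = (stego[i] & ~1) | int(b)
    return stego

def extract_bits(stego: np.ndarray, nbits: int) -> str:
    """Read the payload back out of the first nbits sample LSBs."""
    return "".join(str(int(s) & 1) for s in stego[:nbits])

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    cover = rng.integers(-2**15, 2**15, 3 * 48_000, dtype=np.int16)
    key_bits = format(0x1F2E3D4C, "032b")  # stand-in for RSA key material
    stego = embed_bits(cover, key_bits)
    assert extract_bits(stego, 32) == key_bits
```

Because each sample changes by at most one quantization step, the embedding distortion is far below audibility, consistent in spirit with the high signal-to-noise ratios the paper reports for its (different) technique.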
TIMIT-TTS: a Text-to-Speech Dataset for Multimodal Synthetic Media Detection
With the rapid development of deep learning techniques, the generation and
counterfeiting of multimedia material are becoming increasingly straightforward
to perform. At the same time, sharing fake content on the web has become so
simple that malicious users can create unpleasant situations with minimal
effort. Forged media are also getting more and more complex, with manipulated
videos overtaking still images. The multimedia forensic
community has addressed the possible threats that this situation could imply by
developing detectors that verify the authenticity of multimedia objects.
However, the vast majority of these tools only analyze one modality at a time.
This was not a problem as long as still images were considered the most widely
edited media, but now, since manipulated videos are becoming customary,
performing monomodal analyses could be reductive. However, the literature
lacks multimodal detectors, mainly due to the scarcity of datasets containing
forged multimodal data to train and test the designed
algorithms. In this paper we focus on the generation of an audio-visual
deepfake dataset. First, we present a general pipeline for synthesizing speech
deepfake content from a given real or fake video, facilitating the creation of
counterfeit multimodal material. The proposed method uses Text-to-Speech (TTS)
and Dynamic Time Warping techniques to achieve realistic speech tracks. Then,
we use the pipeline to generate and release TIMIT-TTS, a synthetic speech
dataset generated with the most cutting-edge methods in the TTS field. It can be
used as a standalone audio dataset, or combined with other state-of-the-art
sets to perform multimodal research. Finally, we present numerous experiments
to benchmark the proposed dataset in both monomodal and multimodal conditions,
showing the need for multimodal forensic detectors and more suitable data
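The alignment step mentioned in the abstract, warping a TTS track onto the timing of the reference audio, rests on dynamic time warping. Below is a minimal sketch of classic DTW over 1-D feature sequences; the absolute-difference cost and the toy inputs are our simplifications, not the TIMIT-TTS pipeline's actual features:

```python
import numpy as np

def dtw_path(a, b):
    """Return the optimal DTW alignment between two 1-D sequences as a
    list of (i, j) index pairs, using absolute difference as the local
    cost and the standard symmetric step pattern."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    # Backtrack from (n, m) to (1, 1) to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        step = int(np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
        path.append((i - 1, j - 1))
    return path[::-1]

if __name__ == "__main__":
    synth = [1.0, 2.0, 3.0]   # e.g. per-frame energies of the TTS track
    ref = [1.0, 2.0, 2.0, 3.0]  # reference track holds the second frame longer
    print(dtw_path(synth, ref))  # [(0, 0), (1, 1), (1, 2), (2, 3)]
```

The path tells the pipeline which synthetic frames to stretch or repeat so that the generated speech follows the reference track's timing.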
The audio auditor: user-level membership inference in Internet of Things voice services
With the rapid development of deep learning techniques, the popularity of voice services implemented on various Internet of Things (IoT) devices is ever increasing. In this paper, we examine user-level membership inference in the problem space of voice services by designing an audio auditor to verify whether a specific user had unwillingly contributed audio used to train an automatic speech recognition (ASR) model under strict black-box access. Using user representations of the input audio data and their corresponding transcribed text, our trained auditor is effective in user-level auditing. We also observe that an auditor trained on specific data generalizes well regardless of the ASR model architecture. We validate the auditor on ASR models trained with LSTM, RNN, and GRU architectures on two state-of-the-art pipelines, the hybrid ASR system and the end-to-end ASR system. Finally, we conduct a real-world trial of our auditor on iPhone Siri, achieving an overall accuracy exceeding 80%. We hope the methodology developed in this paper and our findings will help privacy advocates to overhaul IoT privacy
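At its core, such an auditor is a binary classifier over features derived from black-box ASR queries (the paper trains it on user-level representations of audio and its transcription). The numpy-only sketch below fits a logistic-regression auditor on synthetic stand-in features; the feature design, hyperparameters, and toy data premise are illustrative assumptions, not the paper's method:

```python
import numpy as np

def train_auditor(feats, labels, lr=0.5, steps=2000):
    """Fit a logistic-regression auditor by gradient descent.
    feats: (n, d) query-derived features (e.g., transcription-similarity
    scores); labels: 1 = the user's audio was in the ASR training set."""
    X = np.hstack([feats, np.ones((len(feats), 1))])  # append bias column
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))      # predicted membership prob.
        w -= lr * X.T @ (p - labels) / len(labels)
    return w

def audit(w, feats):
    """Predict member (1) / non-member (0) for each feature row."""
    X = np.hstack([feats, np.ones((len(feats), 1))])
    return (X @ w > 0.0).astype(int)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy premise: a model transcribes audio it was trained on more
    # faithfully, so members' similarity features score higher.
    members = rng.normal(0.85, 0.05, (200, 2))
    outsiders = rng.normal(0.55, 0.05, (200, 2))
    X = np.vstack([members, outsiders])
    y = np.concatenate([np.ones(200), np.zeros(200)])
    w = train_auditor(X, y)
    print((audit(w, X) == y).mean())  # high accuracy on separable toy data
```

The real auditor faces far less separable features and strict query budgets, which is what makes the paper's 80%+ accuracy against a deployed system like Siri notable.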
Availability, Integrity, and Confidentiality for Content Centric Network internet architectures
The Internet as we know it today, despite being "the result of a series of accidents of choices" in Prof. Jon Crowcroft's words, has undoubtedly been an amazing success story. However, it has been constantly challenged by the demands of the overwhelming evolution of data traffic types, non-functional needs of applications and users, and device diversity. The phrase "future internet architecture" can be interpreted as referring to a revised set of design principles. As Dr David Clark rightfully suggested, we need to "allow for the future in the face of the present". Content Centric Networking (CCN) is one of the candidates for a future internet architecture. Security is one of the most significant considerations while designing a future internet architecture. Availability, Integrity, and Confidentiality (AIC) are considered the three most crucial components of security: 1) availability is the assurance of continuous, reliable, and uninterrupted access to the information by authorized people, 2) integrity is the preservation of information and prevention of any change in it caused via accident or malicious intent, and 3) confidentiality is the ability to keep the information secret from unintended audiences, intruders, and adversaries. This thesis discusses AIC-related security threats and corresponding remedies for Named Data Networking (NDN), which is a promising example of CCN. It also presents a system dynamics modelling approach to bridge the gap between the technical solutions and business strategy by quantifying some of the qualitative variables salient to technology architects, policymakers, lawmakers, regulators, and internet service providers for the design of a future-proof internet architecture
Cybersecurity: Past, Present and Future
The digital transformation has created a new digital space known as
cyberspace. This new cyberspace has improved the workings of businesses,
organizations, governments, society as a whole, and the day-to-day life of an
individual. With these improvements come new challenges, and one of the main
challenges is security. The security of the new cyberspace is called
cybersecurity. Cyberspace has created new technologies and environments such as
cloud computing, smart devices, the IoT, and several others. To keep pace with
these advancements in cyber technologies there is a need to expand research and
develop new cybersecurity methods and tools to secure these domains and
environments. This book is an effort to introduce the reader to the field of
cybersecurity, highlight current issues and challenges, and provide future
directions to mitigate or resolve them. The main specializations of
cybersecurity covered in this book are software security, hardware security,
the evolution of malware, biometrics, cyber intelligence, and cyber forensics.
We must learn from the past, evolve our present and improve the future. Based
on this objective, the book covers the past, present, and future of these main
specializations of cybersecurity. The book also examines the upcoming areas of
research in cyber intelligence, such as hybrid augmented and explainable
artificial intelligence (AI). Human and AI collaboration can significantly
increase the performance of a cybersecurity system. Interpreting and explaining
machine learning models, i.e., explainable AI, is an emerging field of study
and has great potential to improve the role of AI in cybersecurity.
Comment: Author's copy of the book published under ISBN: 978-620-4-74421-
Project BeARCAT: Baselining, Automation and Response for CAV Testbed Cyber Security: Connected Vehicle & Infrastructure Security Assessment
Connected, software-based systems are a driver in advancing the technology of transportation systems. Advanced automated and autonomous vehicles, together with electrification, will help reduce congestion, accidents and emissions. Meanwhile, vehicle manufacturers see advanced technology as enhancing their products in a competitive market. However, as many decades of using home and enterprise computer systems have shown, connectivity allows a system to become a target for criminal intentions. Cyber-based threats to any system are a problem; in transportation, there is the added safety implication of dealing with moving vehicles and the passengers within
The Proceedings of 15th Australian Information Security Management Conference, 5-6 December, 2017, Edith Cowan University, Perth, Australia
Conference Foreword
The annual Security Congress, run by the Security Research Institute at Edith Cowan University, includes the Australian Information Security and Management Conference. Now in its fifteenth year, the conference remains popular for its diverse content and mixture of technical research and discussion papers. The area of information security and management continues to be varied, as is reflected by the wide variety of subject matter covered by the papers this year. The papers cover topics from vulnerabilities in "Internet of Things" protocols through to improvements in biometric identification algorithms and surveillance camera weaknesses. The conference has drawn interest and papers from within Australia and internationally. All submitted papers were subject to a double-blind peer review process. Twenty-two papers were submitted from Australia and overseas, of which eighteen were accepted for final presentation and publication. We wish to thank the reviewers for kindly volunteering their time and expertise in support of this event. We would also like to thank the conference committee who have organised yet another successful congress. Events such as this are impossible without the tireless efforts of such people in reviewing and editing the conference papers, and assisting with the planning, organisation and execution of the conference. To our sponsors, also a vote of thanks for both the financial and moral support provided to the conference. Finally, thank you to the administrative and technical staff, and students of the ECU Security Research Institute for their contributions to the running of the conference