Search CORE

32,462 research outputs found

Protecting Voice Controlled Systems Using Sound Source Identification Based on Acoustic Cues

Author: Gong Yuan
Poellabauer Christian
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/11/2018
Field of study

Over the last few years, a rapidly increasing number of Internet-of-Things (IoT) systems that adopt voice as the primary user input have emerged. These systems have been shown to be vulnerable to various types of voice spoofing attacks. Existing defense techniques can usually only protect from a specific type of attack or require an additional authentication step that involves another device. Such defense strategies are either not strong enough or lower the usability of the system. Based on the fact that legitimate voice commands should only come from humans rather than a playback device, we propose a novel defense strategy that is able to detect the sound source of a voice command based on its acoustic features. The proposed defense strategy does not require any information other than the voice command itself and can protect a system from multiple types of spoofing attacks. Our proof-of-concept experiments verify the feasibility and effectiveness of this defense strategy.Comment: Proceedings of the 27th International Conference on Computer Communications and Networks (ICCCN), Hangzhou, China, July-August 2018. arXiv admin note: text overlap with arXiv:1803.0915

arXiv.org e-Print Archive

Crossref

Universal Adversarial Perturbations for Speech Recognition Systems

Author: Dubnov Shlomo
Hussain Shehzeen
Koushanfar Farinaz
McAuley Julian
Neekhara Paarth
Pandey Prakhar
Publication venue
Publication date: 01/01/2019
Field of study

In this work, we demonstrate the existence of universal adversarial audio perturbations that cause mis-transcription of audio signals by automatic speech recognition (ASR) systems. We propose an algorithm to find a single quasi-imperceptible perturbation, which when added to any arbitrary speech signal, will most likely fool the victim speech recognition model. Our experiments demonstrate the application of our proposed technique by crafting audio-agnostic universal perturbations for the state-of-the-art ASR system -- Mozilla DeepSpeech. Additionally, we show that such perturbations generalize to a significant extent across models that are not available during training, by performing a transferability test on a WaveNet based ASR system.Comment: Published as a conference paper at INTERSPEECH 201

arXiv.org e-Print Archive

eScholarship - University of California