410 research outputs found

    Acoustic-channel attack and defence methods for personal voice assistants

    Personal Voice Assistants (PVAs) are increasingly used as an interface to digital environments. Voice commands are used to interact with phones, smart homes or cars. In the US alone, the number of smart speakers such as Amazon’s Echo and Google Home has grown by 78% to 118.5 million, and 21% of the US population own at least one device. Given society’s increasing dependency on PVAs, their security and privacy have become a major concern of users, manufacturers and policy makers. Consequently, a steep increase in research efforts addressing PVA security and privacy can be observed in recent years. Although some security and privacy research applicable to the PVA domain predates their recent surge in popularity, and many new research strands have emerged, research dedicated to PVA security and privacy is still lacking. The most important interaction interface between users and a PVA is the acoustic channel, and security and privacy studies related to the acoustic channel are therefore required. The aim of the work presented in this thesis is to improve the understanding of security and privacy issues of PVA usage related to the acoustic channel, to propose principles and solutions for key usage scenarios to mitigate potential security threats, and to present a novel type of dangerous attack which can be launched using a PVA alone. The five core contributions of this thesis are: (i) a taxonomy is built for the research domain of PVA security and privacy issues related to the acoustic channel. An extensive overview of the state of the art is provided, describing a comprehensive research map for PVA security and privacy. The taxonomy also shows where the contributions of this thesis lie; (ii) work has emerged aiming to generate adversarial audio inputs which sound harmless to humans but can trick a PVA into recognising harmful commands.
The majority of this work has focused on the attack side, and little work exists on defending against this type of attack. A defence method against white-box adversarial commands is proposed and implemented as a prototype. It is shown that a defence Automatic Speech Recognition (ASR) system can work in parallel with the PVA’s main one, and adversarial audio input is detected if the difference in the speech decoding results between the two ASRs surpasses a threshold. It is demonstrated that an ASR that differs in architecture and/or training data from the PVA’s main ASR is usable as protection ASR; (iii) PVAs continuously monitor conversations, which may be transported to a cloud back end where they are stored, processed and perhaps even passed on to other service providers. A user has limited control over this process when a PVA is triggered without the user’s intent or when a PVA belongs to others. A user is unable to control the recording behaviour of surrounding PVAs, unable to signal privacy requirements and unable to track conversation recordings. An acoustic tagging solution is proposed that embeds additional information into acoustic signals processed by PVAs. A user employs a tagging device which emits an acoustic signal when PVA activity is assumed. Any active PVA will embed this tag into its recorded audio stream. The tag may signal to a cooperating PVA or back-end system that a user has not given recording consent. The tag may also be used to trace when and where a recording was taken, if necessary. A prototype tagging device based on PocketSphinx is implemented. Using a Google Home Mini as the PVA, it is demonstrated that the device can tag conversations and that the tagging signal can be retrieved from conversations stored in the Google back-end system; (iv) acoustic tagging gives users the capability to signal their permission to the back-end PVA service, and a further solution inspired by Denial of Service (DoS) is proposed for protecting user privacy.
Although PVAs are very helpful, they also continuously monitor conversations. When a PVA detects a wake word, the immediately following conversation is recorded and transported to a cloud system for further analysis. An active protection mechanism is proposed: reactive jamming. A Protection Jamming Device (PJD) is employed to observe conversations. Upon detection of a PVA wake word, the PJD emits an acoustic jamming signal. The PJD must detect the wake word faster than the PVA such that the jamming signal still prevents wake word detection by the PVA. An evaluation of the effectiveness of different jamming signals and of the overlap between wake words and jamming signals is carried out. 100% jamming success can be achieved with an overlap of at least 60% and a negligible false positive rate; (v) acoustic components (speakers and microphones) on a PVA can potentially be re-purposed for acoustic sensing. This has great security and privacy implications due to the key role of PVAs in digital environments. The first active acoustic side-channel attack is proposed. Speakers are used to emit human-inaudible acoustic signals and the echo is recorded via microphones, turning the acoustic system of a smartphone into a sonar system. The echo signal can be used to profile user interaction with the device. For example, a victim’s finger movement can be monitored to steal Android unlock patterns. The number of candidate unlock patterns that an attacker must try to authenticate to a Samsung S4 phone can be reduced by up to 70% using this novel, unnoticeable acoustic side channel.
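The parallel-ASR defence in contribution (ii) hinges on comparing two decodings of the same audio. A minimal sketch of that comparison, assuming a word-level edit distance and an illustrative threshold of 0.3 (the abstract does not specify the exact metric or threshold):

```python
def word_edit_distance(ref, hyp):
    """Levenshtein distance over word tokens, with a rolling DP row."""
    r, h = ref.split(), hyp.split()
    d = list(range(len(h) + 1))
    for i in range(1, len(r) + 1):
        prev, d[0] = d[0], i
        for j in range(1, len(h) + 1):
            cur = d[j]
            # deletion, insertion, or substitution (free if words match)
            d[j] = min(d[j] + 1, d[j - 1] + 1, prev + (r[i - 1] != h[j - 1]))
            prev = cur
    return d[len(h)]

def is_adversarial(main_transcript, protection_transcript, threshold=0.3):
    """Flag input when the two ASR decodings diverge beyond a threshold."""
    dist = word_edit_distance(main_transcript, protection_transcript)
    return dist / max(len(main_transcript.split()), 1) > threshold
```

For benign audio both ASRs produce near-identical transcripts, so the normalized distance stays small; an adversarial sample crafted in a white-box fashion against the main ASR typically decodes very differently on a protection ASR with a different architecture or training data.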

    Security and Privacy Problems in Voice Assistant Applications: A Survey

    Voice assistant applications have become ubiquitous nowadays. The two models that provide the most important functions for real-life applications (e.g., Google Home, Amazon Alexa, Siri) are Automatic Speech Recognition (ASR) models and Speaker Identification (SI) models. According to recent studies, security and privacy threats have also emerged with the rapid development of the Internet of Things (IoT). The security issues researched include attack techniques against machine learning models and other hardware components widely used in voice assistant applications. The privacy issues include information stealing by technical means and policy-level privacy breaches. Voice assistant applications take a steadily growing market share every year, but their privacy and security issues continue to cause huge economic losses and to endanger users' personal sensitive information. It is therefore important to have a comprehensive survey outlining the categorization of current research on the security and privacy problems of voice assistant applications. This paper collects and assesses five kinds of security attacks and three types of privacy threats in papers published in the top-tier conferences of the cyber security and voice domains.


    Analysing and Preventing Self-Issued Voice Commands


    Computational Intelligence and Human- Computer Interaction: Modern Methods and Applications

    The present book contains all of the articles that were accepted and published in the Special Issue of MDPI’s journal Mathematics titled "Computational Intelligence and Human–Computer Interaction: Modern Methods and Applications". This Special Issue covered a wide range of topics connected to the theory and application of different computational intelligence techniques to the domain of human–computer interaction, such as automatic speech recognition, speech processing and analysis, virtual reality, emotion-aware applications, digital storytelling, natural language processing, smart cars and devices, and online learning. We hope that this book will be interesting and useful for those working in various areas of artificial intelligence, human–computer interaction, and software engineering as well as for those who are interested in how these domains are connected in real-life situations

    A privacy-preserving AI-based Intent Recognition engine with Probabilistic Spell-Editing for an Italian Smart Home Voice Assistant

    In recent decades, the market for Smart Home devices has expanded considerably. Among the interfaces for sending commands to these devices, virtual assistants, textual and/or vocal, are of particular interest, especially because they can offer more independence to people with disabilities and to the elderly, a group growing significantly in Italy. Unfortunately, current market solutions such as smart speakers rely on sending commands to remote servers, raising more or less legitimate privacy concerns. The open-source alternatives currently available, on the other hand, are not very accurate for the Italian language. The goal of this thesis is to develop a new Intent Recognition engine, called Converso, for Italian-language home-automation assistants that can be integrated into local platforms such as Home Assistant. To this end, a synthetic dataset was generated and pre-processed with Word2Vec embeddings to train Machine Learning models for intent and slot classification; in addition, an N-gram-based algorithm was developed to correct spelling or speech-recognition errors. The resulting conversational agent, which uses a Support Vector Machine and requires no connection to remote servers, was evaluated in an experiment under realistic conditions, demonstrating an accuracy above 60%.
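The N-gram correction step described above can be illustrated with a small sketch. This is not the Converso implementation; it assumes character-bigram Jaccard similarity against a fixed command vocabulary, which is one common way such a corrector is built:

```python
def ngrams(word, n=2):
    """Character n-grams of a word, padded with '#' boundary markers."""
    padded = f"#{word}#"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def correct(token, vocabulary, min_sim=0.4):
    """Map a misspelled or misrecognized token to the closest vocabulary
    word by bigram Jaccard similarity; keep it unchanged if nothing is close."""
    best_word, best_sim = token, 0.0
    tg = ngrams(token)
    for word in vocabulary:
        wg = ngrams(word)
        sim = len(tg & wg) / len(tg | wg)
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word if best_sim >= min_sim else token
```

With an Italian command vocabulary such as {"accendi", "spegni", "luce"}, an ASR slip like "acendi" maps back to "accendi", while an out-of-vocabulary token passes through unchanged.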

    Your Voice Gave You Away: the Privacy Risks of Voice-Inferred Information

    Our voices can reveal intimate details about our lives. Yet, many privacy discussions have focused on the threats from speaker recognition and speech recognition. This Note argues that this focus overlooks another privacy risk: voice-inferred information. This term describes non-obvious information drawn from voice data through a combination of machine learning, artificial intelligence, data mining, and natural language processing. Companies have latched onto voice-inferred information. Early adopters have applied the technology in situations as varied as lending risk analysis and hiring. Consumers may balk at such strategies, but the current United States privacy regime leaves voice insights unprotected. By applying a notice and consent privacy model via sector-specific statutes, the hodgepodge of U.S. federal privacy laws allows voice-inferred information to slip through the regulatory cracks. This Note reviews the current legal landscape and identifies existing gaps. It then suggests two solutions that balance voice privacy with technological innovation: purpose-based consent and independent data review boards. The first bolsters voice protection within the traditional notice and consent framework, while the second imagines a new protective scheme. Together, these solutions complement each other to afford the human voice the protection it deserves.

    Understanding and Securing Voice Assistant Applications

    Internet of Things (IoT) has evolved from a traditional sensor network to an increasingly cloud-dependent ecosystem. This transition empowers IoT devices with abundant outsourced computational power. However, securing IoT devices is still a challenging task. The reason is that many IoT devices nowadays perform complicated tasks (e.g., voice assistant (VA) services) and are connected to different third parties. This research targets popular VA services such as Amazon Alexa and Google Assistant, which are rapidly appifying their platforms to allow a more flexible and diverse voice-controlled service experience. Unfortunately, third-party skills have been reportedly posing threats to user privacy and security. The goal of this research is to conduct a systematic security analysis of the different stages of a VA system, i.e., acoustic channel, speech processing, intent extraction, and application processing. Moreover, based on the analysis, corresponding defense strategies are proposed and evaluated. First, I investigate speech re-use problems in the acoustic channel. I then propose a security overlay named AEOLUS to tackle the speech re-use threat. Second, I study the speech processing stage by evaluating adversarial attacks targeting VA speaker recognition systems. I present a novel attention-based audio perturbation scheme to help improve the efficiency and imperceptibility of generating audio adversarial examples. Third, I assess the intent extraction of VA to understand the root cause of semantic misinterpretation. A linguistic-guided fuzzing scheme is then proposed to evaluate the problem systematically at large scale. Fourth, for the VA application (or skill) processing stage, I conduct a user study with Alexa users to learn how users perceive existing warning messages for voice assistant applications.
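The speech re-use threat above can be illustrated with a hedged sketch: one way to flag re-used audio is to embed a pseudorandom nonce into any audio the device plays back, then look for that nonce in later recordings via sliding correlation. The nonce scheme, signal lengths and threshold below are illustrative assumptions, not the actual AEOLUS design:

```python
import random

def make_nonce(length=256, seed=7):
    """Deterministic pseudorandom +/-1 sequence to embed into playback."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def detect_nonce(recording, nonce, threshold=0.5):
    """Slide the nonce over the recording; a high normalized correlation
    peak suggests the audio was previously emitted by the device."""
    n = len(nonce)
    best = 0.0
    for offset in range(len(recording) - n + 1):
        corr = sum(recording[offset + i] * nonce[i] for i in range(n)) / n
        best = max(best, abs(corr))
    return best >= threshold
```

A recording that contains the embedded nonce produces a correlation peak near 1.0 at the embedding offset, while fresh speech or background noise stays far below the threshold, so replayed commands can be rejected.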