Weighted-Sampling Audio Adversarial Example Attack
Recent studies have highlighted audio adversarial examples as a ubiquitous
threat to state-of-the-art automatic speech recognition systems. Thorough
studies on how to effectively generate adversarial examples are essential to
prevent potential attacks. Despite much research on this topic, the efficiency
and robustness of existing methods remain unsatisfactory. In this paper, we
propose weighted-sampling audio adversarial examples, which focus on the number
and the weights of the distortion points to strengthen the attack. Further, we
apply a denoising method in the loss function to make the adversarial attack
more imperceptible. Experiments show that our method is the first in the field
to generate audio adversarial examples with low noise and high robustness at a
minute-level time cost.
Comment: https://aaai.org/Papers/AAAI/2020GB/AAAI-LiuXL.9260.pd
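The idea of folding a denoising term into the attack objective can be sketched numerically. This is a minimal illustration under toy assumptions: a quadratic objective stands in for a real ASR model, and the names `combined_loss`, `attack_fn`, and the weight `alpha` are illustrative, not the paper's actual formulation.

```python
import numpy as np

def combined_loss(delta, audio, attack_fn, alpha=0.1):
    # Attack term (toy stand-in for the ASR targeting loss) plus a
    # denoising-style penalty that discourages sample-to-sample jitter
    # in the perturbation, keeping the noise less perceptible.
    attack = attack_fn(audio + delta)
    denoise = np.mean(np.diff(delta) ** 2)
    return attack + alpha * denoise

rng = np.random.default_rng(0)
audio = rng.standard_normal(32)
target = rng.standard_normal(32)
attack_fn = lambda x: np.mean((x - target) ** 2)  # toy "model" objective

# Optimize the perturbation by finite-difference gradient descent.
delta = np.zeros_like(audio)
lr, eps = 0.2, 1e-4
for _ in range(100):
    base = combined_loss(delta, audio, attack_fn)
    grad = np.zeros_like(delta)
    for i in range(delta.size):
        d = delta.copy()
        d[i] += eps
        grad[i] = (combined_loss(d, audio, attack_fn) - base) / eps
    delta -= lr * grad

initial = combined_loss(np.zeros_like(audio), audio, attack_fn)
final = combined_loss(delta, audio, attack_fn)
```

In a real attack the quadratic objective would be replaced by the recognizer's loss on a target transcription; the structure of the combined objective is the point here.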
Phonemic Adversarial Attack against Audio Recognition in Real World
Recently, adversarial attacks for audio recognition have attracted much
attention. However, most existing studies rely on coarse-grained audio features
at the instance level to generate adversarial noise, which leads to expensive
generation time and weak universal attacking ability. Motivated by the
observation that all speech is composed of fundamental phonemes, this paper
proposes a phonemic adversarial attack (PAT) paradigm, which attacks the
fine-grained audio features at the phoneme level commonly shared across audio
instances to generate phonemic adversarial noise, yielding more general
attacking ability with fast generation speed.
Specifically, to accelerate generation, a phoneme-density-balanced sampling
strategy is introduced that estimates phoneme density and samples fewer, but
phonemically richer, audio instances as training data, substantially
alleviating the heavy dependency on large training datasets. Moreover, to
promote universal attacking ability, the phonemic noise is optimized
asynchronously with a sliding window, which enhances phoneme diversity and thus
captures the critical fundamental phonemic patterns. Extensive experiments
comprehensively investigate the proposed PAT framework and demonstrate that it
outperforms the SOTA baselines by large margins (i.e., at least an 11X speedup
and a 78% improvement in attacking ability).
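The sampling strategy described above can be sketched as a greedy selection. This is an illustrative stand-in, not the paper's actual density estimator: the data format and the "most new phonemes first" criterion are assumptions made for the example.

```python
def density_balanced_sample(utterances, k):
    # Greedy sketch: repeatedly pick the utterance that contributes the
    # most phonemes not yet covered, so a small training sample still
    # spans the phoneme inventory instead of repeating common phonemes.
    chosen, covered = [], set()
    pool = list(utterances)
    for _ in range(min(k, len(pool))):
        best = max(pool, key=lambda u: len(set(u["phonemes"]) - covered))
        chosen.append(best)
        covered |= set(best["phonemes"])
        pool.remove(best)
    return chosen

# Hypothetical corpus with ARPABET-style phoneme labels.
corpus = [
    {"id": 0, "phonemes": ["AA", "B", "K"]},
    {"id": 1, "phonemes": ["AA", "AA", "B"]},
    {"id": 2, "phonemes": ["S", "T", "IY", "M"]},
    {"id": 3, "phonemes": ["K", "B"]},
]
sample = density_balanced_sample(corpus, 2)
# Picks utterance 2 (4 new phonemes), then utterance 0 (3 new phonemes).
```

Selecting two utterances here already covers seven distinct phonemes, which is the effect the strategy aims for: less data, broader phonemic coverage.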
Privacy-preserving and Privacy-attacking Approaches for Speech and Audio -- A Survey
In contemporary society, voice-controlled devices, such as smartphones and
home assistants, have become pervasive due to their advanced capabilities and
functionality. The always-on nature of their microphones offers users the
convenience of readily accessing these devices. However, recent research and
events have revealed that such voice-controlled devices are prone to various
forms of malicious attacks, making safeguarding against them a growing concern
for both users and researchers. Despite the numerous studies
that have investigated adversarial attacks and privacy preservation for images,
a conclusive study of this nature has not been conducted for the audio domain.
Therefore, this paper aims to examine existing approaches for
privacy-preserving and privacy-attacking strategies for audio and speech. To
achieve this goal, we classify the attack and defense scenarios into several
categories and provide detailed analysis of each approach. We also interpret
the dissimilarities between the various approaches, highlight their
contributions, and examine their limitations. Our investigation reveals that
voice-controlled devices based on neural networks are inherently susceptible to
specific types of attacks. Although it is possible to enhance the robustness of
such models to certain forms of attack, more sophisticated approaches are
required to comprehensively safeguard user privacy.