Adaptive Multi-Class Audio Classification in Noisy In-Vehicle Environment
With the ever-increasing number and complexity of car-mounted electronic
devices, audio classification is increasingly important to the automotive
industry as a fundamental tool for human-device interaction. Existing
approaches to audio classification, however, fall short because the unique and
dynamic audio characteristics of in-vehicle environments are not appropriately
taken into account. In this paper, we develop an audio classification system
that classifies an audio stream into music, speech, speech+music, and noise,
adapting to driving environments including highway, local road, crowded city,
and stopped vehicle. More than 420 minutes of audio data, including various
genres of music, speech, speech+music, and noise, were collected from diverse
driving environments. The results demonstrate that, in our experimental
settings, the proposed approach improves the average classification accuracy
by up to 166% and 64% for speech and speech+music, respectively, compared with
a non-adaptive approach.
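The core idea of the abstract above, selecting a classifier suited to the current driving environment rather than using one fixed model, can be sketched as follows. This is a minimal illustrative stub, not the paper's actual system: the four driving environments and four output classes come from the abstract, while the noise-floor values and the threshold rule inside each stub model are invented for illustration.

```python
# Environment-adaptive audio classification (illustrative sketch).
# One model per driving environment, so decision rules can account
# for that environment's typical background noise.

CLASSES = ["music", "speech", "speech+music", "noise"]

def make_stub_model(noise_floor):
    """Return a toy classifier: frames whose energy falls below the
    environment's noise floor are labelled 'noise'; otherwise a fixed
    threshold rule on (hypothetical) speech/music scores applies."""
    def classify(energy, speech_score, music_score):
        if energy < noise_floor:
            return "noise"
        if speech_score > 0.5 and music_score > 0.5:
            return "speech+music"
        if speech_score > 0.5:
            return "speech"
        return "music"
    return classify

# One model per driving environment; noise floors are made up.
MODELS = {
    "highway": make_stub_model(noise_floor=0.4),
    "local":   make_stub_model(noise_floor=0.3),
    "city":    make_stub_model(noise_floor=0.35),
    "stopped": make_stub_model(noise_floor=0.1),
}

def adaptive_classify(environment, energy, speech_score, music_score):
    """Dispatch to the model trained for the current environment."""
    return MODELS[environment](energy, speech_score, music_score)
```

The same input can thus be labelled `noise` on a highway but `speech` in a stopped vehicle, which is the adaptivity the abstract argues a single non-adaptive model cannot provide.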
Breaking Audio Captcha using Machine Learning/Deep Learning and Related Defense Mechanism
CAPTCHA is a web-based authentication method used by websites to distinguish between humans (valid users) and bots (attackers). Audio CAPTCHA is an accessible CAPTCHA meant for visually impaired users, such as color-blind, blind, and near-sighted users. In this project, I analyzed the security of audio CAPTCHAs against attacks that employ machine learning and deep learning models. Audio CAPTCHAs of varying lengths (5, 7, and 10) and varying background noise (no noise, medium noise, or high noise) were analyzed. I found that audio CAPTCHAs with no or medium background noise were easily attacked, with 99%-100% accuracy, whereas audio CAPTCHAs with high noise were relatively more secure, with a breaking accuracy of 85%. I also propose that adversarial example attacks can be used in favor of audio CAPTCHA; that is, they can be used to defend audio CAPTCHAs from attackers. I explored two adversarial example attack algorithms, the Basic Iterative Method (BIM) and the DeepFool method, to create new adversarial audio CAPTCHAs. Finally, I analyzed the security of these newly created adversarial audio CAPTCHAs by simulating Level I and Level II defense scenarios. A Level I defense is a defense against pre-trained models that have never seen adversarial examples before, whereas a Level II defense is a defense against models that have been re-trained on adversarial examples. My experiments show that a Level I defense can prevent nearly 100% of attacks from pre-trained models. They also show that a Level II defense increases the security of audio CAPTCHA by 57% to 67%. Real-world scenarios such as multiple retries are also studied, and related defense mechanisms are suggested.
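The Basic Iterative Method named in the abstract repeatedly nudges the input in the direction of the loss gradient's sign and clips the result to an epsilon-ball around the original sample. The sketch below shows the mechanic on a toy linear "recognizer" rather than on real audio or a neural network; the weight vector, step size, and epsilon are all invented for illustration.

```python
import numpy as np

# Basic Iterative Method (BIM), illustrated on a toy linear model:
#   score(x) = w @ x, predicted "positive" if score > 0.
# Each step moves x against the score gradient's sign, then clips
# the perturbation back into an eps-ball around the original input.

def bim_attack(x, w, eps=0.5, alpha=0.05, steps=20):
    """Perturb x to reduce w @ x (i.e., flip a positive prediction)."""
    x = np.asarray(x, dtype=float)
    x_adv = x.copy()
    for _ in range(steps):
        grad = w                                  # d(score)/dx is just w
        x_adv = x_adv - alpha * np.sign(grad)     # signed descent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay within eps-ball
    return x_adv

w = np.array([1.0, -2.0, 0.5])     # toy recognizer weights
x = np.array([0.6, -0.2, 0.4])     # original input, w @ x = 1.2 > 0
x_adv = bim_attack(x, w)           # small perturbation flips the sign
```

Used defensively, as the project proposes, the CAPTCHA provider would apply such a perturbation to its own audio so that an attacker's pre-trained model (the Level I scenario) mislabels it while the sound remains intelligible to humans.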