2 research outputs found

Design and Optimization of a Speech Recognition Front-End for Distant-Talking Control of a Music Playback Device

Authors: Atkins Joshua, Giacobello Daniele, Pichevar Ramin, Wung Jason
Publication date: 05/05/2014

This paper addresses the challenging scenario of distant-talking control of a music playback device, a common portable speaker with four small loudspeakers in close proximity to one microphone. The user controls the device through voice, where the speech-to-music ratio can be as low as -30 dB during music playback. We propose a speech enhancement front-end that relies on known robust methods for echo cancellation, double-talk detection, and noise suppression, as well as a novel adaptive quasi-binary mask that is well suited for speech recognition. The optimization of the system is then formulated as a large-scale nonlinear programming problem where the recognition rate is maximized and the optimal values for the system parameters are found through a genetic algorithm. We validate our methodology by testing over the TIMIT database for different music playback levels and noise types. Finally, we show that the proposed front-end allows a natural interaction with the device for limited-vocabulary voice commands.
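
To give a sense of scale for the -30 dB speech-to-music ratio quoted in the abstract, the following is a minimal sketch of the standard decibel conversion for average signal powers. The function names and the sample power values are illustrative assumptions and are not taken from the paper.

```python
import math


def power_ratio_db(speech_power: float, music_power: float) -> float:
    """Speech-to-music ratio in dB, given two average powers (hypothetical helper)."""
    if speech_power <= 0 or music_power <= 0:
        raise ValueError("powers must be positive")
    return 10.0 * math.log10(speech_power / music_power)


def db_to_power_ratio(ratio_db: float) -> float:
    """Convert a ratio in dB back to a linear power ratio."""
    return 10.0 ** (ratio_db / 10.0)


def db_to_amplitude_ratio(ratio_db: float) -> float:
    """Convert a ratio in dB to a linear amplitude ratio."""
    return 10.0 ** (ratio_db / 20.0)


# Assumed example values: music carrying 1000 times the power of the speech.
smr_db = power_ratio_db(speech_power=1.0, music_power=1000.0)
print(f"speech-to-music ratio: {smr_db:.1f} dB")
print(f"linear power ratio at -30 dB: {db_to_power_ratio(-30.0):.4f}")
print(f"linear amplitude ratio at -30 dB: {db_to_amplitude_ratio(-30.0):.4f}")
```

Under these assumptions, a ratio of -30 dB means the music at the microphone has about a thousand times the power of the speech, or roughly 32 times its amplitude.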

arXiv.org e-Print Archive

Trends and Perspectives for Signal Processing in Consumer Audio

Authors: Atkins Joshua, Giacobello Daniele
Publication date: 19/05/2014

The trend in media consumption towards streaming and portability offers new challenges and opportunities for signal processing in audio and acoustics. The most significant embodiment of this trend is that most music consumption now happens on the go, which has recently led to an explosion in sales of headphones and small portable speakers. In particular, premium headphones offer a gateway for a younger generation to experience high-quality sound. Additionally, through technologies incorporating head-related transfer functions, headphones can also offer unique new experiences in gaming, augmented reality, and surround sound listening. Home audio has also seen a transition to smaller sound systems in the form of sound bars. This speaker configuration offers many exciting challenges for surround sound reproduction, which has traditionally used five speakers surrounding the listener. Furthermore, modern home entertainment systems offer more than just content delivery; users now expect wireless and connected smart devices with video conferencing, gaming, and other interactive capabilities. With this come challenges for voice interaction at a distance and in demanding conditions, e.g., during content playback, and opportunities for new smart interactive experiences based on awareness of environment and user biometrics.

Comment: IEEE Audio and Acoustic Signal Processing Technical Committee Newsletter, May 201

arXiv.org e-Print Archive