Acoustic Impulse Responses for Wearable Audio Devices
We present an open-access dataset of over 8000 acoustic impulse responses from 160
microphones spread across the body and affixed to wearable accessories. The
data can be used to evaluate audio capture and array processing systems using
wearable devices such as hearing aids, headphones, eyeglasses, jewelry, and
clothing. We analyze the acoustic transfer functions of different parts of the
body, measure the effects of clothing worn over microphones, compare
measurements from a live human subject to those from a mannequin, and simulate
the noise-reduction performance of several beamformers. The results suggest
that arrays of microphones spread across the body are more effective than those
confined to a single device. Comment: To appear at ICASSP 201
RTF-Based Binaural MVDR Beamformer Exploiting an External Microphone in a Diffuse Noise Field
Besides suppressing all undesired sound sources, an important objective of a
binaural noise reduction algorithm for hearing devices is the preservation of
the binaural cues, aiming at preserving the spatial perception of the acoustic
scene. A well-known binaural noise reduction algorithm is the binaural minimum
variance distortionless response beamformer, which can be steered using the
relative transfer function (RTF) vector of the desired source, relating the
acoustic transfer functions between the desired source and all microphones to a
reference microphone. In this paper, we propose a computationally efficient
method to estimate the RTF vector in a diffuse noise field, requiring an
additional microphone that is spatially separated from the head-mounted
microphones. Assuming that the spatial coherence between the noise components
in the head-mounted microphone signals and the additional microphone signal is
zero, we show that an unbiased estimate of the RTF vector can be obtained.
Based on real-world recordings, experimental results for several reverberation
times show that the proposed RTF estimator outperforms the widely used RTF
estimator based on covariance whitening and a simple biased RTF estimator in
terms of noise reduction and binaural cue preservation performance. Comment: Accepted at ITG Conference on Speech Communication 201
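The estimation idea in the abstract above can be illustrated with a minimal sketch for a single frequency bin. It assumes, as the abstract does, that the noise in the external microphone is uncorrelated with the noise in the head-mounted microphones, so that correlating every head-mounted signal with the external signal leaves only the desired-source term. The function names `estimate_rtf` and `simulate`, the gains, and the noise levels are hypothetical choices for illustration, not taken from the paper, and the simulated noise is simply independent across microphones rather than a true diffuse field.

```python
import random


def estimate_rtf(head_frames, ext_frames, ref=0):
    """Estimate the RTF vector for one frequency bin.

    head_frames: list of frames, each a list of complex STFT coefficients,
                 one per head-mounted microphone.
    ext_frames:  list of complex STFT coefficients of the external microphone.
    ref:         index of the reference microphone.
    """
    n_mics = len(head_frames[0])
    # Accumulate the cross-correlation of each head-mounted microphone
    # with the external microphone over all frames.
    cross = [0j] * n_mics
    for y, y_ext in zip(head_frames, ext_frames):
        for m in range(n_mics):
            cross[m] += y[m] * y_ext.conjugate()
    # Normalise by the reference microphone so that its entry equals 1.
    return [c / cross[ref] for c in cross]


def simulate(n_frames=20000, seed=0):
    """Toy data: one source plus mutually uncorrelated noise (an assumption)."""
    rng = random.Random(seed)
    true_rtf = [1 + 0j, 0.8 - 0.3j, 0.5 + 0.6j]  # hypothetical RTF vector
    ext_gain = 0.7 + 0.2j                        # hypothetical source-to-external gain

    def noise():
        return 0.5 * complex(rng.gauss(0, 1), rng.gauss(0, 1))

    head, ext = [], []
    for _ in range(n_frames):
        s = complex(rng.gauss(0, 1), rng.gauss(0, 1))
        head.append([a * s + noise() for a in true_rtf])
        ext.append(ext_gain * s + noise())
    return true_rtf, head, ext


if __name__ == "__main__":
    true_rtf, head, ext = simulate()
    for est, ref_val in zip(estimate_rtf(head, ext), true_rtf):
        print(est, ref_val)
```

Because the noise terms average out of the cross-correlation, the ratio approaches the true RTF vector as the number of frames grows. If the external-microphone noise were correlated with the head-mounted noise, this sketch would be biased.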
OBJECTIVE AND SUBJECTIVE EVALUATION OF DEREVERBERATION ALGORITHMS
Reverberation significantly impacts the quality and intelligibility of speech. Several dereverberation algorithms have been proposed in the literature to combat this problem. A majority of these algorithms utilize a single channel and are developed for monaural applications, and as such do not preserve the cues necessary for sound localization. This thesis describes a blind two-channel dereverberation technique that improves the quality of speech corrupted by reverberation while preserving cues that affect localization. The method is based on combining a short-term (2 ms) and a long-term (20 ms) weighting function of the linear prediction (LP) residual of the input signal. The developed algorithm and other dereverberation algorithms are evaluated objectively and subjectively in terms of sound quality and localization accuracy. The binaural adaptation provides a significant increase in sound quality while removing the loss in localization ability found in the bilateral implementation.
Audio source separation into the wild
This review chapter is dedicated to multichannel audio source separation in real-life environments. We explore some of the major achievements in the field and discuss some of the remaining challenges. We will explore several important practical scenarios, e.g. moving sources and/or microphones, varying number of sources and sensors, high reverberation levels, spatially diffuse sources, and synchronization problems. Several applications, such as smart assistants, cellular phones, hearing aids and robots, will be discussed. Our perspectives on the future of the field will be given as concluding remarks of this chapter.
Binaural Speech Enhancement Using STOI-Optimal Masks
STOI-optimal masking has been previously proposed and developed for
single-channel speech enhancement. In this paper, we consider the extension to
the task of binaural speech enhancement in which spatial information is known
to be important to speech understanding and therefore should be preserved by
the enhancement processing. Masks are estimated for each of the binaural
channels individually and a `better-ear listening' mask is computed by choosing
the maximum of the two masks. The estimated mask is used to supply probability
information about the speech presence in each time-frequency bin to an
Optimally-modified Log Spectral Amplitude (OM-LSA) enhancer. We show that using
the proposed method for binaural signals with a directional noise not only
improves the SNR of the noisy signal but also preserves the binaural cues and
intelligibility. Comment: Accepted at IWAENC 202
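The "better-ear listening" step described in the abstract above can be sketched as follows. The abstract states only that the mask is computed by choosing the maximum of the two masks; taking that maximum separately in each time-frequency bin, the function name `better_ear_mask`, and the nested-list layout (frames by frequency bins) are assumptions made for illustration.

```python
def better_ear_mask(mask_left, mask_right):
    """Combine two time-frequency masks by keeping the larger value per bin.

    mask_left, mask_right: lists of frames, each frame a list of mask
    values in [0, 1], one per frequency bin. Both must have the same shape.
    """
    return [
        [max(left, right) for left, right in zip(frame_left, frame_right)]
        for frame_left, frame_right in zip(mask_left, mask_right)
    ]


if __name__ == "__main__":
    left = [[0.1, 0.9], [0.5, 0.2]]
    right = [[0.3, 0.4], [0.5, 0.7]]
    print(better_ear_mask(left, right))
```

The same combined mask would then be applied to both channels, which is one way to leave the interaural level and phase differences of the input untouched.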