Raking the Cocktail Party
We present the concept of an acoustic rake receiver---a microphone beamformer
that uses echoes to improve noise and interference suppression. The rake
idea is well-known in wireless communications; it involves constructively
combining different multipath components that arrive at the receiver antennas.
Unlike spread-spectrum signals used in wireless communications, speech signals
are not orthogonal to their shifts. Therefore, we focus on the spatial
structure, rather than temporal. Instead of explicitly estimating the channel,
we create correspondences between early echoes in time and image sources in
space. These multiple sources of the desired and the interfering signal offer
additional spatial diversity that we can exploit in the beamformer design.
We present several "intuitive" and optimal formulations of acoustic rake
receivers, and show theoretically and numerically that the rake formulation of
the maximum signal-to-interference-and-noise beamformer offers significant
performance boosts in terms of noise and interference suppression. Beyond
signal-to-noise ratio, we observe gains in the \emph{perceptual
evaluation of speech quality} (PESQ) metric. We
accompany the paper with the complete simulation and processing chain written in
Python. The code and the sound samples are available online at
\url{http://lcav.github.io/AcousticRakeReceiver/}.
Comment: 12 pages, 11 figures, Accepted for publication in IEEE Journal on
Selected Topics in Signal Processing (Special Issue on Spatial Audio
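To make the image-source idea in the abstract above concrete, the following is a minimal sketch and not the authors' released code. It assumes a 2-D rectangular room, first-order reflections only, a free-field propagation model, and one simple delay-and-sum style weighting (the sum of the steering vectors of the true source and its images, normalised to unit norm). The names `first_order_images`, `steering_vector`, and `rake_delay_and_sum`, and all geometry values, are hypothetical.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)

def first_order_images(src, room):
    """Mirror a 2-D source across the four walls of the room [0, Lx] x [0, Ly]."""
    x, y = src
    lx, ly = room
    return np.array([[-x, y], [2 * lx - x, y], [x, -y], [x, 2 * ly - y]])

def steering_vector(point, mics, freq):
    """Free-field response from `point` to every microphone at one frequency."""
    dist = np.linalg.norm(mics - point, axis=1)
    return np.exp(-2j * np.pi * freq * dist / C) / (4 * np.pi * dist)

def rake_delay_and_sum(src, mics, room, freq):
    """Unit-norm weights matched to the direct path plus first-order echoes."""
    sources = np.vstack([src, first_order_images(src, room)])
    a_sum = sum(steering_vector(p, mics, freq) for p in sources)
    return a_sum / np.linalg.norm(a_sum), a_sum

# Hypothetical geometry: 4 m x 6 m room, four microphones spaced 8 cm apart.
room = (4.0, 6.0)
mics = np.array([[2.0 + 0.08 * m, 1.5] for m in range(4)])
src = np.array([1.0, 4.5])

w, a_sum = rake_delay_and_sum(src, mics, room, freq=1000.0)
gain = np.vdot(w, a_sum)  # w^H applied to the summed direct and echo responses
print(abs(gain))
```

With this weighting the direct path and the modelled echoes add coherently at the beamformer output, so `gain` is real and equals the norm of the summed steering vectors. The optimal formulations mentioned in the abstract additionally account for interferers and noise, which this sketch omits.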
IANS: Intelligibility-aware Null-steering Beamforming for Dual-Microphone Arrays
Beamforming techniques are popular in speech-related applications due to
their effective spatial filtering capabilities. Nonetheless, conventional
beamforming techniques generally depend heavily on either the target's
direction-of-arrival (DOA), relative transfer function (RTF) or covariance
matrix. This paper presents a new approach, the intelligibility-aware
null-steering (IANS) beamforming framework, which uses the STOI-Net
intelligibility prediction model to improve speech intelligibility without
prior knowledge of the DOA, RTF, or covariance matrix. The IANS
framework uses a null-steering beamformer (NSBF) to generate a set of
beamformed outputs and STOI-Net to determine the optimal result. Experimental
results indicate that IANS can produce intelligibility-enhanced signals using a
small dual-microphone array. The results are comparable to those obtained by
null-steering beamformers with known DOAs.
Comment: Preprint submitted to IEEE MLSP 202
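The generate-then-select structure described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the IANS implementation: it assumes a far-field plane-wave model, a single frequency-domain frame, and hypothetical values for microphone spacing and sampling rate. STOI-Net is not reproduced here; `toy_score` is a placeholder that merely prefers low residual energy and is not an intelligibility predictor. The function names `tdoa`, `null_steer`, and `select_best` are likewise hypothetical.

```python
import numpy as np

C = 343.0   # speed of sound in m/s (assumed)
D = 0.05    # microphone spacing in metres (assumed)
FS = 16000  # sampling rate in Hz (assumed)

def tdoa(angle_deg):
    """Far-field time difference of arrival between the two microphones."""
    return D * np.cos(np.deg2rad(angle_deg)) / C

def null_steer(X1, X2, angle_deg, freqs):
    """Cancel a plane wave from `angle_deg`: align mic 2 to mic 1, then subtract."""
    return X1 - X2 * np.exp(2j * np.pi * freqs * tdoa(angle_deg))

def select_best(X1, X2, freqs, angles, score_fn):
    """Beamform once per candidate null direction and keep the top-scoring output."""
    outputs = [null_steer(X1, X2, a, freqs) for a in angles]
    scores = [score_fn(Y) for Y in outputs]
    best = int(np.argmax(scores))
    return angles[best], outputs[best]

def toy_score(Y):
    # Placeholder only: prefers low residual energy. In IANS this role is
    # played by the STOI-Net intelligibility prediction.
    return -np.sum(np.abs(Y) ** 2)

# Synthetic check: a single interferer arriving from 120 degrees.
rng = np.random.default_rng(0)
n = 1024
freqs = np.fft.rfftfreq(n, d=1 / FS)
X1 = np.fft.rfft(rng.standard_normal(n))
X2 = X1 * np.exp(-2j * np.pi * freqs * tdoa(120.0))  # mic 2 hears a delayed copy

angles = np.arange(0, 181, 15)
best_angle, Y = select_best(X1, X2, freqs, angles, toy_score)
print(best_angle)
```

Swapping `toy_score` for a learned, non-intrusive intelligibility predictor is what removes the need for a known DOA in this kind of loop: the candidate nulls are scanned blindly and the scorer decides which output to keep.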