Beyond Moore-Penrose Part II: The Sparse Pseudoinverse
This is the second part of a two-paper series on generalized inverses that
minimize matrix norms. In Part II we focus on generalized inverses that are
minimizers of entrywise p norms whose main representative is the sparse
pseudoinverse for $p = 1$. We are motivated by the idea to replace the
Moore-Penrose pseudoinverse by a sparser generalized inverse which is in some
sense well-behaved. Sparsity implies that it is faster to apply the resulting
matrix; well-behavedness would imply that we do not lose much in stability with
respect to the least-squares performance of the MPP. We first address questions
of uniqueness and non-zero count of (putative) sparse pseudoinverses. We show
that a sparse pseudoinverse is generically unique, and that it indeed reaches
optimal sparsity for almost all matrices. We then turn to proving our main
stability result: finite-size concentration bounds for the Frobenius norm of
p-minimal inverses for $1 \le p \le 2$. Our proof is based on tools from
convex analysis and random matrix theory, in particular the recently developed
convex Gaussian min-max theorem. Along the way we prove several results about
sparse representations and convex programming that were known folklore, but of
which we could find no proof.
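For orientation, the objects named in this abstract can be written out as below. This is only a sketch in notation chosen here: the symbols $A$, $X$, $\mathcal{G}(A)$ and $\mathrm{ginv}_p$ are assumptions and do not appear in the abstract, and the matrix $A \in \mathbb{R}^{m \times n}$ is assumed to have full row rank with $m < n$, so that its generalized inverses are exactly its right inverses.

```latex
\mathcal{G}(A) = \{ X \in \mathbb{R}^{n \times m} : A X = I_m \},
\qquad
\mathrm{ginv}_p(A) \in \operatorname*{arg\,min}_{X \in \mathcal{G}(A)}
  \| \mathrm{vec}(X) \|_p ,
\qquad
\mathrm{ginv}_2(A) = A^{\dagger} = A^{\top} (A A^{\top})^{-1} .
```

Under these assumptions, $p = 2$ minimizes the Frobenius norm and recovers the Moore-Penrose pseudoinverse, while $p = 1$ minimizes the entrywise $\ell^1$ norm and gives the sparse pseudoinverse discussed above.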
Raking the Cocktail Party
We present the concept of an acoustic rake receiver---a microphone beamformer
that uses echoes to improve the noise and interference suppression. The rake
idea is well-known in wireless communications; it involves constructively
combining different multipath components that arrive at the receiver antennas.
Unlike spread-spectrum signals used in wireless communications, speech signals
are not orthogonal to their shifts. Therefore, we focus on the spatial
structure, rather than temporal. Instead of explicitly estimating the channel,
we create correspondences between early echoes in time and image sources in
space. These multiple sources of the desired and the interfering signal offer
additional spatial diversity that we can exploit in the beamformer design.
We present several "intuitive" and optimal formulations of acoustic rake
receivers, and show theoretically and numerically that the rake formulation of
the maximum signal-to-interference-and-noise beamformer offers significant
performance boosts in terms of noise and interference suppression. Beyond
signal-to-noise ratio, we observe gains in the \emph{perceptual
evaluation of speech quality} (PESQ) metric. We
accompany the paper with the complete simulation and processing chain written in
Python. The code and the sound samples are available online at
\url{http://lcav.github.io/AcousticRakeReceiver/}.
Comment: 12 pages, 11 figures, Accepted for publication in IEEE Journal on
Selected Topics in Signal Processing (Special Issue on Spatial Audio)
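The correspondence between early echoes and image sources mentioned in this abstract can be illustrated with a small sketch. It is not the authors' code: it assumes an axis-aligned rectangular room with one corner at the origin, first-order reflections only, and a speed of sound of 343 m/s. The function names `first_order_image_sources` and `echo_delays` are hypothetical.

```python
import math


def first_order_image_sources(source, room_size):
    """Mirror `source` across each wall of an axis-aligned rectangular room.

    The room is assumed to span [0, L] along every axis, with the side
    lengths L given in `room_size`. Returns one image per wall.
    """
    images = []
    for axis, length in enumerate(room_size):
        for wall in (0.0, length):
            image = list(source)
            # Reflection of a coordinate x across the plane x = wall.
            image[axis] = 2.0 * wall - source[axis]
            images.append(tuple(image))
    return images


def echo_delays(source, mic, room_size, speed_of_sound=343.0):
    """Propagation delay (seconds) of each first-order echo at `mic`.

    Each echo is modeled as a direct path from the matching image source.
    """
    delays = []
    for image in first_order_image_sources(source, room_size):
        distance = math.sqrt(sum((i - m) ** 2 for i, m in zip(image, mic)))
        delays.append(distance / speed_of_sound)
    return delays


if __name__ == "__main__":
    # Hypothetical 4 m x 3 m room, source and microphone positions in meters.
    for image, delay in zip(
        first_order_image_sources((1.0, 2.0), (4.0, 3.0)),
        echo_delays((1.0, 2.0), (3.0, 2.0), (4.0, 3.0)),
    ):
        print(image, round(delay * 1000.0, 3), "ms")
```

Each image source acts as an extra virtual emitter of the same signal, which is the spatial diversity a rake-style beamformer could combine.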
Direction of Arrival with One Microphone, a few LEGOs, and Non-Negative Matrix Factorization
Conventional approaches to sound source localization require at least two
microphones. It is known, however, that people with unilateral hearing loss can
also localize sounds. Monaural localization is possible thanks to the
scattering by the head, though it hinges on learning the spectra of the various
sources. We take inspiration from this human ability to propose algorithms for
accurate sound source localization using a single microphone embedded in an
arbitrary scattering structure. The structure modifies the frequency response
of the microphone in a direction-dependent way, giving each direction a
signature. While knowing those signatures is sufficient to localize sources of
white noise, localizing speech is much more challenging: it is an ill-posed
inverse problem which we regularize by prior knowledge in the form of learned
non-negative dictionaries. We demonstrate a monaural speech localization
algorithm based on non-negative matrix factorization that does not depend on
sophisticated, designed scatterers. In fact, we show experimental results with
ad hoc scatterers made of LEGO bricks. Even with these rudimentary structures
we can accurately localize arbitrary speakers; that is, we do not need to learn
the dictionary for the particular speaker to be localized. Finally, we discuss
multi-source localization and the related limitations of our approach.
Comment: This article has been accepted for publication in IEEE/ACM
Transactions on Audio, Speech, and Language Processing (TASLP)
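The remark that known directional signatures suffice for white noise can be illustrated with a toy matcher. This is a sketch under stated assumptions, not the paper's algorithm: each direction is assumed to have a known power signature over a few frequency bins, the source is assumed to be white so that the observed power spectrum is a scaled copy of one signature, and the match uses cosine similarity, which is an illustrative choice made here. The names `cosine_similarity` and `localize_white_noise` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def localize_white_noise(observed_power, signatures):
    """Return the direction whose signature best matches the observation.

    `signatures` maps a direction label (e.g. degrees) to the assumed
    per-bin power response of the scatterer for that direction. The match
    is scale-invariant, so the unknown source power does not matter.
    """
    return max(
        signatures,
        key=lambda d: cosine_similarity(observed_power, signatures[d]),
    )


if __name__ == "__main__":
    # Made-up signatures over three frequency bins, for illustration only.
    signatures = {0: [1.0, 2.0, 3.0], 90: [3.0, 2.0, 1.0], 180: [1.0, 3.0, 1.0]}
    observed = [2.1, 3.9, 6.2]  # roughly twice the 0-degree signature
    print(localize_white_noise(observed, signatures))
```

For speech, the source spectrum is unknown and not flat, so this direct matching no longer applies; that is the ill-posed case the abstract regularizes with learned non-negative dictionaries.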
Omnidirectional Bats, Point-to-Plane Distances, and the Price of Uniqueness
We study simultaneous localization and mapping with a device that uses
reflections to measure its distance from walls. Such a device can be realized
acoustically with a synchronized collocated source and receiver; it behaves
like a bat with no capacity for directional hearing or vocalizing. In this
paper we generalize our previous work in 2D, and show that the 3D case is not
just a simple extension, but rather a fundamentally different inverse problem.
While generically the 2D problem has a unique solution, in 3D uniqueness is
always absent in rooms with fewer than nine walls. In addition to the complete
characterization of ambiguities which arise due to this non-uniqueness, we
propose a robust solution for inexact measurements similar to analogous results
for Euclidean Distance Matrices. Our theoretical results have important
consequences for the design of collocated range-only SLAM systems, and we
support them with an array of computer experiments.
Comment: 5 pages, 8 figures, submitted to ICASSP 201
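The point-to-plane measurement described in this abstract can be sketched as follows. The sketch assumes each wall is a plane {x : <normal, x> = offset}, that the collocated source and receiver records the first-order echo from that wall, and that sound travels at 343 m/s. The function names are hypothetical and this is not the authors' implementation.

```python
import math


def point_to_plane_distance(position, normal, offset):
    """Distance from `position` to the plane {x : <normal, x> = offset}.

    `normal` does not need to be unit length; it is normalized here.
    """
    norm = math.sqrt(sum(n * n for n in normal))
    projection = sum(n * p for n, p in zip(normal, position))
    return abs(offset - projection) / norm


def first_echo_delay(position, normal, offset, speed_of_sound=343.0):
    """Round-trip time (seconds) of the echo from one wall.

    With a collocated source and receiver, the sound travels to the wall
    and back along the wall normal, so the path length is twice the
    point-to-plane distance.
    """
    distance = point_to_plane_distance(position, normal, offset)
    return 2.0 * distance / speed_of_sound


if __name__ == "__main__":
    # Hypothetical wall z = 3 (normal (0, 0, 2), offset 6) and device at (1, 1, 1).
    print(point_to_plane_distance((1.0, 1.0, 1.0), (0.0, 0.0, 2.0), 6.0))
    print(first_echo_delay((1.0, 1.0, 1.0), (0.0, 0.0, 2.0), 6.0))
```

Each measurement carries only a distance and no direction, which is why different room and trajectory configurations can produce identical data.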
Pyroomacoustics: A Python package for audio room simulations and array processing algorithms
We present pyroomacoustics, a software package aimed at the rapid development
and testing of audio array processing algorithms. The content of the package
can be divided into three main components: an intuitive Python object-oriented
interface to quickly construct different simulation scenarios involving
multiple sound sources and microphones in 2D and 3D rooms; a fast C
implementation of the image source model for general polyhedral rooms to
efficiently generate room impulse responses and simulate the propagation
between sources and receivers; and finally, reference implementations of
popular algorithms for beamforming, direction finding, and adaptive filtering.
Together, they form a package with the potential to speed up the time to market
of new algorithms by significantly reducing the implementation overhead in the
performance evaluation step.
Comment: 5 pages, 5 figures, describes a software package