47 research outputs found

Beyond Moore-Penrose Part II: The Sparse Pseudoinverse

Authors: Dokmanić Ivan; Gribonval Rémi
Publication date: 13/07/2017

This is the second part of a two-paper series on generalized inverses that minimize matrix norms. In Part II we focus on generalized inverses that are minimizers of entrywise $p$ norms whose main representative is the sparse pseudoinverse for $p = 1$. We are motivated by the idea to replace the Moore-Penrose pseudoinverse by a sparser generalized inverse which is in some sense well-behaved. Sparsity implies that it is faster to apply the resulting matrix; well-behavedness would imply that we do not lose much in stability with respect to the least-squares performance of the MPP. We first address questions of uniqueness and non-zero count of (putative) sparse pseudoinverses. We show that a sparse pseudoinverse is generically unique, and that it indeed reaches optimal sparsity for almost all matrices. We then turn to proving our main stability result: finite-size concentration bounds for the Frobenius norm of $p$-minimal inverses for $1 \le p \le 2$. Our proof is based on tools from convex analysis and random matrix theory, in particular the recently developed convex Gaussian min-max theorem. Along the way we prove several results about sparse representations and convex programming that were known folklore, but of which we could find no proof.
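For orientation, the optimization problem described in this abstract can be written out as below. This is a sketch under stated assumptions: a full-rank matrix $A \in \mathbb{R}^{m \times n}$ with $m < n$, for which the generalized-inverse condition $AXA = A$ reduces to $AX = I_m$. The names $\mathrm{ginv}_p$ and $\mathrm{spinv}$ are notation assumed here, not quoted from the listing.

```latex
\[
\mathrm{ginv}_p(A) \in \operatorname*{arg\,min}_{X \in \mathbb{R}^{n \times m}}
  \Big( \sum_{i,j} |x_{ij}|^p \Big)^{1/p}
  \quad \text{subject to} \quad A X = I_m,
  \qquad 1 \le p \le 2,
\]
\[
\mathrm{spinv}(A) = \mathrm{ginv}_1(A), \qquad
\mathrm{ginv}_2(A) = A^\dagger .
\]
```

For $p = 1$ the objective separates over the columns of $X$, so each column $x_j$ solves $\min \|x\|_1$ subject to $Ax = e_j$, which can be posed as a linear program. For $p = 2$ the entrywise norm is the Frobenius norm, whose minimizer among generalized inverses is the Moore-Penrose pseudoinverse.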

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL-Rennes 1

Raking the Cocktail Party

Authors: Dokmanić Ivan; Scheibler Robin; Vetterli Martin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 21/07/2014

We present the concept of an acoustic rake receiver, a microphone beamformer that uses echoes to improve the noise and interference suppression. The rake idea is well-known in wireless communications; it involves constructively combining different multipath components that arrive at the receiver antennas. Unlike spread-spectrum signals used in wireless communications, speech signals are not orthogonal to their shifts. Therefore, we focus on the spatial structure, rather than temporal. Instead of explicitly estimating the channel, we create correspondences between early echoes in time and image sources in space. These multiple sources of the desired and the interfering signal offer additional spatial diversity that we can exploit in the beamformer design. We present several "intuitive" and optimal formulations of acoustic rake receivers, and show theoretically and numerically that the rake formulation of the maximum signal-to-interference-and-noise beamformer offers significant performance boosts in terms of noise and interference suppression. Beyond signal-to-noise ratio, we observe gains in terms of the perceptual evaluation of speech quality (PESQ) metric for the speech quality. We accompany the paper by the complete simulation and processing chain written in Python. The code and the sound samples are available online at http://lcav.github.io/AcousticRakeReceiver/. Comment: 12 pages, 11 figures, Accepted for publication in IEEE Journal on Selected Topics in Signal Processing (Special Issue on Spatial Audio)
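The idea of combining image sources in a beamformer can be illustrated with a small delay-and-sum-style sketch. It is not taken from the authors' code and is not necessarily one of the paper's exact formulations. The free-field steering-vector model, the function names, the array geometry, and the 1 kHz frequency are all assumptions made for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed nominal value


def steering_vector(point, mics, freq_hz):
    """Narrowband free-field response from a point source to each microphone."""
    omega = 2.0 * math.pi * freq_hz
    vec = []
    for mic in mics:
        d = math.dist(point, mic)
        vec.append(cmath.exp(-1j * omega * d / SPEED_OF_SOUND) / (4.0 * math.pi * d))
    return vec


def rake_delay_and_sum(sources, mics, freq_hz):
    """Sum the steering vectors of a source and its images, then normalize."""
    total = [0j] * len(mics)
    for point in sources:
        for i, a in enumerate(steering_vector(point, mics, freq_hz)):
            total[i] += a
    norm = math.sqrt(sum(abs(t) ** 2 for t in total))
    return [t / norm for t in total]


def response(weights, point, mics, freq_hz):
    """Beamformer output w^H a for a unit-amplitude source at `point`."""
    a = steering_vector(point, mics, freq_hz)
    return sum(w.conjugate() * x for w, x in zip(weights, a))


# Hypothetical setup: 4-microphone linear array, one source, and its image
# across a reflecting wall at y = 0.
mics = [(2.0 + 0.1 * i, 1.0) for i in range(4)]
sources = [(1.0, 2.0), (1.0, -2.0)]
weights = rake_delay_and_sum(sources, mics, 1000.0)
combined = sum(response(weights, p, mics, 1000.0) for p in sources)
```

With these weights the direct path and the echo add coherently: `combined` is real and positive, equal to the norm of the summed steering vectors.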

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Direction of Arrival with One Microphone, a few LEGOs, and Non-Negative Matrix Factorization

Authors: Badawy Dalia El; Dokmanić Ivan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/07/2018

Conventional approaches to sound source localization require at least two microphones. It is known, however, that people with unilateral hearing loss can also localize sounds. Monaural localization is possible thanks to the scattering by the head, though it hinges on learning the spectra of the various sources. We take inspiration from this human ability to propose algorithms for accurate sound source localization using a single microphone embedded in an arbitrary scattering structure. The structure modifies the frequency response of the microphone in a direction-dependent way giving each direction a signature. While knowing those signatures is sufficient to localize sources of white noise, localizing speech is much more challenging: it is an ill-posed inverse problem which we regularize by prior knowledge in the form of learned non-negative dictionaries. We demonstrate a monaural speech localization algorithm based on non-negative matrix factorization that does not depend on sophisticated, designed scatterers. In fact, we show experimental results with ad hoc scatterers made of LEGO bricks. Even with these rudimentary structures we can accurately localize arbitrary speakers; that is, we do not need to learn the dictionary for the particular speaker to be localized. Finally, we discuss multi-source localization and the related limitations of our approach. Comment: This article has been accepted for publication in IEEE/ACM Transactions on Audio, Speech, and Language processing (TASLP)
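The white-noise case mentioned in the abstract can be illustrated with a toy matching step. This is a minimal sketch, not the paper's algorithm: it assumes the observed magnitude spectrum is a scaled copy of one direction's signature, and the signature values, direction labels, and function names are hypothetical. The speech case, which the abstract handles with non-negative matrix factorization, is not covered here.

```python
import math


def cosine_similarity(u, v):
    """Scale-invariant similarity between two magnitude spectra."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def localize_white_noise(observed, signatures):
    """Return the direction whose signature best matches the observation."""
    return max(signatures, key=lambda d: cosine_similarity(observed, signatures[d]))


# Hypothetical direction-dependent signatures (magnitude per frequency bin),
# keyed by direction in degrees.
signatures = {
    0: [1.0, 0.2, 0.7, 0.4],
    90: [0.3, 1.0, 0.5, 0.9],
    180: [0.8, 0.6, 0.1, 1.0],
}

# A flat-spectrum (white) source of unknown gain arriving from 90 degrees.
observed = [2.5 * s for s in signatures[90]]
estimate = localize_white_noise(observed, signatures)
```

Because a white source has a flat spectrum, the observation carries only the signature and an unknown gain, which is why a scale-invariant match is enough in this toy setting.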

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Omnidirectional Bats, Point-to-Plane Distances, and the Price of Uniqueness

Authors: Dokmanić Ivan; Kreković Miranda; Vetterli Martin
Publication date: 18/09/2016

We study simultaneous localization and mapping with a device that uses reflections to measure its distance from walls. Such a device can be realized acoustically with a synchronized collocated source and receiver; it behaves like a bat with no capacity for directional hearing or vocalizing. In this paper we generalize our previous work in 2D, and show that the 3D case is not just a simple extension, but rather a fundamentally different inverse problem. While generically the 2D problem has a unique solution, in 3D uniqueness is always absent in rooms with fewer than nine walls. In addition to the complete characterization of ambiguities which arise due to this non-uniqueness, we propose a robust solution for inexact measurements similar to analogous results for Euclidean Distance Matrices. Our theoretical results have important consequences for the design of collocated range-only SLAM systems, and we support them with an array of computer experiments. Comment: 5 pages, 8 figures, submitted to ICASSP 201
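The measurement model behind this abstract, a collocated source and receiver observing its distance to each wall, can be sketched as follows. This is an illustration only: the box-shaped room, the device position, the wall parametrization (normal `n` and offset `q` for the plane where the dot product of `n` and `x` equals `q`), and the function names are assumptions, not taken from the paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed nominal value


def point_to_plane_distance(position, normal, offset):
    """Distance from `position` to the plane {x : <normal, x> = offset}."""
    norm = math.sqrt(sum(c * c for c in normal))
    inner = sum(n * p for n, p in zip(normal, position))
    return abs(offset - inner) / norm


def first_order_echo_times(position, walls):
    """Round-trip times of first-order echoes for a collocated source/receiver.

    The result is sorted because an omnidirectional device cannot tell
    which wall produced which echo.
    """
    return sorted(
        2.0 * point_to_plane_distance(position, n, q) / SPEED_OF_SOUND
        for n, q in walls
    )


# Hypothetical 4 m x 3 m x 2.5 m box room, walls given as (outward normal, offset).
walls = [
    ((-1.0, 0.0, 0.0), 0.0), ((1.0, 0.0, 0.0), 4.0),
    ((0.0, -1.0, 0.0), 0.0), ((0.0, 1.0, 0.0), 3.0),
    ((0.0, 0.0, -1.0), 0.0), ((0.0, 0.0, 1.0), 2.5),
]
position = (1.0, 1.0, 1.0)
distances = [point_to_plane_distance(position, n, q) for n, q in walls]
echo_times = first_order_echo_times(position, walls)
```

The inverse problem studied in the paper goes the other way: recovering wall parameters and device positions from such unlabeled distances collected at several positions.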

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Pyroomacoustics: A Python package for audio room simulations and array processing algorithms

Authors: Bezzam Eric; Dokmanić Ivan; Scheibler Robin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/10/2017

We present pyroomacoustics, a software package aimed at the rapid development and testing of audio array processing algorithms. The content of the package can be divided into three main components: an intuitive Python object-oriented interface to quickly construct different simulation scenarios involving multiple sound sources and microphones in 2D and 3D rooms; a fast C implementation of the image source model for general polyhedral rooms to efficiently generate room impulse responses and simulate the propagation between sources and receivers; and finally, reference implementations of popular algorithms for beamforming, direction finding, and adaptive filtering. Together, they form a package with the potential to speed up the time to market of new algorithms by significantly reducing the implementation overhead in the performance evaluation step. Comment: 5 pages, 5 figures, describes a software package
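The image source model named in this abstract can be illustrated in its simplest form. The sketch below is plain Python and does not use or reproduce the pyroomacoustics API or its C implementation. It assumes a 2D rectangular room spanning [0, Lx] x [0, Ly], first-order reflections only, and hypothetical function names and coordinates.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed nominal value


def first_order_images(source, room_dims):
    """Mirror a source across each of the four walls of a 2D rectangular room."""
    x, y = source
    lx, ly = room_dims
    return [
        (-x, y),          # wall x = 0
        (2 * lx - x, y),  # wall x = Lx
        (x, -y),          # wall y = 0
        (x, 2 * ly - y),  # wall y = Ly
    ]


def echo_delays(source, mic, room_dims):
    """Delays in seconds of the direct path followed by the first-order echoes."""
    points = [source] + first_order_images(source, room_dims)
    return [math.dist(p, mic) / SPEED_OF_SOUND for p in points]


# Hypothetical 4 m x 3 m room with one source and one microphone.
images = first_order_images((1.0, 2.0), (4.0, 3.0))
delays = echo_delays((1.0, 2.0), (3.0, 1.0), (4.0, 3.0))
```

Each image source stands in for one wall reflection, so a room impulse response can be approximated by placing a delayed, attenuated impulse at each entry of `delays`. Higher reflection orders and general polyhedral rooms, which the package handles, are outside this sketch.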

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref