Direction of Arrival with One Microphone, a few LEGOs, and Non-Negative Matrix Factorization
Conventional approaches to sound source localization require at least two
microphones. It is known, however, that people with unilateral hearing loss can
also localize sounds. Monaural localization is possible thanks to the
scattering by the head, though it hinges on learning the spectra of the various
sources. We take inspiration from this human ability to propose algorithms for
accurate sound source localization using a single microphone embedded in an
arbitrary scattering structure. The structure modifies the frequency response
of the microphone in a direction-dependent way giving each direction a
signature. While knowing those signatures is sufficient to localize sources of
white noise, localizing speech is much more challenging: it is an ill-posed
inverse problem which we regularize by prior knowledge in the form of learned
non-negative dictionaries. We demonstrate a monaural speech localization
algorithm based on non-negative matrix factorization that does not depend on
sophisticated, designed scatterers. In fact, we show experimental results with
ad hoc scatterers made of LEGO bricks. Even with these rudimentary structures
we can accurately localize arbitrary speakers; that is, we do not need to learn
the dictionary for the particular speaker to be localized. Finally, we discuss
multi-source localization and the related limitations of our approach. Comment: This article has been accepted for publication in IEEE/ACM
Transactions on Audio, Speech, and Language Processing (TASLP).
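To make the idea above concrete, here is a minimal sketch of direction selection with a fixed non-negative dictionary. It is not the authors' algorithm: it assumes (hypothetically) that each direction `d` has a known magnitude signature `signatures[d]`, that a speech dictionary `W` has already been learned, and that the direction is picked by the smallest reconstruction error after fitting activations with standard multiplicative updates. All names and sizes are illustrative, and the data are synthetic.

```python
import numpy as np

def nmf_activations(B, Y, n_iter=300, eps=1e-12):
    """Fit non-negative activations X so that B @ X approximates Y (B is held fixed)."""
    X = np.ones((B.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        # Multiplicative update for the Euclidean cost; keeps X non-negative.
        X *= (B.T @ Y) / (B.T @ B @ X + eps)
    return X

def localize(Y, signatures, W):
    """Return the direction whose filtered dictionary reconstructs Y with least error."""
    errors = []
    for h in signatures:
        B = h[:, None] * W              # dictionary as seen through direction signature h
        X = nmf_activations(B, Y)
        errors.append(np.linalg.norm(Y - B @ X))
    return int(np.argmin(errors)), errors

# Synthetic setup (assumed sizes): F frequency bins, K atoms, T frames, D directions.
rng = np.random.default_rng(0)
F, K, T, D = 48, 4, 20, 6
signatures = rng.uniform(0.1, 1.0, size=(D, F))   # hypothetical direction signatures
W = rng.uniform(0.0, 1.0, size=(F, K))            # hypothetical learned source dictionary
true_dir = 2
Y = signatures[true_dir][:, None] * (W @ rng.uniform(0.0, 1.0, size=(K, T)))

estimate, errors = localize(Y, signatures, W)
print("estimated direction:", estimate, "true direction:", true_dir)
```

Picking the minimum-error direction is only one possible decision rule; a formulation that fits all directions jointly with a sparsity penalty on the activations would be closer in spirit to a multi-source setting.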
Sparsity in array processing: methods and performances
In the last few years, we have witnessed an extraordinary and still growing development of sparse signal recovery in a wide range of application contexts such as communications, biomedicine, radar, microwave imaging, source localization, astronomy, and seismology. In many realistic array processing applications, the sparse nature of the underlying signals or arrays has to be exploited in recovery algorithms to enhance their performance. This special session presents the most recent results on estimators based on sparsity-promoting criteria.
Blind Multilinear Identification
We discuss a technique that allows blind recovery of signals or blind
identification of mixtures in instances where such recovery or identification
were previously thought to be impossible: (i) closely located or highly
correlated sources in antenna array processing, (ii) highly correlated
spreading codes in CDMA radio communication, (iii) nearly dependent spectra in
fluorescent spectroscopy. This has important implications: in the case of
antenna array processing, it allows for joint localization and extraction of
multiple sources from the measurement of a noisy mixture recorded on multiple
sensors in an entirely deterministic manner. In the case of CDMA, it allows the
possibility of having a number of users larger than the spreading gain. In the
case of fluorescent spectroscopy, it allows for detection of nearly identical
chemical constituents. The proposed technique involves the solution of a
bounded coherence low-rank multilinear approximation problem. We show that
bounded coherence allows us to establish existence and uniqueness of the
recovered solution. We will provide some statistical motivation for the
approximation problem and discuss greedy approximation bounds. To provide the
theoretical underpinnings for this technique, we develop a corresponding theory
of sparse separable decompositions of functions, including notions of rank and
nuclear norm that specialize to the usual ones for matrices and operators but
apply also to hypermatrices and tensors. Comment: 20 pages, to appear in IEEE Transactions on Information Theory
Sparsity-Cognizant Total Least-Squares for Perturbed Compressive Sampling
Solving linear regression problems based on the total least-squares (TLS)
criterion has well-documented merits in various applications, where
perturbations appear both in the data vector as well as in the regression
matrix. However, existing TLS approaches do not account for sparsity possibly
present in the unknown vector of regression coefficients. On the other hand,
sparsity is the key attribute exploited by modern compressive sampling and
variable selection approaches to linear regression, which include noise in the
data, but do not account for perturbations in the regression matrix. The
present paper fills this gap by formulating and solving TLS optimization
problems under sparsity constraints. Near-optimum and reduced-complexity
suboptimum sparse (S-) TLS algorithms are developed to address the perturbed
compressive sampling (and the related dictionary learning) challenge, when
there is a mismatch between the true and adopted bases over which the unknown
vector is sparse. The novel S-TLS schemes also allow for perturbations in the
regression matrix of the least absolute shrinkage and selection operator
(Lasso), and endow TLS approaches with the ability to cope with sparse,
under-determined "errors-in-variables" models. Interesting generalizations can
further exploit prior knowledge on the perturbations to obtain novel weighted
and structured S-TLS solvers. Analysis and simulations demonstrate the
practical impact of S-TLS in calibrating the mismatch effects of contemporary
grid-based approaches to cognitive radio sensing, and robust
direction-of-arrival estimation using antenna arrays. Comment: 30 pages, 10 figures, submitted to IEEE Transactions on Signal
Processing
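As a reading aid, the problem described in this abstract can be sketched as an errors-in-variables model with an added sparsity penalty. The symbols below are assumptions introduced for illustration and are not taken from the listing: y is the data vector, A the adopted regression matrix, E and e the perturbations of the matrix and of the data, x the sparse unknown, and λ a tuning weight.

```latex
% Errors-in-variables model: both the regression matrix and the data are perturbed,
%   y + e = (A + E) x .
% Sparse TLS sketch: penalize the total perturbation energy plus an l1 term on x.
\begin{equation}
  \{\hat{x}, \hat{E}, \hat{e}\}
  = \arg\min_{x,\,E,\,e}\;
    \bigl\| [\, E \;\; e \,] \bigr\|_F^2 + \lambda \, \|x\|_1
  \quad \text{subject to} \quad
  y + e = (A + E)\, x .
\end{equation}
```

Setting E = 0 in this sketch reduces it to a Lasso-type problem, while setting λ = 0 gives the classical TLS criterion, which is the gap the abstract refers to.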
Sparse Modeling of Grouped Line Spectra
This licentiate thesis focuses on clustered parametric models for estimation of line spectra, when the spectral content of a signal source is assumed to exhibit some form of grouping. Unlike previous parametric approaches, which generally require explicit knowledge of the model orders, this thesis exploits sparse modeling, where the orders are chosen implicitly. For line spectra, the non-linear parametric model is approximated by a linear system containing an overcomplete basis of candidate frequencies, called a dictionary, and a large set of linear response variables that select and weight the components in the dictionary. Frequency estimates are obtained by solving a convex optimization program in which the sum of squared residuals is minimized. To discourage overfitting and to impose certain structure on the solution, different convex penalty functions are introduced into the optimization. The trade-off between fit and penalty is set by user parameters so as to approximate the true number of spectral lines in the signal, which implies that the response variable will be sparse, i.e., have few non-zero elements. Thus, instead of explicit model orders, the orders are set implicitly by this trade-off. For grouped variables, the dictionary is customized and appropriate convex penalties are selected so that the solution becomes group sparse, i.e., has few groups with non-zero variables. In an array of sensors, the specific time delays and attenuations depend on the source and sensor positions. By modeling this dependence, one may estimate the location of a source. In this thesis, a novel joint location and grouped frequency estimator is proposed, which exploits sparse modeling for both spectral and spatial estimates and shows robustness against sources with overlapping frequency content. For audio signals, this thesis uses two different features for clustering.
Pitch is a perceptual property of sound that may be described by the harmonic model, i.e., by a group of spectral lines at integer multiples of a fundamental frequency, which we estimate by exploiting a novel adaptive total variation penalty. The other feature, chroma, is a concept in music theory that groups together pitches whose frequencies differ by a power of 2. Using a chroma dictionary, together with appropriate group sparse penalties, we propose an automatic transcription of the chroma content of a signal.
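The dictionary-plus-penalty idea described in this abstract can be sketched as follows. This is a plain l1 (Lasso) illustration solved with proximal gradient steps (ISTA), not the thesis's estimators: the grid size, the weight `lam`, the iteration count, and all function names are assumptions made for the example, and the group and total variation penalties mentioned above are not implemented.

```python
import numpy as np

def soft_threshold(z, t):
    """Complex soft-thresholding: shrink magnitudes by t, keep phases."""
    mag = np.abs(z)
    return z * np.maximum(1.0 - t / np.maximum(mag, 1e-12), 0.0)

def objective(A, y, x, lam):
    """Lasso cost: half the sum of squared residuals plus the l1 penalty."""
    return 0.5 * np.linalg.norm(y - A @ x) ** 2 + lam * np.sum(np.abs(x))

def ista_lasso(A, y, lam, n_iter=500):
    """Minimize the Lasso cost with proximal gradient (ISTA) iterations."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1], dtype=complex)
    for _ in range(n_iter):
        grad = A.conj().T @ (A @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Overcomplete dictionary: P candidate frequencies for N samples (P > N).
N, P = 64, 256
n = np.arange(N)
grid = np.arange(P) / P                       # normalized candidate frequencies
A = np.exp(2j * np.pi * np.outer(n, grid))

# Synthetic signal with two spectral lines placed on the grid (an assumption).
y = 1.0 * A[:, 40] + 0.5 * A[:, 100]

lam = 2.0
x_hat = ista_lasso(A, y, lam)
dominant_frequency = grid[np.argmax(np.abs(x_hat))]
print("dominant frequency estimate:", dominant_frequency)
print("cost at zero:", objective(A, y, np.zeros(P, dtype=complex), lam))
print("cost at x_hat:", objective(A, y, x_hat, lam))
```

Because neighboring dictionary columns are highly correlated on an oversampled grid, ISTA tends to converge slowly here, and some energy may remain in grid points adjacent to the true lines after a fixed number of iterations. Larger `lam` gives sparser solutions, which is the implicit order selection the abstract describes.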
Acoustic DoA Estimation by One Unsophisticated Sensor
We show how introducing known scattering can be used in direction-of-arrival estimation by a single sensor. We first present an analysis of the geometry of the underlying measurement space and show how it enables localizing white sources. Then, we extend the solution to more challenging non-white sources like speech by including a source model and considering convex relaxations with group sparsity penalties. We conclude with numerical simulations using an unsophisticated sensing device to validate the theory.
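For the white-source case, a minimal sketch of the matching step might look like the following. It assumes (as an illustration, not as the paper's method) that the scatterer gives each direction a known power signature across frequency, so a white source arriving from direction `d` produces a spectrum proportional to `signatures[d]`, and that the direction is chosen by cosine similarity. The signatures and noise level are synthetic.

```python
import numpy as np

def localize_white(power_spectrum, signatures):
    """Pick the direction whose signature is best aligned with the observed spectrum."""
    s = power_spectrum / np.linalg.norm(power_spectrum)
    H = signatures / np.linalg.norm(signatures, axis=1, keepdims=True)
    scores = H @ s                      # cosine similarity with each direction
    return int(np.argmax(scores)), scores

rng = np.random.default_rng(1)
D, F = 8, 64                            # assumed number of directions and frequency bins
signatures = rng.uniform(0.1, 1.0, size=(D, F))   # hypothetical direction signatures

true_dir = 5
source_power = 3.0                      # unknown in practice; normalization removes it
observed = source_power * signatures[true_dir] + 0.01 * rng.standard_normal(F)

estimate, scores = localize_white(observed, signatures)
print("estimated direction:", estimate, "true direction:", true_dir)
```

Normalizing both the observation and the signatures makes the score independent of the unknown source power. For non-white sources such as speech the observed spectrum is no longer proportional to a signature, which is why the abstract brings in a source model.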