Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings
We tackle the multi-party speech recovery problem by modeling the
acoustics of the reverberant chambers. Our approach exploits structured sparsity
models to perform room modeling and speech recovery. We propose a scheme for
characterizing the room acoustics from the unknown competing speech sources
relying on localization of the early images of the speakers by sparse
approximation of the spatial spectra of the virtual sources in a free-space
model. The images are then clustered exploiting the low-rank structure of the
spectro-temporal components belonging to each source. This enables us to
identify the early support of the room impulse response function and its unique
map to the room geometry. To further tackle the ambiguity of the reflection
ratios, we propose a novel formulation of the reverberation model and estimate
the absorption coefficients through a convex optimization exploiting a joint
sparsity model formulated upon the spatio-spectral sparsity of the concurrent speech
representation. The acoustic parameters are then incorporated for separating
individual speech signals through either structured sparse recovery or inverse
filtering of the acoustic channels. Experiments conducted on real data
recordings demonstrate the effectiveness of the proposed approach for
multi-party speech recovery and recognition.
Comment: 31 pages
PSD Estimation of Multiple Sound Sources in a Reverberant Room Using a Spherical Microphone Array
We propose an efficient method to estimate source power spectral densities
(PSDs) in a multi-source reverberant environment using a spherical microphone
array. The proposed method utilizes the spatial correlation between the
spherical harmonics (SH) coefficients of a sound field to estimate source PSDs.
The use of the spatial cross-correlation of the SH coefficients allows us to
employ the method in an environment with a higher number of sources compared to
conventional methods. Furthermore, the orthogonality property of the SH basis
functions saves the effort of designing specific beampatterns of a conventional
beamformer-based method. We evaluate the performance of the algorithm with
different numbers of sources in practical reverberant and non-reverberant rooms.
We also demonstrate an application of the method by separating source signals
using a conventional beamformer and a Wiener post-filter designed from the
estimated PSDs.
Comment: Accepted for WASPAA 201
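The last step of the abstract above, a Wiener post-filter built from estimated PSDs, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `wiener_gain`, the `noise_psd` and `floor` parameters, and the textbook gain formula (target PSD divided by the total PSD of all sources plus noise) are assumptions made for illustration only.

```python
import numpy as np

def wiener_gain(target_psd, interferer_psds, noise_psd=0.0, floor=1e-12):
    """Per-frequency Wiener gain for one source (hypothetical helper).

    target_psd:      estimated PSD of the desired source, shape (F,)
    interferer_psds: estimated PSDs of the other sources, shape (N, F)
    noise_psd:       scalar or shape-(F,) noise PSD (assumed known)
    floor:           lower bound on the denominator to avoid division by zero
    """
    target_psd = np.asarray(target_psd, dtype=float)
    interferer_psds = np.atleast_2d(np.asarray(interferer_psds, dtype=float))
    # Total power in each frequency bin: target + all interferers + noise.
    total = target_psd + interferer_psds.sum(axis=0) + noise_psd
    # Textbook Wiener gain: fraction of the bin's power owned by the target.
    return target_psd / np.maximum(total, floor)

# Two frequency bins, one target and two interfering sources.
gains = wiener_gain([1.0, 2.0], [[1.0, 2.0], [2.0, 0.0]])
```

In a separation pipeline of the kind the abstract describes, such a gain would be multiplied bin by bin with the beamformer output spectrum; how the PSDs themselves are estimated is the subject of the paper and is not reproduced here.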
Sound Source Separation
This is the author's accepted pre-print of the article, first published as G. Evangelista, S. Marchand, M. D. Plumbley and E. Vincent. Sound source separation. In U. Zölzer (ed.), DAFX: Digital Audio Effects, 2nd edition, Chapter 14, pp. 551-588. John Wiley & Sons, March 2011. ISBN 9781119991298. DOI: 10.1002/9781119991298.ch14
Component separation methods for the Planck mission
The Planck satellite will map the full sky at nine frequencies from 30 to 857
GHz. The CMB intensity and polarization that are its prime targets are
contaminated by foreground emission. The goal of this paper is to compare
proposed methods for separating CMB from foregrounds based on their different
spectral and spatial characteristics, and to separate the foregrounds into
components of different physical origin. A component separation challenge has
been organized, based on a set of realistically complex simulations of sky
emission. Several methods including those based on internal template
subtraction, maximum entropy method, parametric method, spatial and harmonic
cross correlation methods, and independent component analysis have been tested.
Different methods proved to be effective in cleaning the CMB maps from
foreground contamination, in reconstructing maps of diffuse Galactic emissions,
and in detecting point sources and thermal Sunyaev-Zeldovich signals. The power
spectrum of the residuals is, on the largest scales, four orders of magnitude
lower than that of the input Galaxy power spectrum at the foreground minimum.
The CMB power spectrum was accurately recovered up to the sixth acoustic peak.
The point source detection limit reaches 100 mJy, and about 2300 clusters are
detected via the thermal SZ effect on two thirds of the sky. We have found that
no single method performs best for all scientific objectives. We foresee that
the final component separation pipeline for Planck will involve a combination
of methods and iterations between processing steps targeted at different
objectives such as diffuse component separation, spectral estimation and
compact source extraction.
Comment: Matches version accepted by A&A. A version with high resolution
figures is available at http://people.sissa.it/~leach/compsepcomp.pd
Spatial dissection of a soundfield using spherical harmonic decomposition
A real-world soundfield is often produced by multiple desired and undesired sound sources. The performance of many acoustic systems, such as automatic speech recognition, audio surveillance, and teleconferencing, relies on their ability to extract the desired sound components in such a mixed environment. Existing solutions to this problem are constrained by various fundamental limitations and require enforcing different priors depending on acoustic conditions such as reverberation and the spatial distribution of sound sources. With the growing emphasis on and integration of audio applications in diverse technologies such as smart home and virtual reality appliances, it is imperative to advance source separation technology in order to overcome the limitations of the traditional approaches.
To that end, we exploit the harmonic decomposition model to dissect a mixed soundfield into its underlying desired and undesired components based on source and signal characteristics. By analysing the spatial projection of a soundfield, we achieve multiple outcomes such as (i) soundfield separation with respect to distinct source regions, (ii) source separation in a mixed soundfield using a modal coherence model, and (iii) direction of arrival (DOA) estimation of multiple overlapping sound sources through pattern recognition of the modal coherence of a soundfield.
We first employ an array of higher order microphones for soundfield separation in order to reduce hardware requirements and implementation complexity. Subsequently, we develop novel mathematical models for the modal coherence of noisy and reverberant soundfields that facilitate convenient ways of estimating DOA and power spectral densities, leading to robust source separation algorithms. The modal domain approach to soundfield/source separation allows us to circumvent several practical limitations of the existing techniques and enhance the performance and robustness of the system. The proposed methods are presented with several practical applications and performance evaluations using simulated and real-life datasets.
Sampling Sparse Signals on the Sphere: Algorithms and Applications
We propose a sampling scheme that can perfectly reconstruct a collection of
spikes on the sphere from samples of their lowpass-filtered observations.
Central to our algorithm is a generalization of the annihilating filter method,
a tool widely used in array signal processing and finite-rate-of-innovation
(FRI) sampling. The proposed algorithm can reconstruct $K$ spikes from
$(K + \sqrt{K})^2$ spatial samples. This sampling requirement improves over
previously known FRI sampling schemes on the sphere by a factor of four for
large $K$. We showcase the versatility of the proposed algorithm by applying it
to three different problems: 1) sampling diffusion processes induced by
localized sources on the sphere, 2) shot noise removal, and 3) sound source
localization (SSL) by a spherical microphone array. In particular, we show how
SSL can be reformulated as a spherical sparse sampling problem.
Comment: 14 pages, 8 figures, submitted to IEEE Transactions on Signal
Processing
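For readers unfamiliar with the annihilating filter method that the abstract above generalizes, here is a minimal sketch of the classic one-dimensional version. It recovers spike locations on the unit interval, not on the sphere, and it is not the paper's algorithm. The function name `annihilating_filter_locations` and the convention that the input holds the Fourier coefficients x_hat[m] = sum_k a_k exp(-2j*pi*m*t_k) for m = 0..2K-1 are assumptions chosen for illustration.

```python
import numpy as np

def annihilating_filter_locations(x_hat, K):
    """Recover K spike locations t_k in [0, 1) (hypothetical 1-D helper).

    x_hat: 2K noiseless Fourier coefficients,
           x_hat[m] = sum_k a_k * exp(-2j*pi*m*t_k), m = 0..2K-1.
    """
    x_hat = np.asarray(x_hat, dtype=complex)
    # Toeplitz system: sum_l h[l] * x_hat[m - l] = 0 for m = K..2K-1.
    T = np.array([[x_hat[m - l] for l in range(K + 1)]
                  for m in range(K, 2 * K)])
    # The annihilating filter h spans the null space of T; take the right
    # singular vector associated with the smallest singular value.
    _, _, Vh = np.linalg.svd(T)
    h = Vh[-1].conj()
    # The roots of the filter polynomial are u_k = exp(-2j*pi*t_k).
    roots = np.roots(h)
    t = np.mod(-np.angle(roots) / (2 * np.pi), 1.0)
    return np.sort(t)

# Two spikes with known locations and amplitudes (synthetic data).
t_true = np.array([0.2, 0.7])
a_true = np.array([1.0, 0.5])
m = np.arange(4)
x_hat = (a_true * np.exp(-2j * np.pi * np.outer(m, t_true))).sum(axis=1)
t_est = annihilating_filter_locations(x_hat, K=2)
```

Once the locations are known, the amplitudes follow from a linear least-squares fit. The sketch assumes noiseless data; with noise, a denoising step on the Toeplitz matrix is commonly added before the null-space computation.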