Acoustic Space Learning for Sound Source Separation and Localization on Binaural Manifolds
In this paper we address the problems of modeling the acoustic space
generated by a full-spectrum sound source and of using the learned model for
the localization and separation of multiple sources that simultaneously emit
sparse-spectrum sounds. We lay theoretical and methodological grounds in order
to introduce the binaural manifold paradigm. We perform an in-depth study of
the latent low-dimensional structure of the high-dimensional interaural
spectral data, based on a corpus recorded with a human-like audiomotor robot
head. A non-linear dimensionality reduction technique is used to show that
these data lie on a two-dimensional (2D) smooth manifold parameterized by the
motor states of the listener, or equivalently, the sound source directions. We
propose a probabilistic piecewise affine mapping model (PPAM) specifically
designed to deal with high-dimensional data exhibiting an intrinsic piecewise
linear structure. We derive a closed-form expectation-maximization (EM)
procedure for estimating the model parameters, followed by Bayes inversion for
obtaining the full posterior density function of a sound source direction. We
extend this solution to deal with missing data and redundancy in real world
spectrograms, and hence for 2D localization of natural sound sources such as
speech. We further generalize the model to the challenging case of multiple
sound sources and we propose a variational EM framework. The associated
algorithm, referred to as variational EM for source separation and localization
(VESSL), yields a Bayesian estimation of the 2D locations and time-frequency
masks of all the sources. Comparisons of the proposed approach with several
existing methods reveal that the combination of acoustic-space learning with
Bayesian inference enables our method to outperform state-of-the-art methods.
Comment: 19 pages, 9 figures, 3 tables
Probabilistic Modeling Paradigms for Audio Source Separation
This is the author's final version of the article, first published as E. Vincent, M. G. Jafari, S. A. Abdallah, M. D. Plumbley, M. E. Davies. Probabilistic Modeling Paradigms for Audio Source Separation. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 7, pp. 162-185. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch007
Most sound scenes result from the superposition of several sources, which can be separately perceived and analyzed by human listeners. Source separation aims to provide machine listeners with similar skills by extracting the sounds of individual sources from a given scene. Existing separation systems operate either by emulating the human auditory system or by inferring the parameters of probabilistic sound models. In this chapter, the authors focus on the latter approach and provide a joint overview of established and recent models, including independent component analysis, local time-frequency models and spectral template-based models. They show that most models are instances of one of the following two general paradigms: linear modeling or variance modeling. They compare the merits of each paradigm and report objective performance figures. They also conclude by discussing promising combinations of probabilistic priors and inference algorithms that could form the basis of future state-of-the-art systems.
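As a rough illustration of the variance modeling paradigm named in this abstract, the sketch below applies a single-channel Wiener-style gain in one time-frequency bin: each source is assumed to be zero-mean Gaussian with a known variance, and its estimate is the mixture coefficient scaled by that source's share of the total variance. This is a minimal sketch under those assumptions and is not taken from the chapter; the function names `wiener_gains` and `separate_bin` and the example variances are hypothetical.

```python
def wiener_gains(source_variances):
    """Return one gain per source: its variance divided by the total variance.

    Assumes each source is zero-mean Gaussian in this time-frequency bin
    (an illustrative assumption, not a claim about any specific system).
    """
    total = sum(source_variances)
    if total <= 0:
        raise ValueError("total variance must be positive")
    return [v / total for v in source_variances]


def separate_bin(mixture_coeff, source_variances):
    """Estimate each source's coefficient in a single time-frequency bin."""
    return [g * mixture_coeff for g in wiener_gains(source_variances)]


# Hypothetical example: one complex mixture coefficient, two sources.
estimates = separate_bin(complex(2.0, -1.0), [3.0, 1.0])
# The estimates always sum back to the mixture coefficient.
reconstructed = sum(estimates)
```

Because the gains sum to one, the per-source estimates add up to the observed mixture coefficient, which is a convenient sanity check for this kind of mask.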
A robust sequential hypothesis testing method for brake squeal localisation
This contribution deals with the in situ detection and localisation of brake squeal in an automobile. As brake squeal is emitted from regions known a priori, i.e., near the wheels, the localisation is treated as a hypothesis testing problem. Distributed microphone arrays, situated under the automobile, are used to capture the directional properties of the sound field generated by a squealing brake. The spatial characteristics of the sampled sound field are then used to formulate the hypothesis tests. However, in contrast to standard hypothesis testing approaches of this kind, the propagation environment is complex and time-varying. Coupled with inaccuracies in the knowledge of the sensor and source positions as well as sensor gain mismatches, modelling the sound field is difficult and standard approaches fail in this case. A previously proposed approach implicitly tried to account for such incomplete system knowledge and was based on ad hoc likelihood formulations. The current paper builds upon this approach and proposes a second approach, based on more solid theoretical foundations, that can systematically account for the model uncertainties. Results from tests in a real setting show that the proposed approach is more consistent than the prior state-of-the-art. In both approaches, the tasks of detection and localisation are decoupled for complexity reasons. The localisation (hypothesis testing) is subject to a prior detection of brake squeal and identification of the squeal frequencies. The approaches used for the detection and identification of squeal frequencies are also presented. The paper also briefly addresses some practical issues related to array design and placement. (C) 2019 Author(s)
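The idea of treating localisation as a choice among a few candidate positions known in advance can be sketched with a generic sequential test. The example below accumulates Gaussian log-likelihoods for each candidate and stops once the best candidate leads the runner-up by a fixed margin. It is a textbook-style illustration only, not the robust method proposed in the paper: the scalar feature, the Gaussian noise model, the threshold and all names (`sequential_test`, `hypothesis_means`) are assumptions.

```python
import math


def gaussian_loglik(x, mean, sigma):
    """Log-density of a scalar Gaussian observation."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2 * sigma ** 2)


def sequential_test(observations, hypothesis_means, sigma=1.0, threshold=5.0):
    """Accumulate log-likelihoods per hypothesis and stop early when decisive.

    Returns (index of the chosen hypothesis, number of samples used),
    or (None, number of samples) if the margin is never reached.
    Requires at least two hypotheses.
    """
    totals = [0.0] * len(hypothesis_means)
    for n, x in enumerate(observations, start=1):
        for i, mean in enumerate(hypothesis_means):
            totals[i] += gaussian_loglik(x, mean, sigma)
        ranked = sorted(range(len(totals)), key=lambda i: totals[i], reverse=True)
        # Decide once the leader beats the runner-up by the chosen margin.
        if totals[ranked[0]] - totals[ranked[1]] >= threshold:
            return ranked[0], n
    return None, len(observations)


# Hypothetical feature values and three candidate source positions.
decision, samples_used = sequential_test([1.9, 2.1, 2.0, 2.2, 1.8], [0.0, 2.0, 4.0])
```

With these made-up numbers the test settles on the second candidate after three samples; a larger threshold trades decision speed for confidence.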
Neural Connectivity with Hidden Gaussian Graphical State-Model
Noninvasive procedures for estimating neural connectivity are under question.
Theoretical models hold that the electromagnetic field registered at
external sensors is elicited by currents in neural space. Nevertheless, what we
observe in sensor space is a superposition of projected fields from the
whole gray matter. This is the reason for a major pitfall of noninvasive
electrophysiology methods: distorted reconstruction of neural activity and its
connectivity, known as leakage. It has been proven that current methods produce
incorrect connectomes. Somewhat related to the incorrect connectivity
modelling, they disregard either Systems Theory or Bayesian Information
Theory. We introduce a new formalism that accounts for this, the Hidden Gaussian
Graphical State-Model (HIGGS): a neural Gaussian Graphical Model (GGM) hidden
by the observation equation of Magneto-encephalographic (MEEG) signals. HIGGS
is equivalent to a frequency-domain Linear State Space Model (LSSM) but with
a sparse connectivity prior. The mathematical contribution here is the theory for
high-dimensional and frequency-domain HIGGS solvers. We demonstrate that HIGGS
can attenuate the leakage effect in the most critical case: the distortion of the EEG
signal due to head volume conduction heterogeneities. Its application in EEG is
illustrated with retrieved connectivity patterns from human Steady State Visual
Evoked Potentials (SSVEP). We provide for the first time confirmatory evidence
for noninvasive procedures of neural connectivity: concurrent EEG and
Electrocorticography (ECoG) recordings in monkey. Open source packages are
freely available online, to reproduce the results presented in this paper and
to analyze external MEEG databases
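To make the Gaussian Graphical Model notion in this abstract concrete, here is a minimal sketch that is not drawn from the HIGGS packages. In a GGM, a zero entry in the precision (inverse covariance) matrix K means the two variables are conditionally independent given the others, and the partial correlation between variables i and j is -K[i][j] / sqrt(K[i][i] * K[j][j]). The sketch uses a real-valued toy matrix for a three-node chain, whereas the paper works with frequency-domain models; the function name `partial_correlations` and the matrix values are assumptions.

```python
import math


def partial_correlations(precision):
    """Convert a precision matrix (list of lists) into partial correlations.

    Off-diagonal entries are -K[i][j] / sqrt(K[i][i] * K[j][j]);
    the diagonal is set to 1.0 by convention.
    """
    n = len(precision)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                result[i][j] = 1.0
            else:
                scale = math.sqrt(precision[i][i] * precision[j][j])
                result[i][j] = -precision[i][j] / scale
    return result


# Toy chain graph 0 - 1 - 2: nodes 0 and 2 are not directly connected,
# so their precision entry (and partial correlation) is zero.
K = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
]
P = partial_correlations(K)
```

A sparse connectivity prior, as mentioned in the abstract, favours precision matrices with many such zero entries, which is why the zero pattern of K is read as the graph of direct connections.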