436 research outputs found
Convolutive Blind Source Separation Methods
In this chapter, we provide an overview of existing algorithms for blind source separation of convolutive audio mixtures. We provide a taxonomy, wherein many of the existing algorithms can be organized, and we present published results from those algorithms that have been applied to real-world audio separation tasks
Robust equalization of multichannel acoustic systems
In most real-world acoustical scenarios, speech signals captured by distant microphones from a source are reverberated due to multipath propagation, and the reverberation may impair speech intelligibility. Speech dereverberation can be achieved
by equalizing the channels from the source to microphones. Equalization systems can
be computed using estimates of multichannel acoustic impulse responses. However,
the estimates obtained from system identification always include errors; the fact that
an equalization system is able to equalize the estimated multichannel acoustic system does not mean that it is able to equalize the true system. The objective of this
thesis is to propose and investigate robust equalization methods for multichannel
acoustic systems in the presence of system identification errors.
Equalization systems can be computed using the multiple-input/output inverse theorem or multichannel least-squares method. However, equalization systems
obtained from these methods are very sensitive to system identification errors. A
study of the multichannel least-squares method with respect to two classes of characteristic channel zeros is conducted. Accordingly, a relaxed multichannel least-
squares method is proposed. Channel shortening in connection with the multiple-
input/output inverse theorem and the relaxed multichannel least-squares method is
discussed.
Two algorithms taking into account the system identification errors are developed. Firstly, an optimally-stopped weighted conjugate gradient algorithm is
proposed. A conjugate gradient iterative method is employed to compute the equalization system. The iteration process is stopped optimally with respect to system identification errors. Secondly, a system-identification-error-robust equalization
method exploring the use of error models is presented, which incorporates system
identification error models in the weighted multichannel least-squares formulation
Neural Speech Enhancement with Very Low Algorithmic Latency and Complexity via Integrated Full- and Sub-Band Modeling
We propose FSB-LSTM, a novel long short-term memory (LSTM) based architecture
that integrates full- and sub-band (FSB) modeling, for single- and
multi-channel speech enhancement in the short-time Fourier transform (STFT)
domain. The model maintains an information highway to flow an over-complete
input representation through multiple FSB-LSTM modules. Each FSB-LSTM module
consists of a full-band block to model spectro-temporal patterns at all
frequencies and a sub-band block to model patterns within each sub-band, where
each of the two blocks takes a down-sampled representation as input and returns
an up-sampled discriminative representation to be added to the block input via
a residual connection. The model is designed to have a low algorithmic
complexity, a small run-time buffer and a very low algorithmic latency, at the
same time producing a strong enhancement performance on a noisy-reverberant
speech enhancement task even if the hop size is as low as ms.Comment: in ICASSP 202
Probabilistic Modeling Paradigms for Audio Source Separation
This is the author's final version of the article, first published as E. Vincent, M. G. Jafari, S. A. Abdallah, M. D. Plumbley, M. E. Davies. Probabilistic Modeling Paradigms for Audio Source Separation. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 7, pp. 162-185. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch007file: VincentJafariAbdallahPD11-probabilistic.pdf:v\VincentJafariAbdallahPD11-probabilistic.pdf:PDF owner: markp timestamp: 2011.02.04file: VincentJafariAbdallahPD11-probabilistic.pdf:v\VincentJafariAbdallahPD11-probabilistic.pdf:PDF owner: markp timestamp: 2011.02.04Most sound scenes result from the superposition of several sources, which can be separately perceived and analyzed by human listeners. Source separation aims to provide machine listeners with similar skills by extracting the sounds of individual sources from a given scene. Existing separation systems operate either by emulating the human auditory system or by inferring the parameters of probabilistic sound models. In this chapter, the authors focus on the latter approach and provide a joint overview of established and recent models, including independent component analysis, local time-frequency models and spectral template-based models. They show that most models are instances of one of the following two general paradigms: linear modeling or variance modeling. They compare the merits of either paradigm and report objective performance figures. They also,conclude by discussing promising combinations of probabilistic priors and inference algorithms that could form the basis of future state-of-the-art systems
DAMAS Processing for a Phased Array Study in the NASA Langley Jet Noise Laboratory
A jet noise measurement study was conducted using a phased microphone array system for a range of jet nozzle configurations and flow conditions. The test effort included convergent and convergent/divergent single flow nozzles, as well as conventional and chevron dual-flow core and fan configurations. Cold jets were tested with and without wind tunnel co-flow, whereas, hot jets were tested only with co-flow. The intent of the measurement effort was to allow evaluation of new phased array technologies for their ability to separate and quantify distributions of jet noise sources. In the present paper, the array post-processing method focused upon is DAMAS (Deconvolution Approach for the Mapping of Acoustic Sources) for the quantitative determination of spatial distributions of noise sources. Jet noise is highly complex with stationary and convecting noise sources, convecting flows that are the sources themselves, and shock-related and screech noise for supersonic flow. The analysis presented in this paper addresses some processing details with DAMAS, for the array positioned at 90 (normal) to the jet. The paper demonstrates the applicability of DAMAS and how it indicates when strong coherence is present. Also, a new approach to calibrating the array focus and position is introduced and demonstrated
- …