Adaptive Noise Reduction for Sound Event Detection Using Subband-Weighted NMF
Sound event detection in real-world environments suffers from the interference of non-stationary and time-varying noise. This paper presents an adaptive noise reduction method for sound event detection based on non-negative matrix factorization (NMF). First, a noise dictionary is learned from the noisy input signal using robust NMF, which supports adaptation to noise variations. The estimated noise dictionary is combined with a pre-trained event dictionary to build a supervised source separation framework. Second, to improve the separation quality, we extend the basic NMF model to a weighted form, with the aim of varying the relative importance of the different components when separating a target sound event from noise. With properly designed weights, the separation process is forced to rely more on the dominant event components, while the noise is strongly suppressed. The proposed method is evaluated on the dataset of the rare sound event detection task of the DCASE 2017 challenge, and achieves results comparable to the top-ranking system based on convolutional recurrent neural networks (CRNNs). The proposed weighted NMF method shows excellent noise reduction ability and improves the F-score by 5% compared to the unweighted approach.
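The weighted separation step described in this abstract can be sketched as follows. This is a minimal illustration and not the paper's method: it assumes a weighted squared-error cost, a fixed dictionary `W` whose first column stands for an event atom and whose second column stands for a noise atom, and a hypothetical weight matrix `G` that emphasises the first two frequency bins. All names and numbers are made up for illustration, and plain lists are used to keep the sketch dependency-free.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def weighted_cost(V, W, H, G):
    """Weighted squared error: sum over all bins of G * (V - W H)^2."""
    R = matmul(W, H)
    return sum(g * (v - r) ** 2
               for Vr, Rr, Gr in zip(V, R, G)
               for v, r, g in zip(Vr, Rr, Gr))

def update_activations(V, W, H, G, n_iter=100, eps=1e-12):
    """Multiplicative updates for H with the dictionary W held fixed."""
    Wt = transpose(W)
    GV = [[g * v for g, v in zip(Gr, Vr)] for Gr, Vr in zip(G, V)]
    num = matmul(Wt, GV)  # W^T (G * V) does not change between iterations
    for _ in range(n_iter):
        R = matmul(W, H)
        GR = [[g * r for g, r in zip(Gr, Rr)] for Gr, Rr in zip(G, R)]
        den = matmul(Wt, GR)  # W^T (G * (W H))
        H = [[h * n / (d + eps) for h, n, d in zip(Hr, Nr, Dr)]
             for Hr, Nr, Dr in zip(H, num, den)]
    return H

# Toy data: 3 frequency bins, 2 frames (all values are illustrative).
W = [[1.0, 0.2], [0.5, 0.2], [0.0, 1.0]]   # column 0: event atom, column 1: noise atom
V = [[1.1, 0.3], [0.6, 0.3], [0.9, 1.0]]   # noisy magnitude spectrogram
G = [[2.0, 2.0], [2.0, 2.0], [0.5, 0.5]]   # hypothetical per-bin weights
H0 = [[1.0, 1.0], [1.0, 1.0]]              # initial activations

H = update_activations(V, W, H0, G)
# Event estimate: contribution of the event atom only.
event = [[w_row[0] * h for h in H[0]] for w_row in W]
```

Raising a weight makes the fit in that bin count more in the cost, which is the general mechanism by which a weighted model can lean on event-dominant components.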
Denoising sound signals in a bioinspired non-negative spectro-temporal domain
The representation of sound signals at the cochlea and auditory cortical level has been studied as an alternative to classical analysis methods. In this work, we put forward a recently proposed feature extraction method called approximate auditory cortical representation, based on an approximation to the statistics of discharge patterns at the primary auditory cortex. The proposed approach estimates a non-negative sparse coding with a combined dictionary of atoms. These atoms represent the spectro-temporal receptive fields of the auditory cortical neurons, and are calculated from the auditory spectrograms of clean signal and noise. The denoising is carried out on noisy signals by reconstructing the signal while discarding the atoms corresponding to the noise. Experiments are presented using synthetic (chirps) and real data (speech), in the presence of additive noise. For the evaluation of the new method and its variants, we used two objective measures: the perceptual evaluation of speech quality and the segmental signal-to-noise ratio. Results show that the proposed method improves the quality of the signals, mainly under severe degradation.
Affiliation: Martínez, César Ernesto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
Affiliation: Goddard, J. Universidad Autónoma Metropolitana; México
Affiliation: Di Persia, Leandro Ezequiel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
Affiliation: Milone, Diego Humberto. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina
Affiliation: Rufiner, Hugo Leonardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional. Universidad Nacional del Litoral. Facultad de Ingeniería y Ciencias Hídricas. Instituto de Investigación en Señales, Sistemas e Inteligencia Computacional; Argentina. Universidad Nacional de Entre Ríos. Facultad de Ingeniería; Argentina
Low Rank and Sparsity Analysis Applied to Speech Enhancement via Online Estimated Dictionary
In this letter, we propose a single-channel speech enhancement algorithm based on an online estimated local dictionary, which focuses on low-rank and sparse matrix decomposition. In the proposed algorithm, a noisy speech spectrogram is decomposed into low-rank background noise components and an activation of the online speech dictionary, on which both low-rank and sparsity constraints are imposed. This decomposition takes advantage of the high expressiveness of locally estimated exemplars for speech components and also accommodates nonstationary background noise. The local dictionary is obtained by estimating the speech presence probability (SPP) with the expectation-maximization algorithm, in which a generalized Gamma prior for the speech magnitude spectrum is used. The proposed algorithm is evaluated using signal-to-distortion ratio and perceptual evaluation of speech quality. The results show that the proposed algorithm achieves significant improvements at various SNRs when compared to four other speech enhancement algorithms, including an improved Karhunen–Loève transform approach, SPP-based MMSE, nonnegative matrix factorization-based robust principal component analysis (RPCA), and RPCA.
Adaptive Hidden Markov Noise Modelling for Speech Enhancement
A robust and reliable noise estimation algorithm is required in many speech enhancement systems. The aim of this thesis is to propose and evaluate a robust noise estimation algorithm for highly non-stationary noisy environments. In this work, we model the non-stationary noise using a set of discrete states, with each state representing a distinct noise power spectrum. In this approach, the state sequence over time is conveniently represented by a Hidden Markov Model (HMM). In this thesis, we first present an online HMM re-estimation framework that models time-varying noise using an HMM and tracks changes in noise characteristics through a sequential model update procedure applied during the absence of speech. In addition, the algorithm will, when necessary, create new model states to represent novel noise spectra and merge existing states that have similar characteristics. We then extend our work to robust noise estimation during speech activity by incorporating a speech model into our existing noise model. The noise characteristics within each state are updated based on a speech presence probability, which is derived from a modified minima-controlled recursive averaging method. We have demonstrated the effectiveness of our noise HMM in tracking both stationary and highly non-stationary noise, and shown that it gives improved performance over other conventional noise estimation methods when it is incorporated into a standard speech enhancement algorithm.
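The idea of updating a noise spectrum according to a speech presence probability can be sketched with the conventional recursive-averaging rule, in which the smoothing factor moves toward 1 as speech becomes more likely. This is only an illustration of that general rule, not the thesis's HMM state update or its modified probability estimator; the function name, the default `alpha`, and the toy values are assumptions.

```python
def update_noise_psd(noise_psd, frame_psd, speech_prob, alpha=0.95):
    """Recursive noise update gated by speech presence probability.

    noise_psd   : current noise power estimate per frequency bin
    frame_psd   : power of the current noisy frame per frequency bin
    speech_prob : speech presence probability per bin, in [0, 1]
    alpha       : base smoothing factor (assumed value)
    """
    updated = []
    for n, y, p in zip(noise_psd, frame_psd, speech_prob):
        # p = 1 gives a = 1, which freezes the estimate;
        # p = 0 gives a = alpha, which is plain exponential smoothing.
        a = alpha + (1.0 - alpha) * p
        updated.append(a * n + (1.0 - a) * y)
    return updated

# Toy frame: bin 0 is judged to contain speech, bin 1 is judged noise-only.
noise = [1.0, 2.0]
frame = [3.0, 4.0]
new_noise = update_noise_psd(noise, frame, speech_prob=[1.0, 0.0], alpha=0.9)
```

In a multi-state noise model, a rule of this kind would be applied to the spectrum of whichever state is currently active; how states are selected, created, and merged is outside this sketch.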
A new weighted NMF algorithm for missing data interpolation and its application to speech enhancement
In this paper we present a novel weighted NMF (WNMF) algorithm for interpolating missing data. The proposed approach has a computational cost equivalent to that of standard NMF and, additionally, has the flexibility to control the degree of interpolation in the missing data regions. Existing WNMF methods do not offer this capability and, thereby, tend to overestimate the values in the masked regions. By constraining the estimates of the missing-data regions, the proposed approach allows for a better trade-off in the interpolation. We further demonstrate the applicability of WNMF and missing data estimation to the problem of speech enhancement. In this preliminary work, we consider the improvement obtainable by applying the proposed method to ideal binary mask-based gain functions. The instrumental quality metrics (PESQ and SNR) clearly indicate the added benefit of the missing data interpolation, compared to the output of the ideal binary mask. This preliminary work opens up novel possibilities not only in the field of speech enhancement but also, more generally, in the field of missing data interpolation using NMF
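For orientation, the conventional masked formulation that the abstract contrasts itself with can be written as below. This is the standard weighted Euclidean NMF with a binary mask, not the algorithm proposed in the paper. The notation is assumed here: $V$ is the non-negative magnitude spectrogram, $M$ is a binary mask (1 for observed bins, 0 for missing bins), $W$ and $H$ are the non-negative factors, $\odot$ is the element-wise product, and the fractions are element-wise.

```latex
\begin{align*}
  \min_{W \ge 0,\; H \ge 0} \;& \bigl\| M \odot (V - WH) \bigr\|_F^2 \\
  H &\leftarrow H \odot \frac{W^{\top}(M \odot V)}{W^{\top}\bigl(M \odot (WH)\bigr)} \\
  W &\leftarrow W \odot \frac{(M \odot V)\,H^{\top}}{\bigl(M \odot (WH)\bigr)\,H^{\top}} \\
  \hat{V} &= M \odot V + (1 - M) \odot (WH)
\end{align*}
```

Because the cost ignores bins where $M = 0$, the values of $WH$ there are determined only by the fit to the observed bins, which is the setting in which the abstract reports that existing methods tend to overestimate.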
Speech enhancement by perceptual adaptive wavelet de-noising
This thesis summarizes and compares the existing wavelet de-noising methods. The most popular methods of wavelet transform, adaptive thresholding, and musical noise suppression have been analyzed theoretically and evaluated through Matlab simulation. Based on this work, a new speech enhancement system using adaptive wavelet de-noising is proposed. Each step of the standard wavelet thresholding is improved by optimized adaptive algorithms. The quantile-based adaptive noise estimate and the a posteriori SNR-based threshold adjuster are complementary to each other. Their combination integrates the advantages of the two approaches and balances the effects of noise removal and speech preservation. In order to improve the final perceptual quality, an innovative musical noise analysis and smoothing algorithm and a Teager Energy Operator-based silent segment smoothing module are also introduced into the system. The experimental results have demonstrated the capability of the proposed system in both stationary and non-stationary noise environments.
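The standard wavelet thresholding that the thesis takes as its starting point can be sketched as below. This is a textbook baseline and not the proposed adaptive system: it assumes a single-level Haar transform, a noise level estimated from the median absolute detail coefficient (one simple quantile-style estimate), the universal threshold, and soft thresholding. Function names are illustrative, and the input length is assumed to be even.

```python
import math
import statistics

def haar_forward(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward."""
    s = 1.0 / math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) * s, (a - d) * s])
    return x

def soft_threshold(c, t):
    """Shrink a coefficient toward zero by t, zeroing it if |c| <= t."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def denoise(x):
    """Threshold the detail coefficients and reconstruct the signal."""
    approx, detail = haar_forward(x)
    # Noise standard deviation from the median absolute detail coefficient.
    sigma = statistics.median([abs(d) for d in detail]) / 0.6745
    # Universal threshold: sigma * sqrt(2 * ln(N)).
    t = sigma * math.sqrt(2.0 * math.log(len(x)))
    detail = [soft_threshold(d, t) for d in detail]
    return haar_inverse(approx, detail)

# Toy signal: a step with small perturbations (values are illustrative).
noisy = [0.1, -0.1, 0.05, 0.0, 5.0, 5.2, 4.9, 5.1]
clean_estimate = denoise(noisy)
```

A fixed global threshold of this kind is the part that adaptive schemes replace, for example by varying the threshold with the estimated SNR.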