44 research outputs found
Codebook-based Bayesian speech enhancement for nonstationary environments
In this paper, we propose a Bayesian minimum mean squared error approach for the joint estimation of the short-term predictor parameters of speech and noise, from the noisy observation. We use trained codebooks of speech and noise linear predictive coefficients to model the a priori information required by the Bayesian scheme. In contrast to current Bayesian estimation approaches that consider the excitation variances as part of the a priori information, in the proposed method they are computed online for each short-time segment, based on the observation at hand. Consequently, the method performs well in nonstationary noise conditions. The resulting estimates of the speech and noise spectra can be used in a Wiener filter or any state-of-the-art speech enhancement system. We develop both memoryless (using information from the current frame alone) and memory-based (using information from the current and previous frames) estimators. Estimation of functions of the short-term predictor parameters is also addressed, in particular one that leads to the minimum mean squared error estimate of the clean speech signal. Experiments indicate that the scheme proposed in this paper performs significantly better than competing method
Two-channel speech denoising through minimum tracking
A blind two-channel interference reduction algorithm to suppress localised interferers in reverberant environments is presented. The algorithm requires neither knowledge of source positions nor a speech-free noise reference. The goal is to estimate the speech signal as observed at one of the microphones, without any additional filtering effects that are typical in convolutive blind source separation
Agaricales em áreas de Floresta Ombrófila Densa e plantações de Pinus no Estado de Santa Catarina, Brasil
Analysis-by-synthesis speech coding based on relaxed waveform-matching constraints
Electrical Engineering, Mathematics and Computer Scienc
Recommended from our members
Regularized linear prediction all-pole models
For many cases of voiced speech, linear prediction (LP) based all-pole spectral envelopes exhibit unnatural vocal tract transfer functions that underestimate the formant bandwidths. To obtain smoother contoured all-pole spectral envelopes, we employ a regularization measure which discourages nonsmooth behavior of the transfer function. In particular, we demonstrate how a simple regularization scheme can be incorporated into the LP framework without the need for iterative numerical optimization or spectral sampling. Our results indicate that regularized LP all-pole models can provide more accurate vocal tract transfer function modeling than conventional LP, particularly at the formants
Room Acoustical Parameter Estimation from Room Impulse Responses Using Deep Neural Networks
We describe a new method to estimate the geometry of a room and reflection coefficients given room impulse responses. The method utilizes convolutional neural networks to estimate the room geometry and multilayer perceptrons to estimate the reflection coefficients. The mean square error is used as the loss function. In contrast to existing methods, we do not require the knowledge of the relative positions of sources and receivers in the room. The method can be used with only a single RIR between one source and one receiver. For simulated environments, the proposed estimation method can achieve an average of 0.04 m accuracy for each dimension in room geometry estimation and 0.09 accuracy in reflection coefficients. For real-world environments, the room geometry estimation method achieves an accuracy of an average of 0.065 m for each dimension.</p
Derivation and Analysis of the Primal-Dual Method of Multipliers Based on Monotone Operator Theory
In this paper, we present a novel derivation of an existing algorithm for distributed optimization termed the primal-dual method of multipliers (PDMM). In contrast to its initial derivation, monotone operator theory is used to connect PDMM with other first-order methods such as Douglas-Rachford splitting and the alternating direction method of multipliers, thus, providing insight into its operation. In particular, we show how PDMM combines a lifted dual form in conjunction with Peaceman-Rachford splitting to facilitate distributed optimization in undirected networks. We additionally demonstrate sufficient conditions for primal convergence for strongly convex differentiable functions and strengthen this result for strongly convex functions with Lipschitz continuous gradients by introducing a primal geometric convergence bound.</p
Recommended from our members
Spectral Envelope Estimation and Regularization
A well-known problem with linear prediction is that its estimate of the spectral envelope often has sharp peaks for high-pitch speakers. These peaks are anomalies resulting from contamination of the spectral envelope by the spectral fine structure. We investigate the method of regularized linear prediction to find a better estimate of the spectral envelope and compare the method to the commonly used approach of bandwidth expansion. We present simulations over voiced frames of female speakers from the TIMIT database, where the envelope modeling accuracy is measured using a log spectral distortion measure. We also investigate the coding properties of the methods. The results indicate that the new regularized LP method is superior to bandwidth expansion, with an insignificant increase in computational complexit