55 research outputs found

    Improved speech presence probability estimation based on wavelet denoising

    A reliable estimator for speech presence probability (SPP) can significantly improve the performance of many speech enhancement algorithms. Previous work showed that a good SPP estimator can be obtained by using a smooth a-posteriori signal-to-noise ratio (SNR) function, which can be achieved by reducing the noise variance when estimating the speech power spectrum. In this paper, a wavelet-based denoising algorithm is proposed for this purpose. We first apply the wavelet transform to the periodogram of a noisy speech signal to generate an oracle that indicates the locations of the noise floor in the periodogram. We then use that oracle to selectively remove the wavelet coefficients of the noise floor in the log multitaper spectrum (MTS) of the noisy speech. The remaining wavelet coefficients are then used to reconstruct a denoised MTS and in turn generate a smooth a-posteriori SNR function. Simulation results show that the new SPP estimator outperforms the traditional approaches and enables a significant improvement in the quality and intelligibility of the enhanced speech. © 2012 IEEE.

    Speech Signal Enhancement through Adaptive Wavelet Thresholding

    This paper demonstrates the application of the Bionic Wavelet Transform (BWT), an adaptive wavelet transform derived from a non-linear auditory model of the cochlea, to the task of speech signal enhancement. Results, measured objectively by signal-to-noise ratio (SNR) and segmental SNR (SSNR) and subjectively by Mean Opinion Score (MOS), are given for additive white Gaussian noise as well as four different types of realistic noise environments. Enhancement is accomplished by thresholding the adapted BWT coefficients, and the results are compared to a variety of speech enhancement techniques, including Ephraim-Malah filtering, iterative Wiener filtering, and spectral subtraction, as well as to wavelet denoising based on a perceptually scaled wavelet packet transform decomposition. Overall results indicate that SNR and SSNR improvements for the proposed approach are comparable to those of the Ephraim-Malah filter, with BWT enhancement giving the best results of all methods for the noisiest (−10 dB and −5 dB input SNR) conditions. Subjective measurements using MOS surveys across a variety of 0 dB SNR noise conditions indicate enhancement quality that is competitive with, but still lower than, results for Ephraim-Malah filtering and iterative Wiener filtering, and higher than the perceptually scaled wavelet method.
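    The thresholding step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a plain one-level Haar transform rather than the BWT, and the function names (`haar_forward`, `haar_inverse`, `soft_threshold`, `denoise`) and the threshold value are hypothetical choices for illustration, not taken from the paper.

```python
import math

def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length sequence."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def soft_threshold(coeffs, thr):
    """Shrink each coefficient toward zero by thr; zero those below thr."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]

def denoise(x, thr):
    """Threshold only the detail coefficients, then reconstruct.

    Assumption: a fixed Haar wavelet stands in for the adaptive BWT.
    """
    approx, detail = haar_forward(x)
    return haar_inverse(approx, soft_threshold(detail, thr))
```

    With a threshold of zero the signal is reconstructed unchanged; with a large threshold every detail coefficient is removed and each sample pair collapses to its mean.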

    Statistical signal processing for echo signals from ultrasound linear and nonlinear scatterers

    Signal processing techniques for the enhancement of marine seismic data

    This thesis presents several signal processing techniques applied to the enhancement of marine seismic data. Marine seismic exploration provides an image of the Earth's subsurface from reflected seismic waves. Because the recorded signals are contaminated by various sources of noise, minimizing their effects with new attenuation techniques is necessary. A statistical analysis of background noise is conducted using Thomson's multitaper spectral estimator and Parzen's amplitude density estimator. The results provide a statistical characterization of the noise, which we use to derive signal enhancement algorithms. First, we focus on single-azimuth stacking methodologies and propose novel stacking schemes using either enhanced weighted sums or a Kalman filter. It is demonstrated that the enhanced methods yield superior results: they exhibit cleaner and better-defined reflected events as well as a larger number of reflections in deep waters. A comparison of the proposed stacking methods with existing ones is also discussed. We then address the problem of random noise attenuation and present an innovative application of sparse code shrinkage and independent component analysis. Sparse code shrinkage is a valuable method when a noise-free realization of the data is generated to provide data-driven shrinkages. Several models of distribution are investigated, but the normal inverse Gaussian density yields the best results. Other acceptable choices of density are discussed as well. Finally, we consider the attenuation of flow-generated nonstationary coherent noise and seismic interference noise. We suggest a multiple-input adaptive noise canceller that utilizes a normalized least mean squares algorithm with a variable normalized step size derived as a function of instantaneous frequency. This filter attenuates the coherent noise successfully when used either by itself or in combination with a time-frequency median filter, depending on the noise spectrum and its distribution across the data. Its application to seismic interference attenuation is also discussed.
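    The adaptive noise canceller described in this abstract builds on the normalized least mean squares (NLMS) update. The following is a minimal sketch only: it assumes a single reference input and a fixed step size `mu`, whereas the thesis describes a multiple-input canceller with a step size that varies with instantaneous frequency. The names `nlms_cancel` and `make_demo_signals` and all parameter values are hypothetical.

```python
import random

def nlms_cancel(primary, reference, n_taps=4, mu=0.5, eps=1e-8):
    """Single-reference NLMS noise canceller (fixed step size, an assumption).

    Returns the error signal e[n] = primary[n] - w . u[n], which is the
    cleaned output, together with the final filter weights.
    """
    w = [0.0] * n_taps
    buf = [0.0] * n_taps          # most recent reference samples, newest first
    out = []
    for d, r in zip(primary, reference):
        buf = [r] + buf[:-1]
        y = sum(wi * ui for wi, ui in zip(w, buf))    # noise estimate
        e = d - y                                     # cleaned sample
        norm = eps + sum(ui * ui for ui in buf)       # input power normalization
        w = [wi + mu * e * ui / norm for wi, ui in zip(w, buf)]
        out.append(e)
    return out, w

def make_demo_signals(n=2000, seed=0):
    """Hypothetical test data: the primary channel holds only filtered noise."""
    rng = random.Random(seed)
    reference = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    primary = []
    prev = 0.0
    for r in reference:
        primary.append(0.8 * r - 0.3 * prev)   # noise path: taps 0.8, -0.3
        prev = r
    return primary, reference
```

    In this demo the primary channel contains no signal of interest, so a converged filter should drive the output toward zero and recover the noise-path taps.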

    Wavelet Packet Transform based Speech Enhancement via Two-Dimensional SPP Estimator with Generalized Gamma Priors

    Although various speech enhancement techniques have been developed for different applications, existing methods are limited in environments with high ambient noise levels. Speech presence probability (SPP) estimation is a speech enhancement technique that reduces speech distortions, especially in low signal-to-noise ratio (SNR) scenarios. In this paper, we propose a new two-dimensional (2D) Teager energy operator (TEO) improved SPP estimator for speech enhancement in the time-frequency (T-F) domain. The wavelet packet transform (WPT), a multiband decomposition technique, is used to concentrate the energy distribution of speech components. A minimum mean-square error (MMSE) estimator is obtained based on the generalized gamma distribution speech model in the WPT domain. In addition, speech samples corrupted by environmental and occupational noises (i.e., machine shop, factory and station) at different input SNRs are used to validate the proposed algorithm. Results suggest that the proposed method achieves a significant enhancement in perceptual quality compared with four conventional speech enhancement algorithms (i.e., MMSE-84, MMSE-04, Wiener-96, and BTW).
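    For readers unfamiliar with the Teager energy operator named in this abstract, the standard one-dimensional discrete form is Ψ[x(n)] = x(n)² − x(n−1)·x(n+1). The sketch below implements only that 1D form; the paper's 2D time-frequency variant is not reproduced here, and the function name `teager_energy` and the test tone are hypothetical.

```python
import math

def teager_energy(x):
    """Discrete 1D Teager energy: x[n]^2 - x[n-1]*x[n+1] for interior samples.

    The output is two samples shorter than the input because the first and
    last samples have no neighbor on one side.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# Usage: for a pure tone A*cos(w*n) the operator is constant, A^2 * sin(w)^2,
# so it tracks both amplitude and frequency of the tone.
tone = [2.0 * math.cos(0.3 * n) for n in range(50)]
energy = teager_energy(tone)
```

    Because the output depends on both amplitude and frequency, the operator is often used to separate speech-like components from noise.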