731 research outputs found

    EMD-based filtering (EMDF) of low-frequency noise for speech enhancement

    Get PDF
    An Empirical Mode Decomposition based filtering (EMDF) approach is presented as a post-processing stage for speech enhancement. This method is particularly effective in low frequency noise environments. Unlike previous EMD based denoising methods, this approach does not make the assumption that the contaminating noise signal is fractional Gaussian Noise. An adaptive method is developed to select the IMF index for separating the noise components from the speech based on the second-order IMF statistics. The low frequency noise components are then separated by a partial reconstruction from the IMFs. It is shown that the proposed EMDF technique is able to suppress residual noise from speech signals that were enhanced by the conventional optimallymodified log-spectral amplitude approach which uses a minimum statistics based noise estimate. A comparative performance study is included that demonstrates the effectiveness of the EMDF system in various noise environments, such as car interior noise, military vehicle noise and babble noise. In particular, improvements up to 10 dB are obtained in car noise environments. Listening tests were performed that confirm the results

    Offline and real time noise reduction in speech signals using the discrete wavelet packet decomposition

    Get PDF
    This thesis describes the development of an offline and real time wavelet based speech enhancement system to process speech corrupted with various amounts of white Gaussian noise and other different noise types

    Techniques for enhancing digital images

    Get PDF
    The images obtain from either research studies or optical instruments are often corrupted with noise. Image denoising involves the manipulation of image data to produce a visually high quality image. This thesis reviews the existing denoising algorithms and the filtering approaches available for enhancing images and/or data transmission. Spatial-domain and Transform-domain digital image filtering algorithms have been used in the past to suppress different noise models. The different noise models can be either additive or multiplicative. Selection of the denoising algorithm is application dependent. It is necessary to have knowledge about the noise present in the image so as to select the appropriated denoising algorithm. Noise models may include Gaussian noise, Salt and Pepper noise, Speckle noise and Brownian noise. The Wavelet Transform is similar to the Fourier transform with a completely different merit function. The main difference between Wavelet transform and Fourier transform is that, in the Wavelet Transform, Wavelets are localized in both time and frequency. In the standard Fourier Transform, Wavelets are only localized in frequency. Wavelet analysis consists of breaking up the signal into shifted and scales versions of the original (or mother) Wavelet. The Wiener Filter (mean squared estimation error) finds implementations as a LMS filter (least mean squares), RLS filter (recursive least squares), or Kalman filter. Quantitative measure (metrics) of the comparison of the denoising algorithms is provided by calculating the Peak Signal to Noise Ratio (PSNR), the Mean Square Error (MSE) value and the Mean Absolute Error (MAE) evaluation factors. A combination of metrics including the PSNR, MSE, and MAE are often required to clearly assess the model performance

    Multi-Modal Enhancement Techniques for Visibility Improvement of Digital Images

    Get PDF
    Image enhancement techniques for visibility improvement of 8-bit color digital images based on spatial domain, wavelet transform domain, and multiple image fusion approaches are investigated in this dissertation research. In the category of spatial domain approach, two enhancement algorithms are developed to deal with problems associated with images captured from scenes with high dynamic ranges. The first technique is based on an illuminance-reflectance (I-R) model of the scene irradiance. The dynamic range compression of the input image is achieved by a nonlinear transformation of the estimated illuminance based on a windowed inverse sigmoid transfer function. A single-scale neighborhood dependent contrast enhancement process is proposed to enhance the high frequency components of the illuminance, which compensates for the contrast degradation of the mid-tone frequency components caused by dynamic range compression. The intensity image obtained by integrating the enhanced illuminance and the extracted reflectance is then converted to a RGB color image through linear color restoration utilizing the color components of the original image. The second technique, named AINDANE, is a two step approach comprised of adaptive luminance enhancement and adaptive contrast enhancement. An image dependent nonlinear transfer function is designed for dynamic range compression and a multiscale image dependent neighborhood approach is developed for contrast enhancement. Real time processing of video streams is realized with the I-R model based technique due to its high speed processing capability while AINDANE produces higher quality enhanced images due to its multi-scale contrast enhancement property. Both the algorithms exhibit balanced luminance, contrast enhancement, higher robustness, and better color consistency when compared with conventional techniques. In the transform domain approach, wavelet transform based image denoising and contrast enhancement algorithms are developed. The denoising is treated as a maximum a posteriori (MAP) estimator problem; a Bivariate probability density function model is introduced to explore the interlevel dependency among the wavelet coefficients. In addition, an approximate solution to the MAP estimation problem is proposed to avoid the use of complex iterative computations to find a numerical solution. This relatively low complexity image denoising algorithm implemented with dual-tree complex wavelet transform (DT-CWT) produces high quality denoised images

    Discrete Wavelet Transform Based Cancelable Biometric System for Speaker Recognition

    Get PDF
    The biometric template characteristics and privacy conquest are challenging issues. To resolve such limitations, the cancelable biometric systems have been briefed. In this paper, the efficient cancelable biometric system based on the cryptosystem is introduced. It depends on permutation using a chaotic Baker map and substitution using masks in various transform domains. The proposed cancelable system features extraction phase is based on the Cepstral analysis from the encrypted speech signal in the time domain combined with the encrypted speech signal in the discrete wavelet transform (DWT). Then, the resultant features are applied to the artificial neural network for classification. Furthermore, wavelet denoising is used at the receiver side to enhance the proposed system. The cryptosystem provides a robust protection level of the speech template. This speech template can be replaced and recertified if it is breached. Our proposed system enables the generation of various templates from the same speech signal under the constraint of linkability between them. The simulation results confirmed that the proposed cancelable biometric system achieved higher a level of performance than traditional biometric systems, which achieved 97.5% recognition rate at low signal to noise ratio (SNR) of -25dB and 100% with -15dB and above

    Speech Enhancement with Adaptive Thresholding and Kalman Filtering

    Get PDF
    Speech enhancement has been extensively studied for many years and various speech enhance- ment methods have been developed during the past decades. One of the objectives of speech en- hancement is to provide high-quality speech communication in the presence of background noise and concurrent interference signals. In the process of speech communication, the clean speech sig- nal is inevitably corrupted by acoustic noise from the surrounding environment, transmission media, communication equipment, electrical noise, other speakers, and other sources of interference. These disturbances can significantly degrade the quality and intelligibility of the received speech signal. Therefore, it is of great interest to develop efficient speech enhancement techniques to recover the original speech from the noisy observation. In recent years, various techniques have been developed to tackle this problem, which can be classified into single channel and multi-channel enhancement approaches. Since single channel enhancement is easy to implement, it has been a significant field of research and various approaches have been developed. For example, spectral subtraction and Wiener filtering, are among the earliest single channel methods, which are based on estimation of the power spectrum of stationary noise. However, when the noise is non-stationary, or there exists music noise and ambient speech noise, the enhancement performance would degrade considerably. To overcome this disadvantage, this thesis focuses on single channel speech enhancement under adverse noise environment, especially the non-stationary noise environment. Recently, wavelet transform based methods have been widely used to reduce the undesired background noise. On the other hand, the Kalman filter (KF) methods offer competitive denoising results, especially in non-stationary environment. It has been used as a popular and powerful tool for speech enhancement during the past decades. In this regard, a single channel wavelet thresholding based Kalman filter (KF) algorithm is proposed for speech enhancement in this thesis. The wavelet packet (WP) transform is first applied to the noise corrupted speech on a frame-by-frame basis, which decomposes each frame into a number of subbands. A voice activity detector (VAD) is then designed to detect the voiced/unvoiced frames of the subband speech. Based on the VAD result, an adaptive thresholding scheme is applied to each subband speech followed by the WP based reconstruction to obtain the pre-enhanced speech. To achieve a further level of enhancement, an iterative Kalman filter (IKF) is used to process the pre-enhanced speech. The proposed adaptive thresholding iterative Kalman filtering (AT-IKF) method is evaluated and compared with some existing methods under various noise conditions in terms of segmental SNR and perceptual evaluation of speech quality (PESQ) as two well-known performance indexes. Firstly, we compare the proposed adaptive thresholding (AT) scheme with three other threshold- ing schemes: the non-linear universal thresholding (U-T), the non-linear wavelet packet transform thresholding (WPT-T) and the non-linear SURE thresholding (SURE-T). The experimental results show that the proposed AT scheme can significantly improve the segmental SNR and PESQ for all input SNRs compared with the other existing thresholding schemes. Secondly, extensive computer simulations are conducted to evaluate the proposed AT-IKF as opposed to the AT and the IKF as standalone speech enhancement methods. It is shown that the AT-IKF method still performs the best. Lastly, the proposed ATIKF method is compared with three representative and popular meth- ods: the improved spectral subtraction based speech enhancement algorithm (ISS), the improved Wiener filter based method (IWF) and the representative subband Kalman filter based algorithm (SIKF). Experimental results demonstrate the effectiveness of the proposed method as compared to some previous works both in terms of segmental SNR and PESQ
    • …
    corecore