EMD-based filtering (EMDF) of low-frequency noise for speech enhancement
An Empirical Mode Decomposition based filtering (EMDF) approach is presented as a post-processing stage for speech enhancement. This method is particularly effective in low-frequency noise environments. Unlike previous EMD-based denoising methods, this approach does not assume that the contaminating noise signal is fractional Gaussian noise. An adaptive method is developed to select the IMF index for separating the noise components from the speech based on the second-order IMF statistics. The low-frequency noise components are then separated by a partial reconstruction from the IMFs. It is shown that the proposed EMDF technique is able to suppress residual noise from speech signals that were enhanced by the conventional optimally modified log-spectral amplitude approach, which uses a minimum-statistics-based noise estimate. A comparative performance study is included that demonstrates the effectiveness of the EMDF system in various noise environments, such as car interior noise, military vehicle noise and babble noise. In particular, improvements of up to 10 dB are obtained in car noise environments. Listening tests were performed that confirm the results.
Offline and real time noise reduction in speech signals using the discrete wavelet packet decomposition
This thesis describes the development of an offline and real-time wavelet-based speech enhancement system to process speech corrupted with various amounts of white Gaussian noise and other noise types.
Techniques for enhancing digital images
Images obtained from either research studies or optical instruments are
often corrupted with noise. Image denoising involves the manipulation of image
data to produce a visually high quality image. This thesis reviews the existing
denoising algorithms and the filtering approaches available for enhancing images
and/or data transmission.
Spatial-domain and Transform-domain digital image filtering algorithms
have been used in the past to suppress different noise models. The different noise
models can be either additive or multiplicative. Selection of the denoising algorithm
is application dependent. It is necessary to have knowledge about the noise present
in the image so as to select the appropriate denoising algorithm. Noise models
may include Gaussian noise, Salt and Pepper noise, Speckle noise and Brownian
noise. The Wavelet Transform is similar to the Fourier transform with a completely
different merit function. The main difference between Wavelet transform and
Fourier transform is that, in the Wavelet Transform, Wavelets are localized in both
time and frequency. In the standard Fourier Transform, the sinusoidal basis functions are only localized
in frequency. Wavelet analysis consists of breaking up the signal into shifted and
scaled versions of the original (or mother) Wavelet. The Wiener Filter (mean
squared estimation error) finds implementations as a LMS filter (least mean
squares), RLS filter (recursive least squares), or Kalman filter.
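The time localization described above can be illustrated with a minimal sketch. It assumes the Haar wavelet and a single decomposition level, neither of which is specified by the thesis; the function names `haar_step` and `haar_inverse` are hypothetical. A short burst in the signal shows up only in the detail coefficient at the matching position.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = 1.0 / math.sqrt(2.0)
    # Pairwise scaled sums give the coarse approximation of the signal.
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    # Pairwise scaled differences give the local detail (wavelet) coefficients.
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out

# Constant signal with one local disturbance in samples 4 and 5.
signal = [4.0, 4.0, 4.0, 4.0, 9.0, 1.0, 4.0, 4.0]
approx, detail = haar_step(signal)
restored = haar_inverse(approx, detail)
```

Only `detail[2]` is non-zero here, which marks where in time the disturbance occurred. A Fourier coefficient of the same signal would spread that information over the whole record.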
Quantitative comparison of the denoising algorithms
is provided by calculating the Peak Signal to Noise Ratio (PSNR), the Mean Square
Error (MSE) and the Mean Absolute Error (MAE). A
combination of metrics including the PSNR, MSE, and MAE is often required to
clearly assess the model performance.
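The three metrics named above follow standard definitions and can be sketched as below. The sketch assumes 8-bit images (peak value 255) flattened into plain lists of pixel values; the sample data is made up for illustration.

```python
import math

def mse(ref, test):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)

def mae(ref, test):
    """Mean absolute error between two equal-length pixel sequences."""
    return sum(abs(r - t) for r, t in zip(ref, test)) / len(ref)

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit pixels."""
    err = mse(ref, test)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / err)

# Illustrative pixel values only, not data from the thesis.
reference = [52, 55, 61, 59]
denoised = [50, 55, 64, 59]
scores = (mse(reference, denoised), mae(reference, denoised), psnr(reference, denoised))
```

MSE penalizes large errors more heavily than MAE, and PSNR is a logarithmic rescaling of MSE, which is one reason the metrics are usually reported together.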
Multi-Modal Enhancement Techniques for Visibility Improvement of Digital Images
Image enhancement techniques for visibility improvement of 8-bit color digital images based on spatial domain, wavelet transform domain, and multiple image fusion approaches are investigated in this dissertation research.
In the category of spatial domain approach, two enhancement algorithms are developed to deal with problems associated with images captured from scenes with high dynamic ranges. The first technique is based on an illuminance-reflectance (I-R) model of the scene irradiance. The dynamic range compression of the input image is achieved by a nonlinear transformation of the estimated illuminance based on a windowed inverse sigmoid transfer function. A single-scale neighborhood-dependent contrast enhancement process is proposed to enhance the high frequency components of the illuminance, which compensates for the contrast degradation of the mid-tone frequency components caused by dynamic range compression. The intensity image obtained by integrating the enhanced illuminance and the extracted reflectance is then converted to an RGB color image through linear color restoration utilizing the color components of the original image. The second technique, named AINDANE, is a two-step approach comprising adaptive luminance enhancement and adaptive contrast enhancement. An image-dependent nonlinear transfer function is designed for dynamic range compression and a multiscale image-dependent neighborhood approach is developed for contrast enhancement. Real-time processing of video streams is realized with the I-R model based technique due to its high-speed processing capability, while AINDANE produces higher quality enhanced images due to its multi-scale contrast enhancement property. Both algorithms exhibit balanced luminance, contrast enhancement, higher robustness, and better color consistency when compared with conventional techniques.
In the transform domain approach, wavelet transform based image denoising and contrast enhancement algorithms are developed. The denoising is treated as a maximum a posteriori (MAP) estimation problem; a bivariate probability density function model is introduced to explore the interlevel dependency among the wavelet coefficients. In addition, an approximate solution to the MAP estimation problem is proposed to avoid the use of complex iterative computations to find a numerical solution. This relatively low-complexity image denoising algorithm, implemented with the dual-tree complex wavelet transform (DT-CWT), produces high-quality denoised images.
Discrete Wavelet Transform Based Cancelable Biometric System for Speaker Recognition
Protecting biometric template characteristics and privacy is a challenging issue. To address these limitations, cancelable biometric systems have been introduced. In this paper, an efficient cancelable biometric system based on a cryptosystem is introduced. It depends on permutation using a chaotic Baker map and substitution using masks in various transform domains. The feature extraction phase of the proposed cancelable system is based on cepstral analysis of the encrypted speech signal in the time domain combined with the encrypted speech signal in the discrete wavelet transform (DWT) domain. Then, the resultant features are applied to an artificial neural network for classification. Furthermore, wavelet denoising is used at the receiver side to enhance the proposed system. The cryptosystem provides a robust protection level of the speech template. This speech template can be replaced and recertified if it is breached. Our proposed system enables the generation of various templates from the same speech signal under the constraint of linkability between them. The simulation results confirmed that the proposed cancelable biometric system achieved a higher level of performance than traditional biometric systems, achieving a 97.5% recognition rate at a low signal-to-noise ratio (SNR) of -25 dB and 100% at -15 dB and above.
Speech Enhancement with Adaptive Thresholding and Kalman Filtering
Speech enhancement has been extensively studied for many years and various speech enhancement methods have been developed during the past decades. One of the objectives of speech enhancement is to provide high-quality speech communication in the presence of background noise and concurrent interference signals. In the process of speech communication, the clean speech signal is inevitably corrupted by acoustic noise from the surrounding environment, transmission media, communication equipment, electrical noise, other speakers, and other sources of interference. These disturbances can significantly degrade the quality and intelligibility of the received speech signal. Therefore, it is of great interest to develop efficient speech enhancement techniques to recover the original speech from the noisy observation. In recent years, various techniques have been developed to tackle this problem, which can be classified into single channel and multi-channel enhancement approaches. Since single channel enhancement is easy to implement, it has been a significant field of research and various approaches have been developed. For example, spectral subtraction and Wiener filtering are among the earliest single channel methods, which are based on estimation of the power spectrum of stationary noise. However, when the noise is non-stationary, or there exists music noise and ambient speech noise, the enhancement performance would degrade considerably. To overcome this disadvantage, this thesis focuses on single channel speech enhancement under adverse noise environments, especially the non-stationary noise environment.
Recently, wavelet transform based methods have been widely used to reduce the undesired background noise. On the other hand, Kalman filter (KF) methods offer competitive denoising results, especially in non-stationary environments, and have been a popular and powerful tool for speech enhancement during the past decades. In this regard, a single channel wavelet thresholding based Kalman filter algorithm is proposed for speech enhancement in this thesis. The wavelet packet (WP) transform is first applied to the noise-corrupted speech on a frame-by-frame basis, which decomposes each frame into a number of subbands. A voice activity detector (VAD) is then designed to detect the voiced/unvoiced frames of the subband speech. Based on the VAD result, an adaptive thresholding scheme is applied to each subband speech, followed by the WP based reconstruction to obtain the pre-enhanced speech. To achieve a further level of enhancement, an iterative Kalman filter (IKF) is used to process the pre-enhanced speech.
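The thresholding step applied to each subband can be sketched as follows. This is not the adaptive scheme proposed in the thesis, whose details are not given here. As a stand-in it uses the classical universal threshold with soft thresholding, and it assumes the noise level is estimated from the median absolute value of the coefficients. The function names and sample coefficients are hypothetical.

```python
import math

def mad_sigma(coeffs):
    """Robust noise estimate: median(|c|) / 0.6745 (assumed estimator)."""
    mags = sorted(abs(c) for c in coeffs)
    n = len(mags)
    mid = n // 2
    median = mags[mid] if n % 2 else 0.5 * (mags[mid - 1] + mags[mid])
    return median / 0.6745

def universal_threshold(coeffs):
    """Universal threshold sigma * sqrt(2 ln N) for N coefficients."""
    return mad_sigma(coeffs) * math.sqrt(2.0 * math.log(len(coeffs)))

def soft_threshold(coeffs, thr):
    """Shrink every coefficient toward zero by thr; small ones become zero."""
    return [math.copysign(max(abs(c) - thr, 0.0), c) for c in coeffs]

# Illustrative subband: mostly small noise-like values plus two large peaks.
subband = [0.1, -0.2, 0.15, 5.0, -0.05, 0.1, -4.0, 0.2]
thr = universal_threshold(subband)
cleaned = soft_threshold(subband, thr)
```

An adaptive scheme of the kind the abstract describes would presumably replace `universal_threshold` with a rule that depends on the VAD decision for each frame; that rule is left open in this sketch.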
The proposed adaptive thresholding iterative Kalman filtering (AT-IKF) method is evaluated and compared with some existing methods under various noise conditions in terms of segmental SNR and perceptual evaluation of speech quality (PESQ), two well-known performance indexes. Firstly, we compare the proposed adaptive thresholding (AT) scheme with three other thresholding schemes: the non-linear universal thresholding (U-T), the non-linear wavelet packet transform thresholding (WPT-T) and the non-linear SURE thresholding (SURE-T). The experimental results show that the proposed AT scheme can significantly improve the segmental SNR and PESQ for all input SNRs compared with the other existing thresholding schemes. Secondly, extensive computer simulations are conducted to evaluate the proposed AT-IKF as opposed to the AT and the IKF as standalone speech enhancement methods. It is shown that the AT-IKF method still performs the best. Lastly, the proposed AT-IKF method is compared with three representative and popular methods: the improved spectral subtraction based speech enhancement algorithm (ISS), the improved Wiener filter based method (IWF) and the representative subband Kalman filter based algorithm (SIKF). Experimental results demonstrate the effectiveness of the proposed method as compared to some previous works in terms of both segmental SNR and PESQ.
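Segmental SNR, one of the two indexes used above, averages frame-level SNR values in dB. The sketch below follows a common convention and is not taken from the thesis: the frame length, the skipping of silent frames, and the clipping range of -10 dB to 35 dB are all assumptions, and the sample signals are made up.

```python
import math

def segmental_snr(clean, enhanced, frame_len=4, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clipped to [lo, hi] (assumed limits)."""
    snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        sig = sum(x * x for x in c)
        err = sum((x - y) ** 2 for x, y in zip(c, e))
        if sig == 0.0:
            continue  # skip silent frames, where the SNR is undefined
        snr = hi if err == 0.0 else 10.0 * math.log10(sig / err)
        snrs.append(min(max(snr, lo), hi))
    if not snrs:
        raise ValueError("no non-silent frames to evaluate")
    return sum(snrs) / len(snrs)

# Toy signals: the first frame has a 10% amplitude error, the second is exact.
clean = [1.0, -1.0, 1.0, -1.0, 2.0, -2.0, 2.0, -2.0]
enhanced = [1.1, -1.1, 1.1, -1.1, 2.0, -2.0, 2.0, -2.0]
score = segmental_snr(clean, enhanced)
```

Real speech evaluations use much longer frames, typically tens of milliseconds of samples; `frame_len=4` is only there to keep the toy example readable.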
- …