168 research outputs found
Speech Enhancement with Adaptive Thresholding and Kalman Filtering
Speech enhancement has been extensively studied for many years, and various speech enhancement methods have been developed during the past decades. One of the objectives of speech enhancement is to provide high-quality speech communication in the presence of background noise and concurrent interference signals. In the process of speech communication, the clean speech signal is inevitably corrupted by acoustic noise from the surrounding environment, transmission media, communication equipment, electrical noise, other speakers, and other sources of interference. These disturbances can significantly degrade the quality and intelligibility of the received speech signal. Therefore, it is of great interest to develop efficient speech enhancement techniques to recover the original speech from the noisy observation. In recent years, various techniques have been developed to tackle this problem, which can be classified into single channel and multi-channel enhancement approaches. Since single channel enhancement is easy to implement, it has been a significant field of research and various approaches have been developed. For example, spectral subtraction and Wiener filtering are among the earliest single channel methods, which are based on estimation of the power spectrum of stationary noise. However, when the noise is non-stationary, or when music noise and ambient speech noise are present, the enhancement performance degrades considerably. To overcome this disadvantage, this thesis focuses on single channel speech enhancement in adverse noise environments, especially non-stationary noise environments.
Recently, wavelet transform based methods have been widely used to reduce undesired background noise. On the other hand, Kalman filter (KF) methods offer competitive denoising results, especially in non-stationary environments, and the KF has been a popular and powerful tool for speech enhancement during the past decades. In this regard, a single channel wavelet thresholding based Kalman filter algorithm is proposed for speech enhancement in this thesis. The wavelet packet (WP) transform is first applied to the noise corrupted speech on a frame-by-frame basis, which decomposes each frame into a number of subbands. A voice activity detector (VAD) is then designed to detect the voiced/unvoiced frames of the subband speech. Based on the VAD result, an adaptive thresholding scheme is applied to each subband speech, followed by the WP based reconstruction to obtain the pre-enhanced speech. To achieve a further level of enhancement, an iterative Kalman filter (IKF) is used to process the pre-enhanced speech.
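The decompose, threshold, and reconstruct pattern described above can be illustrated with a minimal sketch. It is not the thesis's method: it assumes a single-level Haar transform in place of the wavelet packet transform, and the classical universal threshold with soft thresholding in place of the VAD-driven adaptive scheme. All function names are hypothetical.

```python
import math

def haar_forward(x):
    """One-level orthonormal Haar transform of an even-length sequence."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward (perfect reconstruction up to rounding)."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def universal_threshold(detail, n):
    """sigma * sqrt(2 ln n), with sigma estimated from the median
    absolute detail coefficient (assumed stand-in for an adaptive rule)."""
    mags = sorted(abs(d) for d in detail)
    m = len(mags)
    if m % 2:
        median = mags[m // 2]
    else:
        median = 0.5 * (mags[m // 2 - 1] + mags[m // 2])
    sigma = median / 0.6745
    return sigma * math.sqrt(2.0 * math.log(n))

def denoise_frame(frame):
    """Decompose one frame, threshold the detail band, reconstruct."""
    approx, detail = haar_forward(frame)
    t = universal_threshold(detail, len(frame))
    return haar_inverse(approx, soft_threshold(detail, t))
```

In a frame-by-frame system, `denoise_frame` would be called on each frame in turn. A wavelet packet version would recurse on both the approximation and detail bands and could pick a different threshold per subband.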
The proposed adaptive thresholding iterative Kalman filtering (AT-IKF) method is evaluated and compared with some existing methods under various noise conditions in terms of segmental SNR and perceptual evaluation of speech quality (PESQ) as two well-known performance indexes. Firstly, we compare the proposed adaptive thresholding (AT) scheme with three other thresholding schemes: the non-linear universal thresholding (U-T), the non-linear wavelet packet transform thresholding (WPT-T) and the non-linear SURE thresholding (SURE-T). The experimental results show that the proposed AT scheme can significantly improve the segmental SNR and PESQ for all input SNRs compared with the other existing thresholding schemes. Secondly, extensive computer simulations are conducted to evaluate the proposed AT-IKF against the AT and the IKF as standalone speech enhancement methods. It is shown that the AT-IKF method still performs the best. Lastly, the proposed AT-IKF method is compared with three representative and popular methods: the improved spectral subtraction based speech enhancement algorithm (ISS), the improved Wiener filter based method (IWF) and the representative subband Kalman filter based algorithm (SIKF). Experimental results demonstrate the effectiveness of the proposed method as compared to some previous works, both in terms of segmental SNR and PESQ.
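Segmental SNR, one of the two measures named above, is commonly computed as the average of per-frame SNR values in dB. The sketch below follows that common definition; the frame length and the clamping range of -10 dB to 35 dB are conventional choices assumed here, not values taken from the thesis.

```python
import math

def segmental_snr(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Mean per-frame SNR in dB between a clean reference and an
    enhanced signal, with each frame's SNR clamped to [lo, hi]."""
    n_frames = min(len(clean), len(enhanced)) // frame_len
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")
    eps = 1e-12  # avoids log(0) and division by zero
    total = 0.0
    for k in range(n_frames):
        seg = slice(k * frame_len, (k + 1) * frame_len)
        sig = sum(c * c for c in clean[seg])
        err = sum((c - e) ** 2 for c, e in zip(clean[seg], enhanced[seg]))
        snr_db = 10.0 * math.log10((sig + eps) / (err + eps))
        total += min(max(snr_db, lo), hi)
    return total / n_frames
```

Clamping keeps silent frames and near-perfect frames from dominating the average. PESQ, by contrast, is defined by an ITU-T recommendation and is not reproduced here.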
Single Channel Speech Enhancement using Kalman Filter
The quality and intelligibility of speech conversation are generally degraded by the
surrounding noises. The main objective of speech enhancement (SE) is to eliminate
or reduce such disturbing noises from the degraded speech. Various SE methods have
been proposed in literature. Among them, the Kalman filter (KF) is known to be an
efficient SE method that uses the minimum mean square error (MMSE). However,
most of the conventional KF based speech enhancement methods need access to clean
speech and additive noise information for the state-space model parameters, namely,
the linear prediction coefficients (LPCs) and the additive noise variance estimation,
which is impractical in the sense that in practice, we can access only the noisy speech.
Moreover, it is quite difficult to estimate these model parameters efficiently in the
presence of adverse environmental noises. Therefore, the main focus of this thesis is to
develop single channel speech enhancement algorithms using Kalman filter, where the
model parameters are estimated in noisy conditions. Depending on these parameter
estimation techniques, the proposed SE methods are classified into three approaches
based on non-iterative, iterative, and sub-band iterative KF.
In the first approach, a non-iterative Kalman filter based speech enhancement
algorithm is presented, which operates on a frame-by-frame basis. In this proposed
method, the state-space model parameters, namely, the LPCs and noise variance, are
estimated first in noisy conditions. For LPC estimation, a combined speech smoothing
and autocorrelation method is employed. A new method based on a lower-order
truncated Taylor series approximation of the noisy speech along with a difference
operation serving as high-pass filtering is introduced for the noise variance estimation.
The non-iterative Kalman filter is then implemented with these estimated parameters
effectively.
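As a rough illustration of the autocorrelation method of LPC estimation mentioned above, the sketch below solves the Yule-Walker equations with the Levinson-Durbin recursion. The speech smoothing step and the Taylor-series noise variance estimator are not shown, and the function names are hypothetical.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame (unnormalized sums)."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return (a, err) where x[n] is predicted as sum(a[k-1] * x[n-k])
    for k = 1..order, and err is the final prediction error energy."""
    a = []
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            raise ValueError("autocorrelation is not positive definite")
        # Reflection coefficient for model order i.
        acc = r[i] - sum(a[j] * r[i - 1 - j] for j in range(i - 1))
        k = acc / err
        # Update the lower-order coefficients, then append the new one.
        a = [a[j] - k * a[i - 2 - j] for j in range(i - 1)] + [k]
        err *= (1.0 - k * k)
    return a, err
```

For example, `levinson_durbin(autocorr(frame, 10), 10)` would give tenth-order LPCs for one frame. When the frame is noisy, these coefficients are biased, which is why the abstract pairs the estimate with smoothing.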
In order to enhance the SE performance as well as parameter estimation accuracy
in noisy conditions, an iterative Kalman filter based single channel SE method is
proposed as the second approach, which also operates on a frame-by-frame basis.
For each frame, the state-space model parameters of the KF are estimated through
an iterative procedure. The Kalman filtering iteration is first applied to each noisy
speech frame, reducing the noise component to a certain degree. At the end of this
first iteration, the LPCs and other state-space model parameters are re-estimated
using the processed speech frame and the Kalman filtering is repeated for the same
processed frame. This iteration continues until the KF converges or a maximum number
of iterations is reached, giving a further enhanced speech frame. The same procedure
is repeated for the following frames until the last noisy speech frame has been processed.
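The estimate, filter, and re-estimate loop described above can be sketched for a single frame as follows. This is a simplified illustration under strong assumptions: a first-order autoregressive speech model instead of a full LPC state-space model, a known and fixed noise variance, and convergence judged by the change in the AR coefficient. None of these details come from the thesis, and the function names are hypothetical.

```python
def estimate_ar1(frame):
    """Estimate AR(1) coefficient a and driving-noise variance q
    from the lag-0 and lag-1 autocorrelation of one frame."""
    n = len(frame)
    r0 = sum(v * v for v in frame) / n
    r1 = sum(frame[i] * frame[i - 1] for i in range(1, n)) / n
    if r0 <= 0.0:
        return 0.0, 0.0
    a = r1 / r0
    q = max(r0 - a * r1, 0.0)
    return a, q

def kalman_ar1(obs, a, q, r):
    """Scalar Kalman filter for x[k] = a*x[k-1] + w, y[k] = x[k] + v,
    with var(w) = q and var(v) = r. Returns the filtered estimates."""
    x, p = 0.0, q
    out = []
    for y in obs:
        x_pred = a * x
        p_pred = a * a * p + q
        denom = p_pred + r
        gain = p_pred / denom if denom > 0.0 else 0.0
        x = x_pred + gain * (y - x_pred)
        p = (1.0 - gain) * p_pred
        out.append(x)
    return out

def iterative_kalman(noisy, noise_var, max_iter=5, tol=1e-3):
    """Re-estimate the model from the processed frame and filter again,
    until the AR coefficient settles or max_iter is reached."""
    frame = list(noisy)
    prev_a = None
    for _ in range(max_iter):
        a, q = estimate_ar1(frame)
        frame = kalman_ar1(frame, a, q, noise_var)
        if prev_a is not None and abs(a - prev_a) < tol:
            break
        prev_a = a
    return frame
```

A speech-grade version would use a higher-order state vector built from the LPCs and would also re-estimate the noise variance.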
For further improving the speech enhancement performance, a sub-band iterative
Kalman filter based SE method is also proposed as the third approach. A wavelet
filter-bank is first used to decompose the noisy speech into a number of sub-bands.
To achieve the best trade-off among the noise reduction, speech intelligibility and
computational complexity, a partial reconstruction scheme based on consecutive mean
squared error (CMSE) is proposed to synthesize the low-frequency (LF) and
high-frequency (HF) sub-bands such that the iterative KF is applied only to the partially
reconstructed HF sub-band speech. Finally, the enhanced HF sub-band speech is
combined with the partially reconstructed LF sub-band speech to reconstruct the
full-band enhanced speech.
Experimental results have shown that the proposed KF based SE methods are
capable of reducing adverse environmental noises for a wide range of input SNRs,
and the overall performance of the proposed methods in terms of different evaluation
metrics is superior to some existing state-of-the-art SE methods.
Offline and real time noise reduction in speech signals using the discrete wavelet packet decomposition
This thesis describes the development of an offline and real-time wavelet-based speech enhancement system to process speech corrupted with various amounts of white Gaussian noise and other noise types.
Hybrid kalman filtering algorithm with wavelet packet data processing for linear dynamical systems
The paper develops a hybrid algorithm for predicting a linear dynamic system that combines an adaptive Kalman filter with preprocessing, by wavelet packet analysis, of the initial data describing the history of the system under study.
Being based on Fourier analysis, wavelet analysis and wavelet packet analysis are well suited to time-frequency analysis of a signal, but they cannot be performed recursively and in real time and therefore cannot be used for dynamic analysis of random processes. Combining them with the Kalman filter joins the multiresolution properties of the wavelet transform with the recursive, real-time formulas of the Kalman filter.
Since the original signal is usually given as discrete measurements, implementing the convolutions used in the Kalman filter requires cyclic convolutions with periodic continuation of the signal over any time interval. If the original signal takes different values at the two ends of the considered time interval [0,T], the periodized signal can have large values and sharp jumps in amplitude at the ends of the periodization interval.
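The boundary effect described above can be seen in a small sketch of cyclic convolution, where indices wrap around so that the signal is treated as periodic. The function names are hypothetical and the kernel is arbitrary.

```python
def circular_convolve(x, h):
    """Cyclic convolution of x with kernel h: x is treated as one
    period of a periodic signal, so indices wrap modulo len(x)."""
    n = len(x)
    return [sum(h[k] * x[(i - k) % n] for k in range(len(h)))
            for i in range(n)]

def boundary_jump(x):
    """Size of the discontinuity created by periodizing x."""
    return abs(x[0] - x[-1])
```

For a ramp such as `[1, 2, 3, 4]`, `boundary_jump` returns 3, and the first output sample of `circular_convolve` mixes the first and last input samples. That wrap-around discontinuity is what boundary smoothing is meant to reduce.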
To smooth out the values of the periodized signal at the ends of the periodization interval, a cascade decomposition and reconstruction algorithm was applied using Daubechies boundary wavelets with a finite number of moments. Signal reconstruction is performed in a number of operations comparable to the length of the time interval under consideration.
The smoothed signal obtained in this way is used as the basis on which the Kalman filter predicts the dynamic system under study.
Taking into account that the correlation functions of the noise in the observation equation and the phase state of the system are usually unknown, the adaptation of the Kalman filter to these noises (interference) is carried out on the basis of a zeroing sequence. The manuscript does not contain related dat
Reduce The Noise in Speech Signals Using Wavelet Filtering
The capacity of data channels is often reduced by noise and distortion of the transmitted signals. Noise reduction is used in various areas where the transmitted signals cannot be isolated from noise and distortion: speech and speaker recognition, image processing, mobile communication systems, medical signal processing, radio and radar systems, etc. This paper illustrates the problem of noise in speech signals. A model of additive white Gaussian noise is considered and added to the speech signal to model the noise process. The main features of the wavelets used in noise reduction are described, and the main algorithm of the noise cancellation process using wavelet analysis techniques is considered. A practical implementation of noise reduction is carried out, and graphs of the original, noisy, and cleaned signals are plotted. The results of noise cancellation are analyzed using different families of wavelets, and graphs of the cross-correlation of the noisy and clean speech signals are plotted. Noise reduction is carried out using Matlab.
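The additive white Gaussian noise model mentioned above can be sketched as follows. The paper reports using Matlab; this illustration uses Python only, and the function name, the seed parameter, and the choice to specify the noise level as a target SNR in dB are all assumptions.

```python
import math
import random

def add_awgn(signal, snr_db, seed=0):
    """Return signal plus white Gaussian noise whose variance is set
    so that the expected signal-to-noise ratio equals snr_db."""
    if not signal:
        raise ValueError("signal must not be empty")
    rng = random.Random(seed)  # seeded for repeatable experiments
    power = sum(s * s for s in signal) / len(signal)
    noise_var = power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_var)
    return [s + rng.gauss(0.0, sigma) for s in signal]
```

The realized SNR of any single noisy signal fluctuates slightly around the target, with the fluctuation shrinking as the signal gets longer.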
Time-frequency analysis based on split spectrum applied to audio and ultrasonic signals
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Signal processing is a large subject with applications integral to a number of technological fields such as communication, audio, Voice over IP (VoIP), pattern recognition, sonar, radar, ultrasound and medical imaging. Techniques exist for the analysis, modelling, extraction, recognition and synthesis of signals of interest. The focus of this thesis is signal processing for acoustics (both sonic and ultrasonic). In the applications examined, signals of interest are usually incomplete, distorted and/or noisy. Therefore, reconstructing the signal, noise reduction and removal of any distortion/interference are the main goals of the signal processing techniques presented. The primary aim is to study and develop an advanced time-frequency signal processing technique for acoustic applications to enhance the quality of the signals. In the first part of the thesis, a technique is presented that models and maintains the correlation between temporal and spectral parameters of audio signals. A novel Packet Loss Concealment (PLC) method is developed with applications to VoIP, audio broadcasting, and streaming. The problem of modelling the time-varying frequency spectrum in the context of PLC is addressed, and a novel solution is proposed for tracking and using the temporal motion of spectral flow to reconstruct the signal. The proposed method utilises a Time-Frequency Motion (TFM) matrix representation of the audio signal, where each frequency is tagged with a motion vector estimate that is assessed by cross-correlation of the movement of spectral energy within sub-bands across time frames. The missing packets are estimated using extrapolation or interpolation algorithms using a TFM matrix and then inverse transformed to the time-domain for reconstruction of the signal.
The proposed method is compared with conventional approaches using objective Perceptual Evaluation of Speech Quality (PESQ) and subjective Mean Opinion Scores (MOS) over a range of packet loss from 5% to 20%. The evaluation results demonstrate that the proposed algorithm substantially improves performance by an average of 2.85% and 5.9% in terms of PESQ and MOS respectively. In the second part of the thesis, the proposed method is extended and modified to address challenges of excessive coherent noise arising from ultrasonic signals gathered during Guided Wave Testing (GWT). GWT is an advanced non-destructive testing technique which is used across several branches of industry to inspect large structures for defects where the structural integrity is of concern. In such systems, signal interpretation can often be challenging due to the multi-modal and dispersive propagation of Ultrasonic Guided Waves (UGWs). The multi-modal and dispersive nature of the received signals hampers the ability to detect defects in a given structure. The Split-Spectrum Processing (SSP) method with application to such signals has been studied and reviewed quantitatively to measure the enhancement in terms of Signal-to-Noise Ratio (SNR) and spatial resolution. In this thesis, the influence of SSP filter bank parameters on these signals is studied and optimised to improve SNR and spatial resolution considerably. The proposed method is compared analytically and experimentally with conventional approaches. The proposed SSP algorithm substantially improves SNR by an average of 30 dB. The conclusions reached in this thesis will contribute to the progression of the GWT technique through considerable improvement in defect detection capability.
Centre for Electronic Systems Research (CESR) of Brunel University London, The National Structural Integrity Research Centre (NSIRC) and TWI Ltd
Wavelet Theory
The wavelet is a powerful mathematical tool that plays an important role in science and technology. This book looks at some of the most creative and popular applications of wavelets including biomedical signal processing, image processing, communication signal processing, Internet of Things (IoT), acoustical signal processing, financial market data analysis, energy and power management, and COVID-19 pandemic measurements and calculations. The editor’s personal interest is in applying the wavelet transform to identify time-domain changes in signals and the corresponding frequency components, and to improve power amplifier behavior.
Sensor Signal and Information Processing II
In the current age of information explosion, newly invented technological sensors and software are now tightly integrated with our everyday lives. Many sensor processing algorithms have incorporated some forms of computational intelligence as part of their core framework in problem solving. These algorithms have the capacity to generalize and discover knowledge for themselves and learn new information whenever unseen data are captured. The primary aim of sensor processing is to develop techniques to interpret, understand, and act on information contained in the data. The interest of this book is in developing intelligent signal processing in order to pave the way for smart sensors. This involves mathematical advancement of nonlinear signal processing theory and its applications that extend far beyond traditional techniques. It bridges the boundary between theory and application, developing novel theoretically inspired methodologies targeting both longstanding and emergent signal processing applications. The topic ranges from phishing detection to integration of terrestrial laser scanning, and from fault diagnosis to bio-inspiring filtering. The book will appeal to established practitioners, along with researchers and students in the emerging field of smart sensors processing.
Motion Artifacts Correction from Single-Channel EEG and fNIRS Signals using Novel Wavelet Packet Decomposition in Combination with Canonical Correlation Analysis
The electroencephalogram (EEG) and functional near-infrared spectroscopy
(fNIRS) signals, highly non-stationary in nature, suffer greatly from motion
artifacts when recorded using wearable sensors. This paper proposes two robust
methods: i) wavelet packet decomposition (WPD), and ii) WPD in combination with
canonical correlation analysis (WPD-CCA), for motion artifact correction from
single-channel EEG and fNIRS signals. The efficacy of these proposed techniques
is tested using a benchmark dataset, and the performance of the proposed methods
is measured using two well-established performance metrics: i) difference in
the signal-to-noise ratio (ΔSNR) and ii) percentage reduction in motion
artifacts (η). The proposed WPD-based single-stage motion artifact
correction technique produces the highest average ΔSNR (29.44 dB) when the
db2 wavelet packet is incorporated, whereas the greatest average η (53.48%)
is obtained using the db1 wavelet packet, for all the available 23 EEG recordings.
Our proposed two-stage motion artifact correction technique, i.e. the WPD-CCA
method utilizing the db1 wavelet packet, has shown the best denoising performance,
producing average ΔSNR and η values of 30.76 dB and 59.51%,
respectively, for all the EEG recordings. On the other hand, for the fNIRS
signals, the two-stage motion artifact removal technique, i.e. WPD-CCA, has
produced the best average ΔSNR (16.55 dB, utilizing the db1 wavelet packet)
and the largest average η (41.40%, using the fk8 wavelet packet). The highest
average ΔSNR and η using the single-stage artifact removal technique (WPD) are
found to be 16.11 dB and 26.40%, respectively, for all the fNIRS signals using
the fk4 wavelet packet. In both EEG and fNIRS modalities, the percentage
reduction in motion artifacts increases by 11.28% and 56.82%, respectively,
when the two-stage WPD-CCA technique is employed.
Comment: 25 pages, 10 figures and 2 tables