74 research outputs found

    A frequency-based BSS technique for speech source separation.

    Ngan Lai Yin. Thesis (M.Phil.)--Chinese University of Hong Kong, 2003. Includes bibliographical references (leaves 95-100). Abstracts in English and Chinese.
    Chapter 1: Introduction
    1.1 Blind Signal Separation (BSS) Methods
    1.2 Objectives of the Thesis
    1.3 Thesis Outline
    Chapter 2: Blind Adaptive Frequency-Shift (BA-FRESH) Filter
    2.1 Cyclostationarity Properties
    2.2 Frequency-Shift (FRESH) Filter
    2.3 Blind Adaptive FRESH Filter
    2.4 Reduced-Rank BA-FRESH Filter
    2.4.1 CSP Method
    2.4.2 PCA Method
    2.4.3 Appropriate Choice of Rank
    2.5 Signal Extraction of Spectrally Overlapped Signals
    2.5.1 Simulation 1: A Fixed Rank
    2.5.2 Simulation 2: A Variable Rank
    2.6 Signal Separation of Speech Signals
    2.7 Chapter Summary
    Chapter 3: Reverberant Environment
    3.1 Small Room Acoustics Model
    3.2 Effects of Reverberation to Speech Recognition
    3.2.1 Short Impulse Response
    3.2.2 Small Room Impulse Response Modelled by Image Method
    3.3 Chapter Summary
    Chapter 4: Information Theoretic Approach for Signal Separation
    4.1 Independent Component Analysis (ICA)
    4.1.1 Kullback-Leibler (K-L) Divergence
    4.2 Information Maximization (Infomax)
    4.2.1 Stochastic Gradient Descent and Stability Problem
    4.2.2 Infomax and ICA
    4.2.3 Infomax and Maximum Likelihood
    4.3 Signal Separation by Infomax
    4.4 Chapter Summary
    Chapter 5: Blind Signal Separation (BSS) in Frequency Domain
    5.1 Convolutive Mixing System
    5.2 Infomax in Frequency Domain
    5.3 Adaptation Algorithms
    5.3.1 Standard Gradient Method
    5.3.2 Natural Gradient Method
    5.3.3 Convergence Performance
    5.4 Subband Adaptation
    5.5 Energy Weighting
    5.6 The Permutation Problem
    5.7 Performance Evaluation
    5.7.1 De-reverberation Performance Factor
    5.7.2 De-Noise Performance Factor
    5.7.3 Spectral Signal-to-noise Ratio (SNR)
    5.8 Chapter Summary
    Chapter 6: Simulation Results and Performance Analysis
    6.1 Small Room Acoustics Modelled by Image Method
    6.2 Signal Sources
    6.2.1 Cantonese Speech
    6.2.2 Noise
    6.3 De-Noise and De-Reverberation Performance Analysis
    6.3.1 Speech and White Noise
    6.3.2 Speech and Voice Babble Noise
    6.3.3 Two Female Speeches
    6.4 Recognition Accuracy Performance Analysis
    6.4.1 Speech and White Noise
    6.4.2 Speech and Voice Babble Noise
    6.4.3 Two Cantonese Speeches
    6.5 Chapter Summary
    Chapter 7: Conclusions and Suggestions for Future Research
    7.1 Conclusions
    7.2 Suggestions for Future Research
    Appendices
    A The Proof of Stability Conditions for Stochastic Gradient Descent Algorithm (Ref. (4.15))
    Bibliography

    Adaptive cancellation of localised environmental noise

    Noise cancellation systems are useful in applications such as speech and speaker recognition, where the effects of environmental noise have to be taken into consideration. This paper presents a robust method for cancelling localised noise in noisy speech signals using subband decomposition and adaptive filtering. The subband decomposition is based on low-complexity octave filters that split the noisy speech input into subsidiary bands. A thresholding technique is then applied to the subbands to determine the presence or absence of environmental noise. The result controls an adaptive filter that responds only to the noisy parts of the speech spectrum, so that adaptation is confined to those segments. The Normalised Least Mean Squares (NLMS) algorithm is used for the adaptation process. Compared with a similar system that does not use information about the presence or absence of noise in a specific part of the speech spectrum, the proposed system performs better in terms of computational cost and convergence rate. More than 35 dB of noise is eliminated in fewer iterations than with the conventional approach, which takes longer to reach steady state.
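As a rough illustration of the NLMS adaptation named in this abstract, the sketch below runs a full-band NLMS noise canceller in plain Python. It is not the paper's system: the separate noise reference, the 2-tap noise path `h`, the filter length, the step size, and the optional `adapt` gating flags (standing in for the paper's subband threshold decision) are all assumptions made for the example.

```python
import random

def nlms_cancel(reference, primary, taps=4, mu=0.5, eps=1e-8, adapt=None):
    """Full-band NLMS noise canceller (illustrative sketch).

    reference: noise reference samples (assumed to be available)
    primary:   noisy signal samples
    adapt:     optional list of booleans; weights update only where True
               (hypothetical stand-in for a noise-presence decision)
    Returns the error (cleaned) signal and the final filter weights.
    """
    w = [0.0] * taps
    buf = [0.0] * taps  # most recent reference samples, newest first
    cleaned = []
    for n, (x, d) in enumerate(zip(reference, primary)):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # estimate of the noise
        e = d - y                                   # cleaned output sample
        if adapt is None or adapt[n]:
            norm = eps + sum(b * b for b in buf)    # input power normalisation
            w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        cleaned.append(e)
    return cleaned, w

# Hypothetical test signal: white noise passed through a 2-tap path, no speech.
random.seed(0)
noise = [random.uniform(-1.0, 1.0) for _ in range(2000)]
h = [0.8, -0.3]
primary = [h[0] * noise[n] + (h[1] * noise[n - 1] if n > 0 else 0.0)
           for n in range(len(noise))]
cleaned, w = nlms_cancel(noise, primary)
```

With no speech present, the weights should settle close to the assumed path `h`, and the residual in `cleaned` should shrink towards zero. Passing an `adapt` list freezes the weights wherever the flag is false, which is one simple way to confine adaptation to segments judged noisy.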

    Acoustic Solutions for Door Station

    This thesis investigates how the audio quality in a door station can be improved by using multiple microphones and implementing beamforming. The concept of beamforming is explained, and two beamforming algorithms are implemented. These are tested with different microphone configurations in both simulated and real environments. Three existing single-microphone solutions are also tested. The performance of different microphone configurations is analysed, and the beamforming algorithms are compared to the single-microphone solutions. Finally, a solution for the application is proposed.
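The abstract does not say which two beamforming algorithms the thesis implements. As a generic illustration of the idea only, the sketch below shows delay-and-sum beamforming, a common baseline, using integer sample delays in plain Python. The function names, the 8 kHz sample rate, the 200 Hz tone, and the delays are all assumptions for the example.

```python
import math

def delay_and_sum(channels, delays):
    """Advance each channel by its integer sample delay, then average.

    channels: list of equal-length sample lists, one per microphone
    delays:   arrival delay of the target at each microphone, in samples
    """
    n = len(channels[0])
    out = [0.0] * n
    for ch, d in zip(channels, delays):
        for i in range(n):
            j = i + d  # read d samples ahead to undo the arrival delay
            if 0 <= j < n:
                out[i] += ch[j]
    return [v / len(channels) for v in out]

def delayed(sig, d):
    """Return sig delayed by d samples, zero-padded at the start."""
    return [0.0] * d + sig[:len(sig) - d]

# Hypothetical scene: one tone reaching three microphones 0, 2 and 4 samples late.
fs = 8000
src = [math.sin(2 * math.pi * 200 * t / fs) for t in range(400)]
mics = [delayed(src, d) for d in (0, 2, 4)]
aligned = delay_and_sum(mics, [0, 2, 4])
```

When the steering delays match the true arrival delays, the channels add coherently and `aligned` reproduces the source, apart from the last few samples where some channels have run out of data. Signals from other directions are misaligned by the same steering and partly cancel in the average.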

    On Perceptual Distortion Measures and Parametric Modeling


    Efficient Multiband Algorithms for Blind Source Separation

    The problem of blind separation refers to recovering original signals, called source signals, from mixed signals, called observation signals, in a reverberant environment. The mixture is a function of a sequence of original speech signals mixed in a reverberant room. The objective is to separate the mixed signals to obtain the original signals without degradation and without prior information about the features of the sources. The strategy used to achieve this objective is to use multiple bands that work at a lower rate, with less computational cost and quicker convergence than the conventional scheme. Our motivation is the competitive results of unequal-passbands scheme applications in terms of convergence speed. The objective of this research is to improve unequal-passbands schemes by increasing the speed of convergence and reducing the computational cost. The first proposed work is a novel maximally decimated unequal-passbands scheme. This scheme uses multiple bands, which allows it to work at a reduced sampling rate and low computational cost. An adaptation approach is derived with an adaptation step that improves the convergence speed. The performance of the proposed scheme was measured in different ways. First, the mean square errors of various bands are measured and the results are compared to a maximally decimated equal-passbands scheme, which is currently the best performing method. The results show that the proposed scheme has a faster convergence rate than the maximally decimated equal-passbands scheme. Second, when the scheme is tested for white and coloured inputs using a low number of bands, it does not yield good results; but when the number of bands is increased, the speed of convergence is enhanced. Third, the scheme is tested for quick changes. It is shown that the performance of the proposed scheme is similar to that of the equal-passbands scheme. Fourth, the scheme is also tested in a stationary state. The experimental results confirm the theoretical work. For more challenging scenarios, an unequal-passbands scheme with over-sampled decimation is proposed; the greater the number of bands, the more efficient the separation. The results are compared to the currently best performing method. Second, an experimental comparison is made between the proposed multiband scheme and the conventional scheme. The results show that the convergence speed and the signal-to-interference ratio of the proposed scheme are higher than those of the conventional scheme, and that its computational cost is lower.

    Inverse filtering and principal component analysis techniques for speech dereverberation

    In this work, we present a single-channel approach for early and late reverberation suppression. This approach can be decomposed into two stages. The first stage employs the inverse filter to increase the signal-to-reverberant energy ratio. The second stage uses the kernel PCA algorithm to enhance the resulting dereverberated signal by extracting the main non-linear features from the speech signal after inverse filtering. Our approach appears to be efficient mainly in far-field conditions and in highly reverberant environments.

    Multi-band acoustic echo canceller

    By Mingxi Fan. Thesis (S.B. and M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999. Includes bibliographical references (leaves 68-69).

    Multichannel Online Dereverberation based on Spectral Magnitude Inverse Filtering

    This paper addresses the problem of multichannel online dereverberation. The proposed method is carried out in the short-time Fourier transform (STFT) domain, for each frequency band independently. In the STFT domain, the time-domain room impulse response is approximately represented by the convolutive transfer function (CTF). The multichannel CTFs are adaptively identified based on the cross-relation method, using the recursive least squares criterion. Instead of the complex-valued CTF convolution model, we use a nonnegative convolution model between the STFT magnitude of the source signal and the CTF magnitude, which is only a coarse approximation of the former model but is shown to be more robust against CTF perturbations. Based on this nonnegative model, we propose an online STFT magnitude inverse filtering method. The inverse filters of the CTF magnitude are formulated based on the multiple-input/output inverse theorem (MINT), and adaptively estimated based on the gradient descent criterion. Finally, the inverse filtering is applied to the STFT magnitude of the microphone signals, obtaining an estimate of the STFT magnitude of the source signal. Experiments on both speech enhancement and automatic speech recognition demonstrate that the proposed method can effectively suppress reverberation, even in the difficult case of a moving speaker. Comment: Paper submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing. IEEE Signal Processing Letters, 201