    Tensor decompositions for learning latent variable models

    This work considers a computationally and statistically efficient parameter estimation method for a wide class of latent variable models, including Gaussian mixture models, hidden Markov models, and latent Dirichlet allocation, which exploits a certain tensor structure in their low-order observable moments (typically of second and third order). Specifically, parameter estimation is reduced to the problem of extracting a certain (orthogonal) decomposition of a symmetric tensor derived from the moments; this decomposition can be viewed as a natural generalization of the singular value decomposition for matrices. Although tensor decompositions are generally intractable to compute, the decomposition of these specially structured tensors can be efficiently obtained by a variety of approaches, including power iterations and maximization approaches (similar to the case of matrices). A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin's perturbation theorem for the singular vectors of matrices. This implies a robust and computationally tractable estimation approach for several popular latent variable models.
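
    The power-iteration idea mentioned in this abstract can be sketched as follows. This is a minimal illustration on a noise-free, orthogonally decomposable symmetric third-order tensor, not the robust variant analysed in the paper; the function names (`contract`, `power_iteration`, `deflate`, `decompose`, `build_tensor`) and the toy tensor are assumptions made for the example.

```python
import math
import random

def contract(T, v):
    """Return the vector T(I, v, v) for a symmetric n x n x n tensor T."""
    n = len(v)
    return [sum(T[i][j][k] * v[j] * v[k] for j in range(n) for k in range(n))
            for i in range(n)]

def normalize(v):
    """Scale v to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def power_iteration(T, rng, n_iter=50):
    """Iterate v <- T(I, v, v) / ||T(I, v, v)|| from a random unit vector."""
    v = normalize([rng.gauss(0.0, 1.0) for _ in range(len(T))])
    for _ in range(n_iter):
        v = normalize(contract(T, v))
    lam = sum(a * b for a, b in zip(v, contract(T, v)))  # eigenvalue T(v, v, v)
    return lam, v

def deflate(T, lam, v):
    """Subtract the rank-one term lam * v (x) v (x) v from T."""
    n = len(v)
    return [[[T[i][j][k] - lam * v[i] * v[j] * v[k] for k in range(n)]
             for j in range(n)] for i in range(n)]

def decompose(T, rank, seed=0):
    """Extract `rank` (eigenvalue, eigenvector) pairs by iterate-and-deflate."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(rank):
        lam, v = power_iteration(T, rng)
        pairs.append((lam, v))
        T = deflate(T, lam, v)
    return pairs

def build_tensor(weights, vectors):
    """Build sum_r w_r * u_r (x) u_r (x) u_r from weights and vectors."""
    n = len(vectors[0])
    return [[[sum(w * u[i] * u[j] * u[k] for w, u in zip(weights, vectors))
              for k in range(n)] for j in range(n)] for i in range(n)]

# Toy example (hypothetical data): two orthonormal components in 2-D.
T = build_tensor([2.0, 1.0], [[0.6, 0.8], [-0.8, 0.6]])
pairs = sorted(decompose(T, rank=2))
```

    Which component is found first depends on the random starting vector, so the sketch sorts the recovered pairs by eigenvalue before using them.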

    Enhanced IVA for audio separation in highly reverberant environments

    Blind Audio Source Separation (BASS), inspired by the "cocktail-party problem", has been a leading research application for blind source separation (BSS). This thesis concerns the enhancement of frequency domain convolutive blind source separation (FDCBSS) techniques for audio separation in highly reverberant room environments. Independent component analysis (ICA) is a higher order statistics (HOS) approach commonly used in the BSS framework. When applied to audio FDCBSS, ICA based methods suffer from the permutation problem across the frequency bins of each source. Independent vector analysis (IVA) is an FD-BSS algorithm that theoretically solves the permutation problem by using a multivariate source prior, where the sources are considered to be random vectors. The algorithm allows independence between multivariate source signals, and retains dependency between the source signals within each source vector. The source prior adopted to model the nonlinear dependency structure within the source vectors is crucial to the separation performance of the IVA algorithm. The focus of this thesis is on improving the separation performance of the IVA algorithm in the application of BASS. An alternative multivariate Student's t distribution is proposed as the source prior for the batch IVA algorithm. A Student's t probability density function can better model certain frequency domain speech signals due to its tail dependency property. Then, the nonlinear score function, for the IVA, is derived from the proposed source prior. A novel energy driven mixed super Gaussian and Student's t source prior is proposed for the IVA and FastIVA algorithms. The Student's t distribution, in the mixed source prior, can model the high amplitude data points whereas the super Gaussian distribution can model the lower amplitude information in the speech signals. 
The ratio of the two distributions can be adjusted according to the energy of the observed mixtures to adapt to different types of speech signals. A particular multivariate generalized Gaussian distribution is adopted as the source prior for the online IVA algorithm. The nonlinear score function derived from this proposed source prior contains fourth order relationships between different frequency bins, which provides a more informative and stronger dependency structure and thereby improves the separation performance. An adaptive learning scheme is developed to improve the performance of the online IVA algorithm. The scheme adjusts the learning rate as a function of proximity to the target solutions. The scheme is also accompanied by a novel switched source prior technique that takes the best performance properties of the super Gaussian source prior and the generalized Gaussian source prior as the algorithm converges. The methods and techniques proposed in this thesis are evaluated with real speech source signals in different simulated and real reverberant acoustic environments. A variety of measures are used within the evaluation criteria of the various algorithms. The experimental results demonstrate improved performance of the proposed methods and their robustness in a wide range of situations.

    Efficient Blind Source Separation Algorithms with Applications in Speech and Biomedical Signal Processing

    Blind source separation/extraction (BSS/BSE) is a powerful signal processing method that has been applied extensively in many fields, such as biomedical sciences and speech signal processing, to extract a set of unknown input sources from a set of observations. Various BSS algorithms have been proposed in the literature, but they need further investigation with respect to the extraction approach, computational complexity, convergence speed, type of domain (time or frequency), mixture properties, and extraction performance. This work presents three new BSS/BSE algorithms based on computing new transformation matrices used to extract the unknown signals. The types of signals considered in this dissertation are speech, Gaussian, and ECG signals. The first algorithm, named the BSE-parallel linear predictor filter (BSE-PLP), computes a transformation matrix from the covariance matrix of the whitened data. The matrix is then used as an input to linear predictor filters whose coefficients are the unknown sources. The algorithm converges very quickly, in two iterations. Simulation results, using speech, Gaussian, and ECG signals, show that the model is capable of extracting the unknown source signals and removing noise when the input signal to noise ratio is varied from -20 dB to 80 dB. The second algorithm, named the BSE-idempotent transformation matrix (BSE-ITM), computes its transformation matrix in iterative form, with less computational complexity. The proposed method is tested using speech, Gaussian, and ECG signals. Simulation results show that the proposed algorithm separates the source signals with better performance measures than the other approaches used in the dissertation. The third algorithm, named the null space idempotent transformation matrix (NSITM), has been designed using the principle of the null space of the ITM to separate the unknown sources.
Simulation results show that the method successfully separates speech, Gaussian, and ECG signals from their mixture. The algorithm has also been used to estimate the average FECG heart rate. Results indicated considerable improvement in estimating the peaks over the other algorithms used in this work.

    Enhancing brain-computer interfacing through advanced independent component analysis techniques

    A brain-computer interface (BCI) is a direct communication system between a brain and an external device in which messages or commands sent by an individual do not pass through the brain’s normal output pathways but are detected through brain signals. Some severe motor impairments, such as Amyotrophic Lateral Sclerosis, head trauma, spinal injuries and other diseases, may cause patients to lose their muscle control and become unable to communicate with the outside environment. No effective cure or treatment has yet been found for these diseases. Therefore, using a BCI system to rebuild the communication pathway becomes a possible alternative solution. Among different types of BCIs, an electroencephalogram (EEG) based BCI is becoming a popular system due to EEG’s fine temporal resolution, ease of use, portability and low set-up cost. However, EEG’s susceptibility to noise is a major obstacle to developing a robust BCI. Signal processing techniques such as coherent averaging, filtering, FFT and AR modelling are used to reduce the noise and extract components of interest. However, these methods process the data in the observed mixture domain, which mixes components of interest and noise. This limitation means that the extracted EEG signals may still contain residual noise or, conversely, that the removed noise may also contain part of the EEG signals. Independent Component Analysis (ICA), a Blind Source Separation (BSS) technique, is able to extract relevant information within noisy signals and separate the fundamental sources into independent components (ICs). The most common assumption of the ICA method is that the source signals are unknown and statistically independent. Through this assumption, ICA is able to recover the source signals.
Since the ICA concepts appeared in the fields of neural networks and signal processing in the 1980s, many ICA applications in telecommunications, biomedical data analysis, feature extraction, speech separation, time-series analysis and data mining have been reported in the literature. In this thesis several ICA techniques are proposed to address two major issues for BCI applications: reducing the recording time needed in order to speed up the signal processing, and reducing the number of recording channels whilst improving the final classification performance or at least keeping it the same as the current performance. These will make BCI a more practical prospect for everyday use. This thesis first defines BCI and the diverse BCI models based on different control patterns. After the general idea of ICA is introduced along with some modifications to ICA, several new ICA approaches are proposed. The practical work in this thesis starts with preliminary analyses of the Southampton BCI pilot datasets, using basic and then advanced signal processing techniques. The proposed ICA techniques are then presented using a multi-channel event related potential (ERP) based BCI. Next, the ICA algorithm is applied to a multi-channel spontaneous activity based BCI. The final ICA approach aims to examine the possibility of using ICA based on just one or a few channel recordings on an ERP based BCI. The novel ICA approaches for BCI systems presented in this thesis show that ICA is able to accurately and repeatedly extract the relevant information buried within noisy signals, and that the signal quality is enhanced so that even a simple classifier can achieve good classification accuracy. In the ERP based BCI application, after multi-channel ICA the data using just eight averages/epochs can achieve 83.9% classification accuracy, whilst the data processed by coherent averaging can reach only 32.3% accuracy.
In the spontaneous activity based BCI, the use of the multi-channel ICA algorithm can effectively extract discriminatory information from two types of single-trial EEG data. The classification accuracy is improved by about 25%, on average, compared to the performance on the unpreprocessed data. The single channel ICA technique on the ERP based BCI produces much better results than the lowpass filter, while an appropriate number of averages improves the signal-to-noise ratio of P300 activities, which helps to achieve better classification. These advantages will lead to a reliable and practical BCI for use outside of the clinical laboratory.

    Detection and removal of eyeblink artifacts from EEG using wavelet analysis and independent component analysis

    Electrical signals generated by brain activity that are measured by the electroencephalogram can be distorted by electrical activity originating from eyeblinks and eye movements. This thesis proposes a new technique to identify and remove eyeblink artifacts from EEG data. An algorithm using a combination of wavelet analysis and independent component analysis (ICA) is implemented to detect the temporal location of the eyeblink artifact and eliminate it without compromising the integrity of the primary EEG data. The discrete wavelet transform is performed on 10-second epochs of data to detect the occurrence of ocular artifact. ICA is used to separate out the independent components within the data, and the temporal locations of the eyeblinks are used to remove the artifact and reconstruct the EEG data without that source of distortion. The results obtained indicate that the technique implemented may be robust enough to effectively process EEG data and is capable of removing eyeblink artifacts successfully when they are prominent and the data does not contain a great deal of movement artifact. The results show an 88.68% detection rate, a false positive rate of 4.03%, and an 87.23% removal rate for all eyeblinks that were accurately detected. The statistics obtained compared favorably with work done by others in this field of investigation.
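
    The wavelet-based detection step can be illustrated with a small sketch. The abstract does not state which wavelet, decomposition depth, or decision rule the thesis uses, so everything here is an assumption: a Haar wavelet, three decomposition levels, and a threshold of `k` times the median coarse-coefficient magnitude. The helper names (`haar_step`, `haar_approx`, `detect_blinks`) and the synthetic signal are hypothetical.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def haar_approx(x, levels):
    """Approximation coefficients after `levels` Haar decomposition steps."""
    a = list(x)
    for _ in range(levels):
        a, _detail = haar_step(a)
    return a

def detect_blinks(x, levels=3, k=5.0):
    """Flag sample ranges whose coarse coefficient is unusually large.

    Eyeblinks are slow, high-amplitude deflections, so this sketch looks at
    the low-frequency (approximation) band. The rule |coef| > k * median is
    an illustrative assumption, not the rule used in the thesis.
    """
    a = haar_approx(x, levels)
    mags = sorted(abs(c) for c in a)
    median = mags[len(mags) // 2]
    block = 2 ** levels  # number of samples covered by one coarse coefficient
    return [(i * block, (i + 1) * block)
            for i, c in enumerate(a) if abs(c) > k * median]

# Synthetic example (hypothetical data): small ripple on a baseline,
# plus a large deflection on samples 24..31 standing in for a blink.
signal = [1.0 + 0.5 * ((n % 4) - 1.5) + (50.0 if 24 <= n < 32 else 0.0)
          for n in range(64)]
blink_ranges = detect_blinks(signal)
```

    In the pipeline the abstract describes, the detected time ranges would then guide which ICA-separated content is removed before the EEG is reconstructed.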

    Independent component analysis techniques and their performance evaluation for electroencephalography

    The ongoing electrical activity of the brain is known as the electroencephalogram (EEG). Evoked potentials (EPs) are voltage deviations in the EEG elicited in association with stimuli. EPs provide clinical information by allowing an insight into neurological processes. The amplitude of EPs is typically several times less than the background EEG. The background EEG has the effect of obscuring the EPs and therefore appropriate signal processing is required for their recovery. The EEG waveforms recorded from electrodes placed on the scalp contain the ongoing background EEG, EPs from various brain sources, as well as signal components with sources external to the brain. An example of an externally generated signal that is picked up by the electrodes on the scalp is the electrooculogram (EOG). This signal is generated by the eyes when eye movements or blinks are performed. Saccade-related EEG waveforms were recorded from 7 normal subjects. A signal source separation technique, namely the independent component analysis (ICA) algorithm of Bell and Sejnowski (hereafter referred to as BS_ICA), was employed to analyse the recorded waveforms. The effectiveness of the BS_ICA algorithm, as well as that of the ICA algorithm of Cardoso, was investigated for removing ocular artefact (OA) from the EEG. It was quantitatively demonstrated that both ICA algorithms were more effective than the conventional correlation-based techniques for removing the OA from the EEG. A novel iterative synchronised averaging method for EPs was devised. The method optimally synchronised the waveforms from successive trials with respect to the event of interest prior to averaging and thus preserved the features of the signal components that were time-locked to the event. The recorded EEG waveforms were analysed using BS_ICA and saccade-related components (frontal and occipital pre-saccadic potentials, and the lambda wave) were extracted and their scalp topographies were obtained.
This initial study highlighted some limitations of the conventional ICA approach of Bell and Sejnowski for analysing saccade-related EEG waveforms. Novel techniques were devised in order to improve the performance of the ICA algorithm of Bell and Sejnowski for extracting the lambda wave EP component. One approach involved designing a template-model that represented the temporal characteristics of a lambda wave. Its incorporation into the BS_ICA algorithm improved the signal source separation ability of the algorithm for extracting the lambda wave from the EEG waveforms. The second approach increased the effective length of the recorded EEG traces prior to their processing by the BS_ICA algorithm. This involved abutting EEG traces from an appropriate number of successive trials (a trial was a set of waveforms recorded from 64 electrode locations in an experiment involving a saccade performance). It was quantitatively demonstrated that the process of abutting EEG waveforms was a valuable pre-processing operation for the ICA algorithm of Bell and Sejnowski when extracting the lambda wave. A fuzzy logic method was implemented to identify BS_ICA-extracted single-trial saccade-related lambda waves. The method provided an effective means to automate the identification of the lambda waves extracted by BS_ICA. The approach correctly identified the single-trial lambda waves with an accuracy of 97.4%.
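
    The idea of synchronising trials before averaging can be sketched as below. The abstract does not say how the thesis performs the synchronisation, so this example assumes a common approach: each trial is shifted by the lag that maximises its cross-correlation with a running template, the aligned trials are averaged, and the average becomes the next template. The names `shift`, `best_lag`, and `synchronised_average`, the circular shift, and the synthetic pulse are all assumptions.

```python
def shift(x, lag):
    """Circularly shift x by `lag` samples (positive lag delays the signal)."""
    n = len(x)
    return [x[(i - lag) % n] for i in range(n)]

def best_lag(x, template, max_lag):
    """Lag in [-max_lag, max_lag] maximising correlation with the template."""
    def corr(lag):
        return sum(a * b for a, b in zip(shift(x, lag), template))
    return max(range(-max_lag, max_lag + 1), key=corr)

def mean_trace(trials):
    """Sample-wise average of equally long trials."""
    return [sum(t[i] for t in trials) / len(trials)
            for i in range(len(trials[0]))]

def synchronised_average(trials, max_lag=5, n_iter=3):
    """Iteratively align trials to the current average, then re-average."""
    template = mean_trace(trials)  # start from the plain average
    for _ in range(n_iter):
        aligned = [shift(t, best_lag(t, template, max_lag)) for t in trials]
        template = mean_trace(aligned)
    return template

# Synthetic example (hypothetical data): one triangular pulse with
# trial-to-trial latency jitter of -2, 0 and +2 samples.
pulse = [max(0.0, 3.0 - abs(i - 16)) for i in range(32)]
trials = [shift(pulse, -2), pulse, shift(pulse, 2)]
plain = mean_trace(trials)
sync = synchronised_average(trials)
```

    In this toy case the plain average smears the pulse and lowers its peak, whereas the aligned average keeps the pulse shape, which is the kind of feature preservation the abstract attributes to synchronised averaging.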

    Online source separation in reverberant environments exploiting known speaker locations

    This thesis concerns blind source separation techniques using second order statistics and higher order statistics for reverberant environments. A focus of the thesis is algorithmic simplicity with a view to the algorithms being implemented in their online forms. The main challenge of blind source separation applications is to handle reverberant acoustic environments; a further complication is changes in the acoustic environment, such as when human speakers physically move. A novel time-domain method which utilises a pair of finite impulse response filters is proposed. The filters are designed by the method of principal angles, which exploits a singular value decomposition. The pair of filters is implemented within a generalised sidelobe canceller structure, thus the method can be considered as a beamforming method which cancels one source. An adaptive filtering stage is then employed to recover the remaining source, by exploiting the output of the beamforming stage as a noise reference. A common approach to blind source separation is to use methods based on higher order statistics, such as independent component analysis. When dealing with realistic convolutive audio and speech mixtures, processing in the frequency domain at each frequency bin is required. This introduces the permutation problem, inherent in independent component analysis, across the frequency bins. Independent vector analysis directly addresses this issue by modeling the dependencies between frequency bins, namely by making use of a source vector prior. An alternative source prior for real-time (online) natural gradient independent vector analysis is proposed. A Student's t probability density function is known to be more suited for speech sources, due to its heavier tails, and is incorporated into a real-time version of natural gradient independent vector analysis.
The final algorithm is realised as a real-time embedded application on a floating point Texas Instruments digital signal processor platform. Moving sources, along with reverberant environments, cause significant problems in realistic source separation systems as mixing filters become time variant. A method which employs the pair of cancellation filters to cancel one source, coupled with an online natural gradient independent vector analysis technique, is proposed to improve average separation performance in the context of step-wise moving sources. This addresses 'dips' in performance when sources move. Results show that the average convergence time of the performance parameters is improved. The online methods introduced in this thesis are tested using impulse responses measured in reverberant environments, demonstrating their robustness, and are shown to perform better than established methods in a variety of situations.
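
    For orientation, the generic online natural gradient update commonly used for independent vector analysis can be written as below. This is the textbook form, not necessarily the exact recursion implemented in the thesis; the notation is assumed here: $\mathbf{x}^{(k)}[n]$ is the mixture at frequency bin $k$ and frame $n$, $\mathbf{W}^{(k)}$ the unmixing matrix, $\eta$ the learning rate, and $\boldsymbol{\varphi}^{(k)}$ the multivariate score function obtained from the chosen source prior.

```latex
% Generic online natural-gradient IVA step (assumed notation, see lead-in).
\begin{align}
\mathbf{y}^{(k)}[n] &= \mathbf{W}^{(k)}[n]\,\mathbf{x}^{(k)}[n], \\
\mathbf{W}^{(k)}[n+1] &= \mathbf{W}^{(k)}[n]
  + \eta\left(\mathbf{I}
  - \boldsymbol{\varphi}^{(k)}\!\big(\mathbf{y}[n]\big)\,
    \mathbf{y}^{(k)}[n]^{H}\right)\mathbf{W}^{(k)}[n].
\end{align}
% The i-th entry of \varphi^{(k)} depends on the i-th output across ALL
% frequency bins, which is how the dependency between bins enters the update.
```

    Because the score function for one bin depends on the same source's outputs in every bin, the bins are updated jointly, which is the mechanism by which the permutation problem is avoided.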

    Enhanced independent vector analysis for audio separation in a room environment

    Independent vector analysis (IVA) is studied as a frequency domain blind source separation method, which can theoretically avoid the permutation problem by retaining the dependency between different frequency bins of the same source vector while removing the dependency between different source vectors. This thesis focuses upon improving the performance of independent vector analysis when it is used to solve the audio separation problem in a room environment. A specific stability problem of IVA, i.e. the block permutation problem, is identified and analyzed. Then a robust IVA method is proposed to solve this problem by exploiting the phase continuity of the unmixing matrix. Moreover, an auxiliary function based IVA algorithm with an overlapped chain type source prior is proposed as well to mitigate this problem. Then an informed IVA scheme is proposed which combines the geometric information of the sources from video to solve the problem by providing an intelligent initialization for optimal convergence. The proposed informed IVA algorithm can also achieve a faster convergence in terms of iteration numbers and better separation performance. A pitch based evaluation method is defined to judge the separation performance objectively when the information describing the mixing matrix and sources is missing. In order to improve the separation performance of IVA, an appropriate multivariate source prior is needed to better preserve the dependency structure within the source vectors. A particular multivariate generalized Gaussian distribution is adopted as the source prior. The nonlinear score function derived from this proposed source prior contains the fourth order relationships between different frequency bins, which provides a more informative and stronger dependency structure compared with the original IVA algorithm and thereby improves the separation performance. Copula theory is a central tool to model the nonlinear dependency structure. 
The t copula is proposed to describe the dependency structure within the frequency domain speech signals due to its tail dependency property, which means that if one variable has an extreme value, other variables are expected to have extreme values. A multivariate Student's t distribution constructed by using a t copula with the univariate Student's t marginal distribution is proposed as the source prior. Then the IVA algorithm with the proposed source prior is derived. The proposed algorithms are tested with real speech signals in different reverberant room environments, using both modelled room impulse responses and real room recordings. State-of-the-art criteria are used to evaluate the separation performance, and the experimental results confirm the advantage of the proposed algorithms.
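
    As a worked illustration of how a score function follows from a Student's t source prior, consider a zero-mean multivariate Student's t density with identity scale over the $K$ frequency bins of one source. The notation ($s^{(k)}$ for the bin-$k$ coefficient, $\nu$ for the degrees of freedom) is assumed here, the derivation is written for real-valued variables, and for complex-valued coefficients the constant factor depends on the derivative convention; the thesis may use a different parameterisation.

```latex
% Assumed prior: multivariate Student's t with identity scale matrix.
\begin{align}
p(\mathbf{s}) &\propto
  \left(1 + \frac{1}{\nu}\sum_{k'=1}^{K}\big|s^{(k')}\big|^{2}\right)^{-\frac{\nu+K}{2}}, \\
-\log p(\mathbf{s}) &= \frac{\nu+K}{2}\,
  \log\!\left(1 + \frac{1}{\nu}\sum_{k'=1}^{K}\big|s^{(k')}\big|^{2}\right) + \text{const}, \\
\varphi^{(k)}(\mathbf{s}) &= -\frac{\partial \log p(\mathbf{s})}{\partial s^{(k)}}
  = \frac{\nu+K}{\nu}\cdot
    \frac{s^{(k)}}{1 + \frac{1}{\nu}\sum_{k'=1}^{K}\big|s^{(k')}\big|^{2}}.
\end{align}
```

    The denominator couples every frequency bin of the same source, and a large value in any one bin shrinks the score in all bins, which reflects the tail dependency the abstract describes.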

    Exploitation of source nonstationarity in underdetermined blind source separation with advanced clustering techniques

    The problem of blind source separation (BSS) is investigated. Following the assumption that the time-frequency (TF) distributions of the input sources do not overlap, a quadratic TF representation is used to exploit the sparsity of the statistically nonstationary sources. However, separation performance is shown to be limited by the selection of a certain threshold in classifying the eigenvectors of the TF matrices drawn from the observation mixtures. Two methods are, therefore, proposed based on recently introduced advanced clustering techniques, namely Gap statistics and self-splitting competitive learning (SSCL), to mitigate the problem of eigenvector classification. The novel integration of these two approaches successfully overcomes the problem of artificial sources induced by insufficient knowledge of the threshold and enables automatic determination of the number of active sources over the observation. The separation performance is thereby greatly improved. Practical consequences of violating the TF orthogonality assumption in the current approach are also studied, which motivates the proposal of a new solution robust to violation of orthogonality. In this new method, the TF plane is partitioned into appropriate blocks and source separation is carried out in a block-by-block manner. Numerical experiments with linear chirp signals and Gaussian minimum shift keying (GMSK) signals are included, which support the improved performance of the proposed approaches.
