    Alpha-stable low-rank plus residual decomposition for speech enhancement

    In this study, we propose a novel probabilistic model for separating clean speech signals from noisy mixtures by decomposing the mixture spectrograms into a structured speech part and a more flexible residual part. The main novelty of our model is that it uses a family of heavy-tailed distributions, the so-called α-stable distributions, to model the residual signal. We develop an expectation-maximization algorithm for parameter estimation and a Monte Carlo scheme for posterior estimation of the clean speech. Our experiments show that the proposed method outperforms relevant factorization-based algorithms by a significant margin.
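
To make the notion of a heavy-tailed α-stable distribution concrete, the sketch below draws symmetric α-stable samples with the Chambers-Mallows-Stuck method. It is an illustration only, not the estimation or Monte Carlo scheme of the paper; the function name `sample_symmetric_alpha_stable` and the unit-scale, zero-skewness parameterization are assumptions made for this example.

```python
import math
import random


def sample_symmetric_alpha_stable(alpha, rng):
    """Draw one symmetric alpha-stable sample (unit scale, zero skewness).

    Uses the Chambers-Mallows-Stuck construction. alpha = 2 gives a Gaussian
    with variance 2, and alpha = 1 gives a standard Cauchy variable.
    """
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    u = rng.uniform(-math.pi / 2, math.pi / 2)  # uniform angle
    w = rng.expovariate(1.0)                    # unit-rate exponential
    while w == 0.0:                             # guard against a zero draw
        w = rng.expovariate(1.0)
    if alpha == 1.0:
        return math.tan(u)
    scale = math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
    return scale * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha)


# Hypothetical usage: compare the largest magnitudes for two tail indices.
rng = random.Random(0)
gaussian_like = [sample_symmetric_alpha_stable(2.0, rng) for _ in range(5000)]
heavy_tailed = [sample_symmetric_alpha_stable(1.2, rng) for _ in range(5000)]
print(max(abs(x) for x in gaussian_like), max(abs(x) for x in heavy_tailed))
```

Smaller values of α put more probability mass on very large excursions, which is the kind of behaviour a flexible residual model would need to capture for impulsive noise.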

    Advanced tensor based signal processing techniques for wireless communication systems and biomedical signal processing

    Many observed signals in signal processing applications including wireless communications, biomedical signal processing, image processing, and machine learning are multi-dimensional. Tensors preserve the multi-dimensional structure and provide a natural representation of these signals/data. Moreover, tensors often provide improved identifiability. Therefore, we benefit from using tensor algebra in the above-mentioned applications and many more. In this thesis, we present the benefits of utilizing tensor algebra in two signal processing areas: signal processing for MIMO (Multiple-Input Multiple-Output) wireless communication systems and biomedical signal processing. Moreover, we contribute to the theoretical aspects of tensor algebra by deriving new properties and ways of computing tensor decompositions. Often, we only have an element-wise or a slice-wise description of the signal model. This representation of the signal model does not reveal the explicit tensor structure. Therefore, the derivation of all tensor unfoldings is not always obvious, and exploiting the multi-dimensional structure of these models is not always straightforward. We propose an alternative representation of the element-wise multiplication or the slice-wise multiplication based on the generalized tensor contraction operator. Later in this thesis, we exploit this novel representation and the properties of the contraction operator to derive the final tensor models. There exist a number of different tensor decompositions that describe different signal models, such as the HOSVD (Higher Order Singular Value Decomposition), the CP/PARAFAC (Canonical Polyadic / PARallel FACtors) decomposition, the BTD (Block Term Decomposition), the PARATUCK2 (PARAfac and TUCker2) decomposition, and the PARAFAC2 (PARAllel FACtors2) decomposition. Among these decompositions, the CP decomposition is the most widely used. 
Therefore, the development of algorithms for the efficient computation of the CP decomposition is important for many applications. The SECSI (Semi-Algebraic framework for approximate CP decomposition via SImultaneous matrix diagonalization) framework is an efficient and robust tool for the calculation of the approximate low-rank CP decomposition via simultaneous matrix diagonalizations. In this thesis, we present five extensions of the SECSI framework that reduce the computational complexity of the original framework and/or introduce constraints on the factor matrices. Moreover, the PARAFAC2 decomposition and the PARATUCK2 decomposition are usually described using a slice-wise notation that can be expressed in terms of the generalized tensor contraction as proposed in this thesis. We exploit this novel representation to derive explicit tensor models for the PARAFAC2 decomposition and the PARATUCK2 decomposition. Furthermore, we use the PARAFAC2 model to derive an ALS (Alternating Least-Squares) algorithm for the computation of the PARAFAC2 decomposition. Moreover, we exploit the novel contraction properties for element-wise and slice-wise multiplications to model MIMO multi-carrier wireless communication systems. We show that this very general model can be used to derive the tensor model of the received signal for MIMO-OFDM (Multiple-Input Multiple-Output - Orthogonal Frequency Division Multiplexing), Khatri-Rao coded MIMO-OFDM, and randomly coded MIMO-OFDM systems. We propose the transmission techniques Khatri-Rao coding and random coding in order to impose an additional tensor structure on the transmit signal tensor, which otherwise does not have a particular structure. Moreover, we show that this model can be extended to other multi-carrier techniques such as GFDM (Generalized Frequency Division Multiplexing). 
Utilizing these models at the receiver side, we design several types of receivers for these systems that outperform the traditional matrix-based solutions in terms of the symbol error rate. In the last part of this thesis, we show the benefits of using tensor algebra in biomedical signal processing by jointly decomposing EEG (ElectroEncephaloGraphy) and MEG (MagnetoEncephaloGraphy) signals. EEG and MEG signals are usually acquired simultaneously, and they capture aspects of the same brain activity. Therefore, EEG and MEG signals can be decomposed using coupled tensor decompositions such as the coupled CP decomposition. We exploit the proposed coupled SECSI framework (one of the proposed extensions of the SECSI framework) for the computation of the coupled CP decomposition to first validate and analyze the photic driving effect. Moreover, we validate the effects of skull defects on the measured EEG and MEG signals by means of a joint EEG-MEG decomposition using the coupled SECSI framework. Both applications show that we benefit from coupled tensor decompositions and that the coupled SECSI framework is a very practical tool for the analysis of biomedical data.
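
The step from an element-wise model to an explicit unfolding can be illustrated on the CP decomposition. The sketch below builds a small third-order tensor from the element-wise CP model and checks that its mode-1 unfolding equals A (C ⊙ B)^T, where ⊙ is the Khatri-Rao (column-wise Kronecker) product. It is a minimal illustration under an assumed unfolding convention (second index varying fastest); it does not implement the SECSI framework or the generalized contraction operator, and all function names and factor values are hypothetical.

```python
def khatri_rao(C, B):
    """Column-wise Kronecker product: row (k, j) holds C[k][r] * B[j][r]."""
    R = len(C[0])
    return [[C[k][r] * B[j][r] for r in range(R)]
            for k in range(len(C)) for j in range(len(B))]


def cp_tensor(A, B, C):
    """Element-wise CP model: X[i][j][k] = sum_r A[i][r] * B[j][r] * C[k][r]."""
    R = len(A[0])
    return [[[sum(A[i][r] * B[j][r] * C[k][r] for r in range(R))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]


def unfold_mode1(X):
    """Mode-1 unfolding: rows indexed by i, columns by (k, j) with j fastest."""
    J, K = len(X[0]), len(X[0][0])
    return [[X_i[j][k] for k in range(K) for j in range(J)] for X_i in X]


def matmul_transposed(A, M):
    """Return A @ M^T for list-of-lists matrices."""
    return [[sum(a * m for a, m in zip(row_a, row_m)) for row_m in M]
            for row_a in A]


# Hypothetical integer factor matrices of rank R = 2.
A = [[1, 2], [3, 4], [5, 6]]             # I x R
B = [[1, 0], [2, 1]]                     # J x R
C = [[1, 1], [0, 2], [3, 1], [1, -1]]    # K x R

X = cp_tensor(A, B, C)
X1 = unfold_mode1(X)

# The unfolding of the element-wise model matches the explicit matrix form.
assert X1 == matmul_transposed(A, khatri_rao(C, B))
```

Integer factors keep the comparison exact; with floating-point data the equality check would need a tolerance.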

    Data-driven multivariate and multiscale methods for brain computer interface

    This thesis focuses on the development of data-driven multivariate and multiscale methods for brain computer interface (BCI) systems. The electroencephalogram (EEG), the most convenient means to measure neurophysiological activity due to its noninvasive nature, is mainly considered. The nonlinearity and nonstationarity inherent in EEG and its multichannel recording nature require a new set of data-driven multivariate techniques to estimate features more accurately for enhanced BCI operation. A further, long-term goal is to enable an alternative EEG recording strategy for long-term and portable monitoring. Empirical mode decomposition (EMD) and local mean decomposition (LMD), fully data-driven adaptive tools, are considered to decompose the nonlinear and nonstationary EEG signal into a set of components which are highly localised in time and frequency. It is shown that the complex and multivariate extensions of EMD, which can exploit common oscillatory modes within multivariate (multichannel) data, can be used to accurately estimate and compare the amplitude and phase information among multiple sources, which is key to feature extraction in BCI systems. A complex extension of local mean decomposition is also introduced and its operation is illustrated on two-channel neuronal spike streams. Common spatial pattern (CSP), a standard feature extraction technique for BCI applications, is also extended to the complex domain using augmented complex statistics. Depending on the circularity/noncircularity of a complex signal, one of the complex CSP algorithms can be chosen to produce the best classification performance between two different EEG classes. Using these complex and multivariate algorithms, two cognitive brain studies are investigated for more natural and intuitive design of advanced BCI systems. 
Firstly, a Yarbus-style auditory selective attention experiment is introduced to measure the user's attention to a sound source among a mixture of sound stimuli, with the aim of improving the usefulness of hearing instruments such as hearing aids. Secondly, emotions elicited by taste and taste recall are examined to determine the pleasure and displeasure associated with a food for the implementation of affective computing. The separation between two emotional responses is examined using real and complex-valued common spatial pattern methods. Finally, we introduce a novel approach to brain monitoring based on EEG recordings from within the ear canal, embedded on a custom-made hearing aid earplug. The new platform promises the possibility of both short- and long-term continuous use for standard brain monitoring and interfacing applications.

    Theory, design and application of gradient adaptive lattice filters

    Cortical mechanisms for tinnitus in humans

    PhD Thesis. This work sought to characterise neurochemical and neurophysiological processes underlying tinnitus in humans. The first study involved invasive brain recordings from a neurosurgical patient, along with experimental manipulation of his tinnitus, to map the cortical system underlying his tinnitus. Widespread tinnitus-linked changes in low- and high-frequency oscillations were observed, along with inter-regional and cross-frequency patterns of communication. The second and third studies compared tinnitus patients to controls matched for age, sex and hearing loss, measuring auditory cortex spontaneous oscillations (with magnetoencephalography) and neurochemical concentrations (with magnetic resonance spectroscopy) respectively. Unlike in previous studies not controlled for hearing loss, there were no group differences in oscillatory activity attributable to tinnitus. However, there was a significant correlation between gamma oscillations (>30 Hz) and hearing loss in the tinnitus group, and between delta oscillations (1-4 Hz) and perceived tinnitus loudness. In the neurochemical study, tinnitus patients had significantly reduced GABA concentrations compared to matched controls, and within this group there was a positive correlation between choline concentration (potentially linked to acetylcholine and/or neuronal plasticity) and both hearing loss and subjective tinnitus intensity and distress. In light of present and previous findings, tinnitus may be best explained by a predictive coding model of perception, which was tested in the final experiment. This experiment directly controlled the three main quantities comprising predictive coding models, and found that delta/theta/alpha oscillations (1-12 Hz) encoded the precision of predictions, beta oscillations (12-30 Hz) encoded changes to predictions, and gamma oscillations represented surprise (unexpectedness of stimuli based on predictions). 
The work concludes with a predictive coding model of tinnitus that builds upon the present findings and settles unresolved paradoxes in the literature. In this model, precursor processes (in varying combinations) synergise to increase the precision associated with spontaneous activity in the auditory pathway to the point where it overrides higher predictions of ‘silence’. Funding: Medical Research Council, Wellcome Trust and the National Institutes of Health.

    Sensor Signal and Information Processing II

    In the current age of information explosion, newly invented sensors and software are tightly integrated with our everyday lives. Many sensor processing algorithms have incorporated some form of computational intelligence as part of their core problem-solving framework. These algorithms have the capacity to generalize, discover knowledge for themselves, and learn new information whenever unseen data are captured. The primary aim of sensor processing is to develop techniques to interpret, understand, and act on information contained in the data. The interest of this book is in developing intelligent signal processing in order to pave the way for smart sensors. This involves mathematical advancement of nonlinear signal processing theory and its applications that extend far beyond traditional techniques. It bridges the boundary between theory and application, developing novel theoretically inspired methodologies targeting both longstanding and emergent signal processing applications. The topics range from phishing detection to integration of terrestrial laser scanning, and from fault diagnosis to bio-inspired filtering. The book will appeal to established practitioners, along with researchers and students in the emerging field of smart sensor processing.

    Sleep Stage Classification: A Deep Learning Approach

    Sleep occupies a significant part of human life, and the diagnosis of sleep-related disorders is of great importance. To record specific physical and electrical activities of the brain and body, a multi-parameter test called polysomnography (PSG) is normally used. The visual process of sleep stage classification is time-consuming, subjective and costly. To improve the accuracy and efficiency of sleep stage classification, automatic classification algorithms were developed. In this research work, we focused on the pre-processing (filtering boundaries and de-noising algorithms) and classification steps of automatic sleep stage classification. The main motivation for this work was to develop a pre-processing and classification framework that cleans the input EEG signal without manipulating the original data, thus enhancing the learning stage of deep learning classifiers. For pre-processing EEG signals, a lossless adaptive artefact removal method was proposed. Unlike other works that used artificial noise, we used real EEG data contaminated with EOG and EMG to evaluate the proposed method. The proposed adaptive algorithm led to a significant enhancement in the overall classification accuracy. In the classification area, we evaluated the performance of the most common sleep stage classifiers using a comprehensive set of features extracted from PSG signals. Considering the challenges and limitations of conventional methods, we proposed two deep learning-based methods for classification of sleep stages based on the Stacked Sparse AutoEncoder (SSAE) and the Convolutional Neural Network (CNN). The proposed methods performed more efficiently by eliminating the need for conventional feature selection and feature extraction steps respectively. Moreover, although our systems were trained with fewer samples than similar studies, they were able to achieve state-of-the-art accuracy and higher overall sensitivity.

    Principled methods for mixtures processing

    This document is my thesis for obtaining the habilitation à diriger des recherches, the French diploma required to fully supervise Ph.D. students. It summarizes the research I did in the last 15 years and also presents the short-term research directions and applications I want to investigate. Regarding my past research, I first describe the work I did on probabilistic audio modeling, including the separation of Gaussian and α-stable stochastic processes. Then, I mention my work on deep learning applied to audio, which rapidly turned into a large effort for community service. Finally, I present my contributions in machine learning, with some works on hardware compressed sensing and probabilistic generative models. My research programme involves a theoretical part that revolves around probabilistic machine learning, and an applied part that concerns the processing of time series arising in both audio and life sciences.