7 research outputs found

    Newton-Type Methods For Simultaneous Matrix Diagonalization

    Full text link
    This paper proposes a Newton-type method to solve numerically the eigenproblem of several diagonalizable matrices, which pairwise commute. A classical result states that these matrices are simultaneously diagonalizable. From a suitable system of equations associated to this problem, we construct a sequence that converges quadratically towards the solution. This construction is not based on the resolution of a linear system as is the case in the classical Newton method. Moreover, we provide a theoretical analysis of this construction and exhibit a condition to get a quadratic convergence. We also propose numerical experiments, which illustrate the theoretical results.Comment: Calcolo, Springer Verlag, 202

    Single-channel source separation using non-negative matrix factorization

    Get PDF

    Tensor-based regression models and applications

    Get PDF
    Tableau d’honneur de la FacultĂ© des Ă©tudes supĂ©rieures et postdoctorales, 2017-2018Avec l’avancement des technologies modernes, les tenseurs d’ordre Ă©levĂ© sont assez rĂ©pandus et abondent dans un large Ă©ventail d’applications telles que la neuroscience informatique, la vision par ordinateur, le traitement du signal et ainsi de suite. La principale raison pour laquelle les mĂ©thodes de rĂ©gression classiques ne parviennent pas Ă  traiter de façon appropriĂ©e des tenseurs d’ordre Ă©levĂ© est due au fait que ces donnĂ©es contiennent des informations structurelles multi-voies qui ne peuvent pas ĂȘtre capturĂ©es directement par les modĂšles conventionnels de rĂ©gression vectorielle ou matricielle. En outre, la trĂšs grande dimensionnalitĂ© de l’entrĂ©e tensorielle produit une Ă©norme quantitĂ© de paramĂštres, ce qui rompt les garanties thĂ©oriques des approches de rĂ©gression classique. De plus, les modĂšles classiques de rĂ©gression se sont avĂ©rĂ©s limitĂ©s en termes de difficultĂ© d’interprĂ©tation, de sensibilitĂ© au bruit et d’absence d’unicitĂ©. Pour faire face Ă  ces dĂ©fis, nous Ă©tudions une nouvelle classe de modĂšles de rĂ©gression, appelĂ©s modĂšles de rĂ©gression tensor-variable, oĂč les prĂ©dicteurs indĂ©pendants et (ou) les rĂ©ponses dĂ©pendantes prennent la forme de reprĂ©sentations tensorielles d’ordre Ă©levĂ©. Nous les appliquons Ă©galement dans de nombreuses applications du monde rĂ©el pour vĂ©rifier leur efficacitĂ© et leur efficacitĂ©.With the advancement of modern technologies, high-order tensors are quite widespread and abound in a broad range of applications such as computational neuroscience, computer vision, signal processing and so on. The primary reason that classical regression methods fail to appropriately handle high-order tensors is due to the fact that those data contain multiway structural information which cannot be directly captured by the conventional vector-based or matrix-based regression models, causing substantial information loss during the regression. Furthermore, the ultrahigh dimensionality of tensorial input produces huge amount of parameters, which breaks the theoretical guarantees of classical regression approaches. Additionally, the classical regression models have also been shown to be limited in terms of difficulty of interpretation, sensitivity to noise and absence of uniqueness. To deal with these challenges, we investigate a novel class of regression models, called tensorvariate regression models, where the independent predictors and (or) dependent responses take the form of high-order tensorial representations. We also apply them in numerous real-world applications to verify their efficiency and effectiveness. Concretely, we first introduce hierarchical Tucker tensor regression, a generalized linear tensor regression model that is able to handle potentially much higher order tensor input. Then, we work on online local Gaussian process for tensor-variate regression, an efficient nonlinear GPbased approach that can process large data sets at constant time in a sequential way. Next, we present a computationally efficient online tensor regression algorithm with general tensorial input and output, called incremental higher-order partial least squares, for the setting of infinite time-dependent tensor streams. Thereafter, we propose a super-fast sequential tensor regression framework for general tensor sequences, namely recursive higher-order partial least squares, which addresses issues of limited storage space and fast processing time allowed by dynamic environments. Finally, we introduce kernel-based multiblock tensor partial least squares, a new generalized nonlinear framework that is capable of predicting a set of tensor blocks by merging a set of tensor blocks from different sources with a boosted predictive power

    Principled methods for mixtures processing

    Get PDF
    This document is my thesis for getting the habilitation à diriger des recherches, which is the french diploma that is required to fully supervise Ph.D. students. It summarizes the research I did in the last 15 years and also provides the short­term research directions and applications I want to investigate. Regarding my past research, I first describe the work I did on probabilistic audio modeling, including the separation of Gaussian and α­stable stochastic processes. Then, I mention my work on deep learning applied to audio, which rapidly turned into a large effort for community service. Finally, I present my contributions in machine learning, with some works on hardware compressed sensing and probabilistic generative models.My research programme involves a theoretical part that revolves around probabilistic machine learning, and an applied part that concerns the processing of time series arising in both audio and life sciences

    Trennung und SchĂ€tzung der Anzahl von Audiosignalquellen mit Zeit- und FrequenzĂŒberlappung

    Get PDF
    Everyday audio recordings involve mixture signals: music contains a mixture of instruments; in a meeting or conference, there is a mixture of human voices. For these mixtures, automatically separating or estimating the number of sources is a challenging task. A common assumption when processing mixtures in the time-frequency domain is that sources are not fully overlapped. However, in this work we consider some cases where the overlap is severe — for instance, when instruments play the same note (unison) or when many people speak concurrently ("cocktail party") — highlighting the need for new representations and more powerful models. To address the problems of source separation and count estimation, we use conventional signal processing techniques as well as deep neural networks (DNN). We ïŹrst address the source separation problem for unison instrument mixtures, studying the distinct spectro-temporal modulations caused by vibrato. To exploit these modulations, we developed a method based on time warping, informed by an estimate of the fundamental frequency. For cases where such estimates are not available, we present an unsupervised model, inspired by the way humans group time-varying sources (common fate). This contribution comes with a novel representation that improves separation for overlapped and modulated sources on unison mixtures but also improves vocal and accompaniment separation when used as an input for a DNN model. Then, we focus on estimating the number of sources in a mixture, which is important for real-world scenarios. Our work on count estimation was motivated by a study on how humans can address this task, which lead us to conduct listening experiments, conïŹrming that humans are only able to estimate the number of up to four sources correctly. To answer the question of whether machines can perform similarly, we present a DNN architecture, trained to estimate the number of concurrent speakers. Our results show improvements compared to other methods, and the model even outperformed humans on the same task. In both the source separation and source count estimation tasks, the key contribution of this thesis is the concept of “modulation”, which is important to computationally mimic human performance. Our proposed Common Fate Transform is an adequate representation to disentangle overlapping signals for separation, and an inspection of our DNN count estimation model revealed that it proceeds to ïŹnd modulation-like intermediate features.Im Alltag sind wir von gemischten Signalen umgeben: Musik besteht aus einer Mischung von Instrumenten; in einem Meeting oder auf einer Konferenz sind wir einer Mischung menschlicher Stimmen ausgesetzt. FĂŒr diese Mischungen ist die automatische Quellentrennung oder die Bestimmung der Anzahl an Quellen eine anspruchsvolle Aufgabe. Eine hĂ€uïŹge Annahme bei der Verarbeitung von gemischten Signalen im Zeit-Frequenzbereich ist, dass die Quellen sich nicht vollstĂ€ndig ĂŒberlappen. In dieser Arbeit betrachten wir jedoch einige FĂ€lle, in denen die Überlappung immens ist zum Beispiel, wenn Instrumente den gleichen Ton spielen (unisono) oder wenn viele Menschen gleichzeitig sprechen (Cocktailparty) —, so dass neue Signal-ReprĂ€sentationen und leistungsfĂ€higere Modelle notwendig sind. Um die zwei genannten Probleme zu bewĂ€ltigen, verwenden wir sowohl konventionelle Signalverbeitungsmethoden als auch tiefgehende neuronale Netze (DNN). Wir gehen zunĂ€chst auf das Problem der Quellentrennung fĂŒr Unisono-Instrumentenmischungen ein und untersuchen die speziellen, durch Vibrato ausgelösten, zeitlich-spektralen Modulationen. Um diese Modulationen auszunutzen entwickelten wir eine Methode, die auf Zeitverzerrung basiert und eine SchĂ€tzung der Grundfrequenz als zusĂ€tzliche Information nutzt. FĂŒr FĂ€lle, in denen diese SchĂ€tzungen nicht verfĂŒgbar sind, stellen wir ein unĂŒberwachtes Modell vor, das inspiriert ist von der Art und Weise, wie Menschen zeitverĂ€nderliche Quellen gruppieren (Common Fate). Dieser Beitrag enthĂ€lt eine neuartige ReprĂ€sentation, die die Separierbarkeit fĂŒr ĂŒberlappte und modulierte Quellen in Unisono-Mischungen erhöht, aber auch die Trennung in Gesang und Begleitung verbessert, wenn sie in einem DNN-Modell verwendet wird. Im Weiteren beschĂ€ftigen wir uns mit der SchĂ€tzung der Anzahl von Quellen in einer Mischung, was fĂŒr reale Szenarien wichtig ist. Unsere Arbeit an der SchĂ€tzung der Anzahl war motiviert durch eine Studie, die zeigt, wie wir Menschen diese Aufgabe angehen. Dies hat uns dazu veranlasst, eigene Hörexperimente durchzufĂŒhren, die bestĂ€tigten, dass Menschen nur in der Lage sind, die Anzahl von bis zu vier Quellen korrekt abzuschĂ€tzen. Um nun die Frage zu beantworten, ob Maschinen dies Ă€hnlich gut können, stellen wir eine DNN-Architektur vor, die erlernt hat, die Anzahl der gleichzeitig sprechenden Sprecher zu ermitteln. Die Ergebnisse zeigen Verbesserungen im Vergleich zu anderen Methoden, aber vor allem auch im Vergleich zu menschlichen Hörern. Sowohl bei der Quellentrennung als auch bei der SchĂ€tzung der Anzahl an Quellen ist ein Kernbeitrag dieser Arbeit das Konzept der “Modulation”, welches wichtig ist, um die Strategien von Menschen mittels Computern nachzuahmen. Unsere vorgeschlagene Common Fate Transformation ist eine adĂ€quate Darstellung, um die Überlappung von Signalen fĂŒr die Trennung zugĂ€nglich zu machen und eine Inspektion unseres DNN-ZĂ€hlmodells ergab schließlich, dass sich auch hier modulationsĂ€hnliche Merkmale ïŹnden lassen

    Blind multichannel deconvolution and convolutive extensions of canonical polyadic and block term decompositions

    No full text
    © 2017 IEEE. Tensor decompositions such as the canonical polyadic decomposition (CPD) or the block term decomposition (BTD) are basic tools for blind signal separation. Most of the literature concerns instantaneous mixtures/memoryless channels. In this paper, we focus on convolutive extensions. More precisely, we present a connection between convolutive CPD/BTD models and coupled but instantaneous CPD/BTD. We derive a new identifiability condition dedicated to convolutive low-rank factorization problems. We explain that under this condition, the convolutive extension of CPD/BTD can be computed by means of an algebraic method, guaranteeing perfect source separation in the noiseless case. In the inexact case, the algorithm can be used as a cheap initialization for an optimization-based method. We explain that, in contrast to the memoryless case, convolutive signal separation is in certain cases possible despite only two-way diversities (e.g., space × time).status: publishe
    corecore