40 research outputs found

    Bounded Simplex-Structured Matrix Factorization: Algorithms, Identifiability and Applications

    In this paper, we propose a new low-rank matrix factorization model dubbed bounded simplex-structured matrix factorization (BSSMF). Given an input matrix X and a factorization rank r, BSSMF looks for a matrix W with r columns and a matrix H with r rows such that X ≈ WH, where the entries in each column of W are bounded, that is, they belong to given intervals, and the columns of H belong to the probability simplex, that is, H is column stochastic. BSSMF generalizes nonnegative matrix factorization (NMF) and simplex-structured matrix factorization (SSMF). BSSMF is particularly well suited when the entries of the input matrix X belong to a given interval; for example when the rows of X represent images, or X is a rating matrix such as in the Netflix and MovieLens datasets where the entries of X belong to the interval [1,5]. The simplex-structured matrix H not only leads to an easily understandable decomposition providing a soft clustering of the columns of X, but implies that the entries of each column of WH belong to the same intervals as the columns of W. In this paper, we first propose a fast algorithm for BSSMF, even in the presence of missing data in X. Then we provide identifiability conditions for BSSMF, that is, we provide conditions under which BSSMF admits a unique decomposition, up to trivial ambiguities. Finally, we illustrate the effectiveness of BSSMF on two applications: extraction of features in a set of images, and the matrix completion problem for recommender systems. Comment: 14 pages, new title, new numerical experiments on synthetic data, clarifications of several parts of the paper, run times added
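The range-preservation property stated in this abstract can be checked numerically. The sketch below is not the paper's algorithm: it only draws random factors that satisfy the BSSMF constraints and confirms that their product stays in the same interval. The helper name `random_bssmf_factors`, the matrix sizes, and the use of a single interval [1, 5] for every entry of W are assumptions made for illustration.

```python
import numpy as np


def random_bssmf_factors(m, n, r, lo, hi, seed=0):
    """Draw hypothetical factors satisfying the BSSMF constraints.

    W (m x r) has entries in [lo, hi]; H (r x n) is column stochastic,
    i.e. nonnegative with every column summing to one.
    """
    rng = np.random.default_rng(seed)
    W = rng.uniform(lo, hi, size=(m, r))
    H = rng.random((r, n))
    H /= H.sum(axis=0, keepdims=True)  # project columns onto the simplex by normalizing
    return W, H


lo, hi = 1.0, 5.0  # e.g. the rating interval mentioned in the abstract
W, H = random_bssmf_factors(m=6, n=8, r=3, lo=lo, hi=hi)
X = W @ H

# Each column of X is a convex combination of the columns of W,
# so every entry of X must lie in [lo, hi] (up to rounding error).
tol = 1e-12
col_sums_ok = bool(np.allclose(H.sum(axis=0), 1.0))
in_range = bool(np.all(X >= lo - tol) and np.all(X <= hi + tol))
print(col_sums_ok, in_range)
```

Because each column of WH is a weighted average of the columns of W with weights taken from a column of H, the bound holds for any factors meeting the constraints, not only for this random draw.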

    Unsupervised Learning of Latent Structure from Linear and Nonlinear Measurements

    University of Minnesota Ph.D. dissertation. June 2019. Major: Electrical Engineering. Advisor: Nicholas Sidiropoulos. 1 computer file (PDF); xii, 118 pages. The past few decades have seen a rapid expansion of our digital world. While early dwellers of the Internet exchanged simple text messages via email, modern citizens of the digital world conduct a much richer set of activities online: entertainment, banking, booking restaurants and hotels, just to name a few. In our digitally enriched lives, we not only enjoy great convenience and efficiency, but also leave behind massive amounts of data that offer ample opportunities for improving these digital services, and creating new ones. Meanwhile, technical advancements have facilitated the emergence of new sensors and networks that can measure, exchange and log data about real world events. These technologies have been applied to many different scenarios, including environmental monitoring, advanced manufacturing, healthcare, and scientific research in physics, chemistry, bio-technology and social science, to name a few. Leveraging the abundant data, learning-based and data-driven methods have become a dominating paradigm across different areas, with data analytics driving many of the recent developments. However, the massive amount of data also brings considerable challenges for analytics. Among them, the collected data are often high-dimensional, with the true knowledge and signal of interest hidden underneath. It is of great importance to reduce data dimension, and transform the data into the right space. In some cases, the data are generated from certain generative models that are identifiable, making it possible to reduce the data back to the original space. 
In addition, we are often interested in performing some analysis on the data after dimensionality reduction (DR), and it would be helpful to be mindful about these subsequent analysis steps when performing DR, as latent structures can serve as a valuable prior. Based on this reasoning, we develop two methods, one for the linear generative model case, and the other one for the nonlinear case. In a related setting, we study parameter estimation under unknown nonlinear distortion. In this case, the unknown nonlinearity in measurements poses a severe challenge. In practice, various mechanisms can introduce nonlinearity in the measured data. To combat this challenge, we put forth a nonlinear mixture model, which is well-grounded in real world applications. We show that this model is in fact identifiable up to some trivial indeterminacy. We develop an efficient algorithm to recover latent parameters of this model, and confirm the effectiveness of our theory and algorithm via numerical experiments.

    A condition number for the tensor rank decomposition

    The tensor rank decomposition problem consists of recovering the unique set of parameters representing a robustly identifiable low-rank tensor when the coordinate representation of the tensor is presented as input. A condition number for this problem measuring the sensitivity of the parameters to an infinitesimal change to the tensor is introduced and analyzed. It is demonstrated that the absolute condition number coincides with the inverse of the least singular value of Terracini's matrix. Several basic properties of this condition number are investigated. Comment: 45 pages, 4 figures

    Contributions to theory and algorithms of independent component analysis and signal separation

    This thesis addresses the problem of blind signal separation (BSS) using independent component analysis (ICA). In blind signal separation, signals from multiple sources arrive simultaneously at a sensor array, so that each sensor array output contains a mixture of source signals. Sets of sensor outputs are processed to recover the source signals or to identify the mixing system. The term blind refers to the fact that no explicit knowledge of source signals or mixing system is available. The independent component analysis approach uses statistical independence of the source signals to solve blind signal separation problems. Application domains for the material presented in this thesis include communications, biomedical, audio, image, and sensor array signal processing. In this thesis, reliable algorithms for ICA-based blind source separation are developed. In the blind source separation problem, the goal is to recover all original source signals using the observed mixtures only. The objective is to develop algorithms that are either adaptive to unknown source distributions or do not need to utilize the source distribution information at all. Two parametric methods that can adapt to a wide class of source distributions, including skewed distributions, are proposed. Another nonparametric technique with desirable large sample properties is also proposed. It is based on characteristic functions and thereby avoids the need to model the source distributions. Experimental results showing reliable performance are given for all of the presented methods. In this thesis, theoretical conditions under which instantaneous ICA-based blind signal processing problems can be solved are established. These results extend the celebrated results by Comon for the traditional linear real-valued model. The results are further extended to complex-valued signals and to nonlinear mixing systems. 
Conditions for identification, uniqueness, and separation are established both for real and complex-valued linear models, and for a proposed class of non-linear mixing systems.

    Statistical signal processing of nonstationary tensor-valued data

    Real-world signals, such as the evolution of three-dimensional vector fields over time, can exhibit highly structured probabilistic interactions across their multiple constitutive dimensions. This calls for analysis tools capable of directly capturing the inherent multi-way couplings present in such data. Yet, current analyses typically employ multivariate matrix models and their associated linear algebras which are agnostic to the global data structure and can only describe local linear pairwise relationships between data entries. To address this issue, this thesis uses the property of linear separability -- a notion intrinsic to multi-dimensional data structures called tensors -- as a linchpin to consider the probabilistic, statistical and spectral separability under one umbrella. This helps to both enhance physical meaning in the analysis and reduce the dimensionality of tensor-valued problems. We first introduce a new identifiable probability distribution which appropriately models the interactions between random tensors, whereby linear relationships are considered between tensor fibres as opposed to between individual entries as in standard matrix analysis. Unlike existing models, the proposed tensor probability distribution formulation is shown to yield a unique maximum likelihood estimator which is demonstrated to be statistically efficient. Both matrices and vectors are lower-order tensors, and this gives us a unique opportunity to consider some matrix signal processing models under the more powerful framework of multilinear tensor algebra. By introducing a model for the joint distribution of multiple random tensors, it is also possible to treat random tensor regression analyses and subspace methods within a unified separability framework. 
Practical utility of the proposed analysis is demonstrated through case studies over synthetic and real-world tensor-valued data, including the evolution over time of global atmospheric temperatures and international interest rates. Another overarching theme in this thesis is the nonstationarity inherent to real-world signals, which typically consist of both deterministic and stochastic components. This thesis aims to help bridge the gap between formal probabilistic theory of stochastic processes and empirical signal processing methods for deterministic signals by providing a spectral model for a class of nonstationary signals, whereby the deterministic and stochastic time-domain signal properties are designated respectively by the first- and second-order moments of the signal in the frequency domain. By virtue of the assumed probabilistic model, novel tests for nonstationarity detection are devised and demonstrated to be effective in low-SNR environments. The proposed spectral analysis framework, which is intrinsically complex-valued, is facilitated by augmented complex algebra in order to fully capture the joint distribution of the real and imaginary parts of complex random variables, using a compact formulation. Finally, motivated by the need for signal processing algorithms which naturally cater for the nonstationarity inherent to real-world tensors, the above contributions are employed simultaneously to derive a general statistical signal processing framework for nonstationary tensors. This is achieved by introducing a new augmented complex multilinear algebra which allows for a concise description of the multilinear interactions between the real and imaginary parts of complex tensors. These contributions are further supported by new physically meaningful empirical results on the statistical analysis of nonstationary global atmospheric temperatures.

    Scalable Learning Adaptive to Unknown Dynamics and Graphs

    University of Minnesota Ph.D. dissertation. June 2019. Major: Electrical/Computer Engineering. Advisor: Georgios B. Giannakis. 1 computer file (PDF); xii, 174 pages. With the scale of information growing every day, the key challenges in machine learning include the high-dimensionality and sheer volume of feature vectors that may consist of real and categorical data, as well as the speed and the typically streaming format of data acquisition that may also entail outliers and misses. The latter may be present, either unintentionally or intentionally, in order to cope with scalability, privacy, and adversarial behavior. These challenges provide ample opportunities for algorithmic and analytical innovations in online and nonlinear subspace learning approaches. Among the available nonlinear learning tools, those based on kernels have merits that are well documented. However, most rely on a preselected kernel, whose prudent choice presumes task-specific prior information that is generally not available. It is also known that kernel-based methods do not scale well with the size or dimensionality of the data at hand. Besides data science, the urgent need for scalable tools is a core issue also in network science that has recently emerged as a means of collectively understanding the behavior of complex interconnected entities. The rich spectrum of application domains comprises communication, social, financial, gene-regulatory, brain, and power networks, to name a few. Prominent tasks in all network science applications are those of topology identification and inference of nodal processes evolving over graphs. Most contemporary graph-driven inference approaches rely on linear and static models that are simple and tractable, but also presume that the nodal processes are directly observable. To cope with these challenges, the present thesis first introduces a novel online categorical subspace learning approach to track the latent structure of categorical data `on the fly.' 
Leveraging the random feature approximation, it then develops an adaptive online multi-kernel learning approach (termed AdaRaker), which accounts not only for data-driven learning of the kernel combination, but also for the unknown dynamics. Performance analysis is provided in terms of both static and dynamic regrets to quantify the novel learning function approximation. In addition, the thesis introduces a kernel-based topology identification approach that can even account for nonlinear dependencies among nodes and across time. To cope with nodal processes that may not be directly observable in certain applications, tensor-based algorithms that leverage piecewise stationary statistics of nodal processes are developed, and pertinent identifiability conditions are established. To facilitate real-time operation and inference of time-varying networks, an adaptive tensor decomposition based scheme is put forth to track the topologies of time-varying networks. Last but not least, the present thesis offers a unifying framework to deal with various learning tasks over possibly dynamic networks. These tasks include dimensionality reduction, classification, and clustering. Tests on both synthetic and real datasets from the aforementioned application domains are carried out to showcase the effectiveness of the novel algorithms throughout.