
    A Comprehensive Approach for Sparse Principal Component Analysis using Regularized Singular Value Decomposition

    Principal component analysis (PCA) has been a widely used tool in statistics and data analysis for many years. A good PCA result should be both interpretable and accurate. However, neither interpretability nor accuracy is easily achieved in “big data” scenarios with large numbers of original variables. This motivated the development of sparse PCA, in which the obtained principal components (PCs) are linear combinations of a limited number of original variables, yielding good interpretability. In addition, theoretical results have shown that, when the true model is sparse, PCs obtained via sparse PCA, unlike those from traditional PCA, are consistent estimators. These aspects have made sparse PCA a hot research topic in recent years. In this dissertation, we develop a comprehensive and systematic approach to sparse PCA based on the singular value decomposition (SVD). Specifically, we propose the formulation and algorithm and establish its consistency and convergence. We further show convergence to global optima using a limited number of trials, a breakthrough in the sparse PCA area. To guarantee orthogonality or uncorrelatedness when multiple PCs are extracted, we develop a method for sparse PCA with an orthogonality constraint, propose its algorithm, and prove its convergence. To handle missing values in the design matrix, which often arise in practice, we develop a method for sparse PCA with missing values, propose its algorithm, and prove its convergence. Moreover, to provide a principled way of selecting the tuning parameter in these formulations, we design an entry-wise cross-validation method based on sparse PCA with missing values. Together, these contributions make our results practically useful and theoretically complete. A simulation study and real-world data analyses are also provided, showing that our method is competitive with existing methods in the fully observed case and performs well in the missing-data case, for which ours is currently the only practical method.
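    The abstract does not spell out the algorithm, but the rank-one regularized-SVD formulation of sparse PCA that it builds on can be sketched as follows: alternate a unit-norm left factor with a soft-thresholded right factor until the loadings stabilise. The penalty level lam, the initialisation, and the stopping rule below are illustrative assumptions, not the dissertation's exact procedure.

        import numpy as np

        def soft_threshold(z, lam):
            # Entry-wise soft-thresholding operator.
            return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

        def sparse_pc_rsvd(X, lam, n_iter=200, tol=1e-8, seed=0):
            """One sparse loading vector via a rank-one regularized SVD of the
            n x p data matrix X: alternate a unit-norm left factor u with a
            soft-thresholded right factor v until the loadings stabilise."""
            rng = np.random.default_rng(seed)
            n, p = X.shape
            v = rng.standard_normal(p)
            v /= np.linalg.norm(v)
            for _ in range(n_iter):
                u = X @ v
                u /= np.linalg.norm(u)
                v_new = soft_threshold(X.T @ u, lam)
                if np.linalg.norm(v_new) == 0:   # lam too large: every loading killed
                    return np.zeros(p)
                v_new /= np.linalg.norm(v_new)
                if np.linalg.norm(v_new - v) < tol:
                    return v_new
                v = v_new
            return v  # sparse, unit-norm loading vector

    Larger lam gives sparser loadings; in practice lam would be chosen by a scheme such as the entry-wise cross-validation the dissertation proposes.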

    Projection Based Models for High Dimensional Data

    In recent years, many machine learning applications have arisen which deal with the problem of finding patterns in high-dimensional data. Principal component analysis (PCA) has become ubiquitous in this setting. PCA performs dimensionality reduction by estimating latent factors which minimise the reconstruction error between the original data and its low-dimensional projection. We initially consider a situation where influential observations exist within the dataset which have a large, adverse effect on the estimated PCA model. We propose a measure of “predictive influence” to detect these points based on the contribution of each point to the leave-one-out reconstruction error of the model, using an analytic PRedicted REsidual Sum of Squares (PRESS) statistic. We then develop a robust alternative to PCA which minimises the predictive reconstruction error in the presence of influential observations and outliers. In some applications there may be unobserved clusters in the data, for which fitting PCA models to subsets of the data would provide a better fit. This is known as the subspace clustering problem. We develop a novel algorithm for subspace clustering which iteratively fits PCA models to subsets of the data and assigns observations to clusters based on their predictive influence on the reconstruction error. We study the convergence of the algorithm and compare its performance to a number of subspace clustering methods on simulated data and in real applications from computer vision involving clustering object trajectories in video sequences and images of faces. We extend our predictive clustering framework to a setting where two high-dimensional views of the data have been obtained. Often, only clustering or only predictive modelling is performed between the views. Instead, we aim to recover clusters which are maximally predictive between the views. In this setting, two-block partial least squares (TB-PLS) is a useful model. TB-PLS performs dimensionality reduction in both views by estimating latent factors that are highly predictive. We fit TB-PLS models to subsets of the data and assign points to clusters based on their predictive influence under each model, evaluated using a PRESS statistic. We compare our method to state-of-the-art algorithms in real applications in webpage and document clustering and find that our approach to predictive clustering yields superior results. Finally, we propose a method for dynamically tracking multivariate data streams based on PLS. Our method learns a linear regression function from multivariate input and output streaming data in an incremental fashion while also performing dimensionality reduction and variable selection. Moreover, the recursive regression model is able to adapt to sudden changes in the data-generating mechanism and also identifies the number of latent factors. We apply our method to the enhanced index tracking problem in computational finance.
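    As a rough illustration of the quantity driving the predictive-influence measure, the sketch below computes the leave-one-out PCA reconstruction error by brute force; the thesis instead uses an analytic PRESS statistic that avoids refitting, and the rank k and the flagging rule here are assumptions.

        import numpy as np

        def loo_press_pca(X, k):
            """Naive leave-one-out PRESS for a rank-k PCA model: refit the
            model without each observation and measure how badly that
            observation is reconstructed. (The thesis uses an analytic
            shortcut; this brute-force version only illustrates the idea.)"""
            n, p = X.shape
            press = np.zeros(n)
            for i in range(n):
                Xi = np.delete(X, i, axis=0)
                mu = Xi.mean(axis=0)
                # Top-k principal directions of the data without point i.
                _, _, Vt = np.linalg.svd(Xi - mu, full_matrices=False)
                V = Vt[:k].T
                r = (X[i] - mu) - V @ (V.T @ (X[i] - mu))  # held-out residual
                press[i] = r @ r
            return press  # large entries flag influential observations / outliers

    Points whose held-out reconstruction error is unusually large are candidates for down-weighting in the robust PCA variant, or for reassignment in the predictive subspace clustering algorithm.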

    Jack of all trades, Master of None: The Trade-offs in Sparse PCA Methods for Diverse Purposes

    Sparse algorithms are becoming increasingly popular in data science research because they can identify and select the most relevant variables in a dataset while minimizing overfitting. However, sparse algorithms present unique challenges when dealing with social data, such as data integration (heterogeneity) and the need to account for complex social interactions and dynamics. Throughout this thesis, I focused on the sparse Principal Component Analysis (sPCA) problem. I explored and developed sPCA algorithms that can effectively identify and select the essential features in a dataset, reducing its dimensionality and uncovering the underlying factors in the data. Specifically, I examined sPCA methods that utilize sparsity-inducing penalties and cardinality constraints to achieve sparsity in the solution.
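    A minimal sketch of the cardinality-constrained flavour of sPCA mentioned here, using a truncated power iteration on the sample covariance (keep only the k largest-magnitude loadings at each step); this specific iteration is a well-known stand-in, assumed for illustration rather than taken from the thesis.

        import numpy as np

        def truncated_power_pca(S, k, n_iter=100, seed=0):
            """Cardinality-constrained sparse PCA sketch: power iteration on
            the (symmetric) sample covariance S, truncating the iterate to
            its k largest-magnitude entries at every step."""
            rng = np.random.default_rng(seed)
            p = S.shape[0]
            x = rng.standard_normal(p)
            x /= np.linalg.norm(x)
            for _ in range(n_iter):
                y = S @ x
                idx = np.argsort(np.abs(y))[:-k]  # all but the k largest entries
                y[idx] = 0.0
                x = y / np.linalg.norm(y)
            return x  # unit vector with at most k nonzero loadings

    The cardinality k directly controls interpretability, whereas penalty-based methods control it only indirectly through a regularization weight; that contrast is one of the trade-offs the thesis examines.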

    Anisotropic Adaptivity and Subgrid Scale Modelling for the Solution of the Neutron Transport Equation with an Emphasis on Shielding Applications

    This thesis demonstrates advanced new discretisation and adaptive meshing technologies that improve the accuracy and stability of finite element discretisations applied to the Boltzmann transport equation (BTE). This equation describes the advective transport of neutral particles such as neutrons and photons within a domain. The BTE is difficult to solve due to its large phase space (three dimensions of space, two of angle and one each of energy and time) and the presence of non-physical oscillations in many situations. This work explores the use of a finite element method that combines the advantages of two schemes: the discontinuous and continuous Galerkin methods. The new discretisation uses multiscale (subgrid) finite elements that work locally within each element of the finite element mesh, in addition to a global, continuous formulation. The use of higher-order functions that describe the variation of the angular flux over each element is also explored using these subgrid finite element schemes. In addition to the spatial discretisation, methods have been developed to optimise the finite element mesh in order to reduce the resulting errors in the solution over the domain, or locally where there is a goal of specific interest (such as the dose in a detector region). The chapters of this thesis have been structured to be submitted individually for journal publication, and are arranged as follows. Chapter 1 introduces the motivation behind the research contained within this thesis. Chapter 2 introduces the forms of the BTE that are used within this thesis. Chapter 3 presents the methods used, together with examples, for the validation and verification of the software developed as a result of this work, the transport code RADIANT. Chapter 4 introduces the inner-element subgrid scale finite element discretisation of the BTE that forms the basis of the discretisations within RADIANT and explores its convergence and computational times on a set of benchmark problems. Chapter 5 develops the error metrics used to optimise the mesh in order to reduce the discretisation error within a finite element mesh using anisotropic adaptivity, which can use elongated elements to accurately resolve computationally demanding regions, such as in the presence of shocks. The work of this chapter is extended in Chapter 6, which forms error metrics for goal-based adaptivity to minimise the error in a detector response. Finally, conclusions from this thesis and suggestions for future work are discussed in Chapter 7.
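    As a hedged illustration of how an anisotropic error metric can prescribe elongated elements, the sketch below builds a metric tensor from a local Hessian of the solution so that directional mesh spacings equidistribute an interpolation-error budget eps; the thesis derives its metrics from transport-specific and goal-based error estimates, so this plain interpolation bound is only a generic stand-in.

        import numpy as np

        def hessian_metric(H, eps, h_min=1e-4, h_max=1.0):
            """Hessian-based anisotropic metric sketch: given a 2x2 Hessian H
            of the solution at a point, return a symmetric positive-definite
            metric whose eigenvalues set the directional spacings so that the
            local interpolation error is ~eps in every direction."""
            evals, evecs = np.linalg.eigh(0.5 * (H + H.T))  # symmetrise first
            # Error along eigendirection i scales like |lambda_i| * h_i^2, so the
            # metric eigenvalue 1/h_i^2 should be |lambda_i| / eps.
            lam = np.abs(evals) / eps
            lam = np.clip(lam, 1.0 / h_max**2, 1.0 / h_min**2)  # bound spacings
            return evecs @ np.diag(lam) @ evecs.T

    Directions of low curvature receive small metric eigenvalues, i.e. large spacings, which is what permits the elongated elements mentioned above.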

    High Dimensional Separable Representations for Statistical Estimation and Controlled Sensing.

    This thesis makes contributions to a fundamental set of high-dimensional problems in the following areas: (1) performance bounds for high-dimensional estimation of structured Kronecker product covariance matrices, (2) optimal query design for a centralized collaborative controlled sensing system used for target localization, and (3) global convergence theory for decentralized controlled sensing systems. Separable approximations are effective dimensionality reduction techniques for high-dimensional problems. In multiple-modality and spatio-temporal signal processing, separable models for the underlying covariance are exploited for improved estimation accuracy and reduced computational complexity. In query-based controlled sensing, estimation performance is greatly improved at the expense of more demanding query design. Multi-agent controlled sensing systems for target localization consist of a set of agents that collaborate to estimate the location of an unknown target. In the centralized setting, for a large number of agents and/or high-dimensional targets, separable representations of the fusion center's query policies are exploited to maintain tractability. For large-scale sensor networks, decentralized estimation methods are of primary interest, under which agents obtain new noisy information as a function of their current belief and exchange local beliefs with their neighbors. Here, separable representations of the temporally evolving information state are exploited to improve robustness and scalability. The results improve upon the current state of the art.
    PhD, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/107110/1/ttsili_1.pd
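    For the Kronecker-structured covariance estimation in area (1), a standard estimator to which performance bounds of this kind apply is the flip-flop maximum-likelihood iteration for matrix-variate data; the sketch below is that generic routine under an assumed zero-mean matrix-normal model, not a procedure taken from the thesis.

        import numpy as np

        def flip_flop_kron_cov(X, n_iter=50, tol=1e-8):
            """Flip-flop MLE sketch for a separable covariance
            cov(vec(X_i)) = B kron A, given n samples X[i] of shape (p, q):
            alternately update the row factor A and the column factor B.
            Note A and B are identifiable only up to a scale trade-off
            (A -> cA, B -> B/c)."""
            n, p, q = X.shape
            A = np.eye(p)
            B = np.eye(q)
            for _ in range(n_iter):
                Ainv = np.linalg.inv(A)
                B = sum(Xi.T @ Ainv @ Xi for Xi in X) / (n * p)
                Binv = np.linalg.inv(B)
                A_new = sum(Xi @ Binv @ Xi.T for Xi in X) / (n * q)
                if np.linalg.norm(A_new - A) < tol * np.linalg.norm(A):
                    return A_new, B
                A = A_new
            return A, B  # row (p x p) and column (q x q) covariance factors

    Exploiting the p*q-by-p*q covariance through its p-by-p and q-by-q factors is exactly the dimensionality reduction that separable models buy.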

    Signals on Networks: Random Asynchronous and Multirate Processing, and Uncertainty Principles

    The processing of signals defined on graphs has been of interest for many years and finds applications in a diverse set of fields such as sensor networks, social and economic networks, and biological networks. In graph signal processing applications, signals are not defined as functions on a uniform time-domain grid; rather, they are defined as vectors indexed by the vertices of a graph, where the underlying graph is assumed to model the irregular signal domain. Although the analysis of such networked models is not new (it can be traced back to the consensus problem studied more than four decades ago), such models have recently been studied from the viewpoint of signal processing, in which the analysis is based on the "graph operator" whose eigenvectors serve as a Fourier basis for the graph of interest. With the help of the graph Fourier basis, a number of topics from classical signal processing (such as sampling, reconstruction, and filtering) have been extended to the case of graphs. The main contribution of this thesis is to provide new directions in the field of graph signal processing and further extensions of topics in classical signal processing.

    The first part of this thesis focuses on a random and asynchronous variant of the "graph shift," i.e., localized communication between neighboring nodes. Since the dynamical behavior of randomized asynchronous updates is very different from the standard graph shift (i.e., state-space models), this part of the thesis focuses on the convergence and stability behavior of such random asynchronous recursions. Although non-random variants of asynchronous state recursions (possibly with non-linear updates) are well-studied problems with early results dating back to the late 1960s, this thesis considers convergence (and stability) in the statistical mean-squared sense and presents precise conditions for stability by drawing parallels with switching systems. It is also shown that systems exhibit unexpected behavior under randomized asynchronicity: an unstable system (in the synchronous world) may be stabilized simply by the use of randomized asynchronicity. Moreover, randomized asynchronicity may result in a lower total computational complexity in certain parameter settings. The thesis presents applications of the random asynchronous model in the context of graph signal processing, including autonomous clustering of a network of agents and a node-asynchronous communication protocol that implements a given rational filter on the graph.

    The second part of the thesis focuses on extensions of the following topics in classical signal processing to the case of graphs: multirate processing and filter banks, discrete uncertainty principles, and energy compaction filters for optimal filter design. The thesis also considers an application to heat diffusion over networks. Multirate systems and filter banks find many applications in signal processing theory and implementations. Despite the possibility of extending 2-channel filter banks to bipartite graphs, this thesis shows that this relation cannot be generalized to M-channel systems on M-partite graphs. As a result, the extension of classical multirate theory to graphs is nontrivial, and such extensions cannot be obtained without certain mathematical restrictions on the graph. The thesis provides the necessary conditions on the graph such that the fundamental building blocks of multirate processing remain valid in the graph domain. In particular, it is shown that when the underlying graph satisfies a condition called the M-block cyclic property, classical multirate theory can be extended to graphs. The uncertainty principle is an essential mathematical concept in science and engineering, and uncertainty principles generally state that a signal cannot have an arbitrarily "short" description in the original basis and in the Fourier basis simultaneously. Based on the fact that graph signal processing proposes two different bases (i.e., the vertex and graph Fourier domains) to represent graph signals, this thesis shows that the total number of nonzero elements of a graph signal and its representation in the graph Fourier domain is lower bounded by a quantity depending on the underlying graph. The thesis also presents the necessary and sufficient condition for the existence of 2-sparse and 3-sparse eigenvectors of a connected graph. When such eigenvectors exist, the uncertainty bound is very low, tight, and independent of the global structure of the graph. The thesis also considers the classical spectral concentration problem. In the context of polynomial graph filters, the problem reduces to the polynomial concentration problem studied more generally by Slepian in the 1970s. The thesis studies the asymptotic behavior of the optimal solution in the case of narrow bandwidth. Different examples of graphs are also compared in order to show that the maximum energy compaction and the optimal filter depend heavily on the graph spectrum.

    In the last part, the thesis considers the estimation of the starting time of a heat diffusion process from its noisy measurements, when there is a single point source located on a known vertex of a graph with unknown starting time. In particular, the Cramér-Rao lower bound for the estimation problem is derived, and it is shown that for graphs with higher connectivity the problem has a larger lower bound, making the estimation problem more difficult.
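    A minimal sketch of the random asynchronous graph-shift model from the first part, under the assumption that each node fires independently with probability p_update per iteration; the thesis's precise update model and its mean-square stability conditions are more general than this Bernoulli scheme.

        import numpy as np

        def random_async_shift(A, x0, p_update=0.5, n_iter=1000, seed=0):
            """Random asynchronous graph shift sketch: at every iteration each
            node independently adopts its shifted value (A x) with probability
            p_update, otherwise keeps its old value."""
            rng = np.random.default_rng(seed)
            x = x0.astype(float).copy()
            for _ in range(n_iter):
                mask = rng.random(x.size) < p_update  # nodes that fire this round
                x_new = A @ x                         # full synchronous shift
                x[mask] = x_new[mask]                 # only firing nodes adopt it
            return x

    Whether this recursion is stable in the mean-squared sense depends jointly on the spectrum of A and on p_update; as the abstract notes, a recursion that is unstable when every node fires (p_update = 1) may become stable under randomized asynchronous updates.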