5,338 research outputs found

    Sparse Principal Component Analysis via Rotation and Truncation

    Sparse principal component analysis (sparse PCA) aims at finding a sparse basis that improves interpretability over the dense basis of PCA, while still covering the data subspace as much as possible. In contrast to most existing work, which addresses the problem by adding sparsity penalties to various PCA objectives, in this paper we propose a new method, SPCArt, whose motivation is to find a rotation matrix and a sparse basis such that the sparse basis approximates the PCA basis after the rotation. The algorithm of SPCArt consists of three alternating steps: rotate the PCA basis, truncate small entries, and update the rotation matrix. Performance bounds are also given. SPCArt is efficient, with each iteration scaling linearly in the data dimension. Its parameters are easy to choose, owing to their explicit physical interpretations. Besides, we give a unified view of several existing sparse PCA methods and discuss their connection with SPCArt. Some ideas in SPCArt are extended to GPower, a popular sparse PCA algorithm, to overcome its drawback. Experimental results demonstrate that SPCArt achieves state-of-the-art performance and a good tradeoff among various criteria, including sparsity, explained variance, orthogonality, balance of sparsity among loadings, and computational speed.
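    The three alternating steps invite a compact sketch. Below is a minimal NumPy illustration of a rotate/truncate/update loop in the spirit of SPCArt; the hard-threshold level, the fixed iteration count, and the column renormalisation are simplifying assumptions, not the paper's parameter choices.

```python
import numpy as np

def spcart_sketch(V, threshold=0.1, n_iter=50):
    """Alternate rotate / truncate / rotation-update steps.

    V : (p, r) matrix whose columns are the leading PCA loadings.
    Returns a sparse basis X of the same shape. Hard thresholding with a
    fixed level and column renormalisation are simplifying assumptions,
    not the paper's parameter choices.
    """
    r = V.shape[1]
    R = np.eye(r)                                    # current rotation
    X = V.copy()
    for _ in range(n_iter):
        Y = V @ R                                    # rotate the PCA basis
        X = np.where(np.abs(Y) > threshold, Y, 0.0)  # truncate small entries
        norms = np.linalg.norm(X, axis=0)
        X = X / np.where(norms > 0, norms, 1.0)      # renormalise columns
        U, _, Wt = np.linalg.svd(V.T @ X)            # Procrustes-style rotation update
        R = U @ Wt
    return X
```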

    On the Worst-Case Approximability of Sparse PCA

    It is well known that Sparse PCA (Sparse Principal Component Analysis) is NP-hard to solve exactly on worst-case instances. What is the complexity of solving Sparse PCA approximately? Our contributions include: 1) a simple and efficient algorithm that achieves an $n^{-1/3}$-approximation; 2) NP-hardness of approximation to within $(1-\varepsilon)$, for some small constant $\varepsilon > 0$; 3) SSE-hardness of approximation to within any constant factor; and 4) an $\exp\exp\left(\Omega\left(\sqrt{\log \log n}\right)\right)$ ("quasi-quasi-polynomial") gap for the standard semidefinite program. Comment: 20 pages
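    For reference, the optimization problem whose approximability is discussed can be stated as below; the notation (covariance-like matrix A, sparsity level k) is assumed here rather than quoted from the paper.

```latex
% A standard sparse PCA formulation (notation assumed, not quoted from the paper):
\[
  \mathrm{OPT}(A,k) \;=\; \max_{x \in \mathbb{R}^n} \; x^\top A x
  \quad \text{s.t.} \quad \|x\|_2 = 1, \;\; \|x\|_0 \le k .
\]
% An algorithm is a rho-approximation if it always returns a feasible x-hat with
\[
  \hat{x}^\top A \hat{x} \;\ge\; \rho \cdot \mathrm{OPT}(A,k),
  \qquad \text{e.g. } \rho = n^{-1/3} \text{ in contribution (1) above.}
\]
```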

    Sparse eigenbasis approximation: multiple feature extraction across spatiotemporal scales with application to coherent set identification

    The output of spectral clustering is a collection of eigenvalues and eigenvectors that encode important connectivity information about a graph or a manifold. This connectivity information is often not cleanly represented in the eigenvectors and must be disentangled by some secondary procedure. We propose the use of an approximate sparse basis for the space spanned by the leading eigenvectors as a natural, robust, and efficient means of performing this separation. The use of sparsity yields a natural cutoff in this disentanglement procedure and is particularly useful in practical situations when there is no clear eigengap. In order to select a suitable collection of vectors, we develop a new Weyl-inspired eigengap heuristic and heuristics based on the sparse basis vectors. We develop an automated eigenvector separation procedure and illustrate its efficacy on examples from time-dependent dynamics on manifolds. In this context, transfer operator approaches are extensively used to find dynamically disconnected regions of phase space, known as almost-invariant sets or coherent sets. The dominant eigenvectors of transfer operators or related operators, such as the dynamic Laplacian, encode dynamic connectivity information. Our sparse eigenbasis approximation (SEBA) methodology streamlines the final stage of transfer operator methods, namely the extraction of almost-invariant or coherent sets from the eigenvectors. It is particularly useful on domains with large numbers of coherent sets, and when the coherent sets do not exhaust the phase space, as in large geophysical datasets.
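    As a concrete illustration of the final extraction step, the sketch below assumes a sparse basis S approximately spanning the leading eigenspace has already been computed, and assigns each grid cell to at most one coherent set; the threshold rule and the helper name extract_sets are illustrative, not SEBA's exact post-processing.

```python
import numpy as np

def extract_sets(S, tau=0.5):
    """Assign each row (grid cell / trajectory point) to at most one feature.

    S   : (n, r) sparse basis whose columns approximately span the leading
          eigenspace, with entries scaled so memberships lie roughly in [0, 1].
    tau : membership threshold (an illustrative choice, not SEBA's rule).
    Returns one integer label per row; -1 means 'in no coherent set'.
    """
    S = np.maximum(S, 0.0)                        # keep non-negative support only
    best = np.argmax(S, axis=1)                   # strongest feature per row
    strength = S[np.arange(S.shape[0]), best]
    return np.where(strength >= tau, best, -1)
```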

    Spectral Sparse Representation for Clustering: Evolved from PCA, K-means, Laplacian Eigenmap, and Ratio Cut

    Dimensionality reduction, cluster analysis, and sparse representation are basic components in machine learning. However, their relationships have not yet been fully investigated. In this paper, we find that spectral graph theory underlies a series of these elementary methods and can unify them into a complete framework. The methods include PCA, K-means, Laplacian eigenmap (LE), ratio cut (Rcut), and a new sparse representation method developed by us, called spectral sparse representation (SSR). Further, extended relations to conventional over-complete sparse representations (e.g., method of optimal directions, KSVD), manifold learning (e.g., kernel PCA, multidimensional scaling, Isomap, locally linear embedding), and subspace clustering (e.g., sparse subspace clustering, low-rank representation) are incorporated. We show that, under an ideal condition from spectral graph theory, PCA, K-means, LE, and Rcut are unified; when the condition is relaxed, the unification evolves into SSR, which lies intermediate between PCA/LE and K-means/Rcut. An efficient algorithm, NSCrt, is developed to solve for the sparse codes of SSR. SSR combines the merits of both sides: its sparse codes reduce the dimensionality of the data while revealing cluster structure. Owing to its inherent relation to cluster analysis, the codes of SSR can be used directly for clustering. Scut, a clustering approach derived from SSR, reaches state-of-the-art performance in the spectral clustering family. The one-shot solution obtained by Scut is comparable to the optimal result of K-means run many times. Experiments on various data sets demonstrate the properties and strengths of SSR, NSCrt, and Scut.
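    The claim that the sparse codes can be used directly for clustering admits a one-line illustration, sketched below under the assumption that some sparse-coding step has already produced a code matrix Z; this is not the NSCrt solver or the full Scut procedure.

```python
import numpy as np

def labels_from_sparse_codes(Z):
    """Read cluster labels directly off sparse codes.

    Z : (k, n) array of sparse codes, one column per sample, produced by some
        sparse-coding step (the NSCrt solver itself is not reproduced here).
    Assigning each sample to its largest-magnitude code illustrates the
    'sparse codes reveal cluster structure' idea; it is not the full Scut
    procedure from the paper.
    """
    return np.argmax(np.abs(Z), axis=0)
```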

    A Fast deflation Method for Sparse Principal Component Analysis via Subspace Projections

    The implementation of conventional sparse principal component analysis (SPCA) on high-dimensional data sets has become time consuming. In this paper, a series of subspace projections is constructed efficiently using Householder QR factorization. With the aid of these subspace projections, a fast deflation method, called SPCA-SP, is developed for SPCA. This method keeps a good tradeoff among various criteria, including sparsity, orthogonality, explained variance, balance of sparsity, and computational cost. Comparative experiments on benchmark data sets confirm the effectiveness of the proposed method. Comment: 4 figures, 2 tables
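    To make the deflation idea concrete, the sketch below removes one already-computed (sparse) loading from the data with a Householder reflector; this is one standard way to realise such a projection and is not claimed to be the paper's SPCA-SP routine.

```python
import numpy as np

def householder_deflate(X, w):
    """Project the data onto the orthogonal complement of a unit-norm loading w.

    X : (n, p) data matrix, w : (p,) unit-norm (sparse) loading.
    A Householder reflector H maps w onto a coordinate axis; applying H to the
    features and dropping that coordinate removes the direction already
    explained. One standard way to realise deflation with Householder / QR
    machinery, given only as a sketch.
    """
    v = np.array(w, dtype=float)
    v[0] += np.copysign(np.linalg.norm(w), v[0] if v[0] != 0 else 1.0)
    v /= np.linalg.norm(v)
    H = np.eye(w.shape[0]) - 2.0 * np.outer(v, v)  # reflector with H @ w = -sign(w_0) * ||w|| * e_1
    return (X @ H)[:, 1:]                          # drop the component along w
```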

    Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview

    Substantial progress has been made recently on developing provably accurate and efficient algorithms for low-rank matrix factorization via nonconvex optimization. While conventional wisdom often takes a dim view of nonconvex optimization algorithms due to their susceptibility to spurious local minima, simple iterative methods such as gradient descent have been remarkably successful in practice. The theoretical footings, however, had been largely lacking until recently. In this tutorial-style overview, we highlight the important role of statistical models in enabling efficient nonconvex optimization with performance guarantees. We review two contrasting approaches: (1) two-stage algorithms, which consist of a tailored initialization step followed by successive refinement; and (2) global landscape analysis and initialization-free algorithms. Several canonical matrix factorization problems are discussed, including but not limited to matrix sensing, phase retrieval, matrix completion, blind deconvolution, robust principal component analysis, phase synchronization, and joint alignment. Special care is taken to illustrate the key technical insights underlying their analyses. This article serves as a testament that the integrated consideration of optimization and statistics leads to fruitful research findings. Comment: Invited overview article
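    As a toy instance of the "simple iterative methods" discussed above, the sketch below runs plain gradient descent on the factorized matrix-completion objective; the random initialization and fixed step size are arbitrary choices, not the tuned schemes analysed in the overview.

```python
import numpy as np

def factored_gd_completion(M_obs, mask, r, step=0.01, n_iter=500, seed=0):
    """Gradient descent on f(U, V) = 0.5 * || mask * (U V^T - M_obs) ||_F^2.

    M_obs : (m, n) observed matrix (zeros where unobserved),
    mask  : (m, n) boolean/0-1 array of observed entries,
    r     : target rank.
    A toy sketch of nonconvex matrix completion; no spectral initialization,
    balancing regularization, or step-size theory from the overview is included.
    """
    m, n = M_obs.shape
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((m, r)) / np.sqrt(m)
    V = rng.standard_normal((n, r)) / np.sqrt(n)
    for _ in range(n_iter):
        R = mask * (U @ V.T - M_obs)                          # residual on observed entries
        U, V = U - step * (R @ V), V - step * (R.T @ U)       # simultaneous gradient step
    return U, V
```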

    Why (and How) Avoid Orthogonal Procrustes in Regularized Multivariate Analysis

    Multivariate Analysis (MVA) comprises a family of well-known methods for feature extraction that exploit correlations among the input variables of a data representation. One important property enjoyed by most such methods is uncorrelation among the extracted features. Recently, regularized versions of MVA methods have appeared in the literature, mainly with the goal of gaining interpretability of the solution. In these cases, the solutions can no longer be obtained in closed form, and it is common to resort to iterating two steps, one of them being an orthogonal Procrustes problem. This letter shows that the Procrustes solution is not optimal from the perspective of the overall MVA method, and proposes an alternative approach based on the solution of an eigenvalue problem. Our method ensures the preservation of several properties of the original methods, most notably the uncorrelation of the extracted features, as demonstrated theoretically and through a collection of selected experiments. Comment: 9 pages; added acknowledgment
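    For context, the orthogonal Procrustes step referred to above has the classical SVD closed form sketched below; the letter's point is that plugging this closed form into the two-step iteration is suboptimal for the overall regularized MVA objective.

```python
import numpy as np

def orthogonal_procrustes(A, B):
    """Solve min_Q ||A - B Q||_F over orthogonal Q (classical closed form).

    The minimiser is Q = U V^T, where U S V^T is the SVD of B^T A.
    This is the generic Procrustes step that regularized MVA iterations
    often use; the letter replaces it with an eigenvalue-problem-based update.
    """
    U, _, Vt = np.linalg.svd(B.T @ A)
    return U @ Vt
```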

    Optimal linear estimation under unknown nonlinear transform

    Linear regression studies the problem of estimating a model parameter $\beta^* \in \mathbb{R}^p$ from $n$ observations $\{(y_i,\mathbf{x}_i)\}_{i=1}^n$ drawn from the linear model $y_i = \langle \mathbf{x}_i, \beta^* \rangle + \epsilon_i$. We consider a significant generalization in which the relationship between $\langle \mathbf{x}_i, \beta^* \rangle$ and $y_i$ is noisy, quantized to a single bit, potentially nonlinear, noninvertible, as well as unknown. This model is known as the single-index model in statistics and, among other things, it represents a significant generalization of one-bit compressed sensing. We propose a novel spectral-based estimation procedure and show that we can recover $\beta^*$ in settings (i.e., classes of link function $f$) where previous algorithms fail. In general, our algorithm requires only very mild restrictions on the (unknown) functional relationship between $y_i$ and $\langle \mathbf{x}_i, \beta^* \rangle$. We also consider the high-dimensional setting where $\beta^*$ is sparse, and introduce a two-stage nonconvex framework that addresses estimation challenges in high-dimensional regimes where $p \gg n$. For a broad class of link functions between $\langle \mathbf{x}_i, \beta^* \rangle$ and $y_i$, we establish minimax lower bounds that demonstrate the optimality of our estimators in both the classical and high-dimensional regimes. Comment: 25 pages, 3 figures
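    To illustrate what a spectral estimator for a single-index model can look like, the sketch below takes the leading eigenvector of a y-weighted second-moment matrix; this is a generic recipe shown only to convey the idea, not necessarily the paper's estimator or its sparse two-stage variant.

```python
import numpy as np

def spectral_direction_estimate(y, X):
    """Estimate the direction of beta* in a single-index model y_i ~ f(<x_i, beta*>).

    Forms the weighted second-moment matrix M = (1/n) * sum_i y_i x_i x_i^T and
    returns its leading eigenvector. A generic spectral recipe for single-index
    models, shown only to illustrate the idea of spectral estimation; the sign
    and scale of beta* are not identified by this step.
    """
    n, p = X.shape
    M = (X * y[:, None]).T @ X / n
    eigvals, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, np.argmax(np.abs(eigvals))]   # direction only
```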

    Implementing smooth functions of a Hermitian matrix on a quantum computer

    We review existing methods for implementing smooth functions $f(A)$ of a sparse Hermitian matrix $A$ on a quantum computer, and analyse a further combination of these techniques which offers some advantages in simplicity and resource consumption in certain cases. Our construction uses the linear combination of unitaries method with Chebyshev polynomial approximations. The query complexity we obtain is $O(\log C/\epsilon)$, where $\epsilon$ is the approximation precision and $C>0$ is an upper bound on the magnitudes of the derivatives of the function $f$ over the domain of interest. The success probability depends on the 1-norm of the Taylor series coefficients of $f$, the sparsity $d$ of the matrix, and inversely on the smallest singular value of the target matrix $f(A)$. Comment: 16 pages
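    The Chebyshev ingredient can be illustrated classically: approximate f on [-1, 1] by a truncated Chebyshev series and evaluate the matrix polynomial by the three-term recurrence, as sketched below. This is purely a classical analogue of the approximation step, not the linear-combination-of-unitaries circuit; the degree and the use of NumPy's chebfit are assumptions of the sketch.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def cheb_matrix_function(f, A, degree=30):
    """Approximate f(A) for Hermitian A whose spectrum is scaled into [-1, 1].

    Fits a degree-`degree` Chebyshev approximation of f on [-1, 1], then
    evaluates sum_k c_k T_k(A) via T_{k+1}(A) = 2 A T_k(A) - T_{k-1}(A).
    Classical illustration only; the paper realises the polynomial with a
    linear combination of unitaries on a quantum computer.
    """
    x = np.cos(np.pi * (np.arange(degree + 1) + 0.5) / (degree + 1))  # Chebyshev nodes
    c = C.chebfit(x, f(x), degree)                                     # series coefficients
    n = A.shape[0]
    T_prev, T_curr = np.eye(n), A.copy()
    F = c[0] * T_prev + c[1] * T_curr
    for k in range(2, degree + 1):
        T_prev, T_curr = T_curr, 2.0 * A @ T_curr - T_prev
        F = F + c[k] * T_curr
    return F
```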

    Matrix Equations, Sparse Solvers: M-M.E.S.S.-2.0.1 -- Philosophy, Features and Application for (Parametric) Model Order Reduction

    Matrix equations are omnipresent in (numerical) linear algebra and systems theory. Especially in model order reduction (MOR), they play a key role in many balancing-based reduction methods for linear dynamical systems. When these systems arise from spatial discretizations of evolutionary partial differential equations, their coefficient matrices are typically large and sparse. Moreover, the numbers of inputs and outputs of these systems are typically far smaller than the number of spatial degrees of freedom. Then, in many situations, the solutions of the corresponding large-scale matrix equations are observed to have low (numerical) rank. This feature is exploited by M-M.E.S.S. to find successively larger low-rank factorizations approximating the solutions. This contribution describes the basic philosophy behind the implementation and the features of the package, as well as its application in the model order reduction of large-scale linear time-invariant (LTI) systems and parametric LTI systems. Comment: 18 pages, 4 figures, 5 tables
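    The low-rank phenomenon that M-M.E.S.S. exploits is easy to observe on a toy problem: solve a small Lyapunov equation A X + X A^T + B B^T = 0 for a discretized 1D Laplacian A and a two-column B, then inspect the numerical rank of X. The sketch below uses SciPy's dense solver purely to exhibit the effect; M-M.E.S.S. itself computes low-rank factors directly without ever forming X.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

n = 400
main, off = -2.0 * np.ones(n), np.ones(n - 1)
A = (n + 1) ** 2 * (np.diag(main) + np.diag(off, 1) + np.diag(off, -1))  # stable 1D Laplacian
B = np.zeros((n, 2))
B[0, 0] = B[-1, 1] = 1.0                       # two inputs -> rank-2 right-hand side B B^T

X = solve_continuous_lyapunov(A, -B @ B.T)     # solves A X + X A^T = -B B^T
s = np.linalg.svd(X, compute_uv=False)
num_rank = int(np.sum(s > 1e-10 * s[0]))
print(f"numerical rank of X: {num_rank} out of {n}")   # far smaller than n
```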