5,338 research outputs found

    Sparse Principal Component Analysis via Rotation and Truncation

    Sparse principal component analysis (sparse PCA) aims at finding a sparse basis that improves interpretability over the dense basis of PCA, while still covering the data subspace as much as possible. In contrast to most existing work, which addresses the problem by adding sparsity penalties to various PCA objectives, in this paper we propose a new method, SPCArt, whose motivation is to find a rotation matrix and a sparse basis such that the sparse basis approximates the PCA basis after the rotation. The algorithm of SPCArt consists of three alternating steps: rotate the PCA basis, truncate small entries, and update the rotation matrix. Performance bounds are also given. SPCArt is efficient, with each iteration scaling linearly in the data dimension. Its parameters are easy to choose, owing to their explicit physical interpretations. Besides, we give a unified view of several existing sparse PCA methods and discuss their connection with SPCArt. Some ideas in SPCArt are extended to GPower, a popular sparse PCA algorithm, to overcome its drawback. Experimental results demonstrate that SPCArt achieves state-of-the-art performance and a good tradeoff among various criteria, including sparsity, explained variance, orthogonality, balance of sparsity among loadings, and computational speed.
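    The three alternating steps invite a compact sketch. Below is a minimal NumPy illustration of a rotate/truncate/update loop in the spirit of SPCArt; the hard-threshold level, the fixed iteration count, and the column renormalisation are simplifying assumptions, not the paper's parameter choices.

```python
import numpy as np

def spcart_sketch(V, threshold=0.1, n_iter=50):
    """Alternate rotate / truncate / rotation-update steps.

    V : (p, r) matrix whose columns are the leading PCA loadings.
    Returns a sparse basis X of the same shape. Hard thresholding with a
    fixed level and column renormalisation are simplifying assumptions,
    not the paper's parameter choices.
    """
    r = V.shape[1]
    R = np.eye(r)                                    # current rotation
    X = V.copy()
    for _ in range(n_iter):
        Y = V @ R                                    # rotate the PCA basis
        X = np.where(np.abs(Y) > threshold, Y, 0.0)  # truncate small entries
        norms = np.linalg.norm(X, axis=0)
        X = X / np.where(norms > 0, norms, 1.0)      # renormalise columns
        U, _, Wt = np.linalg.svd(V.T @ X)            # Procrustes-style rotation update
        R = U @ Wt
    return X
```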

    On the Worst-Case Approximability of Sparse PCA

    It is well known that Sparse PCA (Sparse Principal Component Analysis) is NP-hard to solve exactly on worst-case instances. What is the complexity of solving Sparse PCA approximately? Our contributions include: 1) a simple and efficient algorithm that achieves an $n^{-1/3}$-approximation; 2) NP-hardness of approximation to within $(1-\varepsilon)$, for some small constant $\varepsilon > 0$; 3) SSE-hardness of approximation to within any constant factor; and 4) an $\exp\exp\left(\Omega\left(\sqrt{\log \log n}\right)\right)$ ("quasi-quasi-polynomial") gap for the standard semidefinite program. Comment: 20 pages
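    For reference, the optimization problem whose approximability is discussed can be stated as below; the notation (covariance-like matrix A, sparsity level k) is assumed here rather than quoted from the paper.

```latex
% A standard sparse PCA formulation (notation assumed, not quoted from the paper):
\[
  \mathrm{OPT}(A,k) \;=\; \max_{x \in \mathbb{R}^n} \; x^\top A x
  \quad \text{s.t.} \quad \|x\|_2 = 1, \;\; \|x\|_0 \le k .
\]
% An algorithm is a rho-approximation if it always returns a feasible x-hat with
\[
  \hat{x}^\top A \hat{x} \;\ge\; \rho \cdot \mathrm{OPT}(A,k),
  \qquad \text{e.g. } \rho = n^{-1/3} \text{ in contribution (1) above.}
\]
```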

    Sparse eigenbasis approximation: multiple feature extraction across spatiotemporal scales with application to coherent set identification

    The output of spectral clustering is a collection of eigenvalues and eigenvectors that encode important connectivity information about a graph or a manifold. This connectivity information is often not cleanly represented in the eigenvectors and must be disentangled by some secondary procedure. We propose the use of an approximate sparse basis for the space spanned by the leading eigenvectors as a natural, robust, and efficient means of performing this separation. The use of sparsity yields a natural cutoff in this disentanglement procedure and is particularly useful in practical situations when there is no clear eigengap. In order to select a suitable collection of vectors, we develop a new Weyl-inspired eigengap heuristic and heuristics based on the sparse basis vectors. We develop an automated eigenvector separation procedure and illustrate its efficacy on examples from time-dependent dynamics on manifolds. In this context, transfer operator approaches are extensively used to find dynamically disconnected regions of phase space, known as almost-invariant sets or coherent sets. The dominant eigenvectors of transfer operators or related operators, such as the dynamic Laplacian, encode dynamic connectivity information. Our sparse eigenbasis approximation (SEBA) methodology streamlines the final stage of transfer operator methods, namely the extraction of almost-invariant or coherent sets from the eigenvectors. It is particularly useful on domains with large numbers of coherent sets, and when the coherent sets do not exhaust the phase space, as in large geophysical datasets.
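    As a concrete illustration of the final extraction step, the sketch below assumes a sparse basis S approximately spanning the leading eigenspace has already been computed, and assigns each grid cell to at most one coherent set; the threshold rule and the helper name extract_sets are illustrative, not SEBA's exact post-processing.

```python
import numpy as np

def extract_sets(S, tau=0.5):
    """Assign each row (grid cell / trajectory point) to at most one feature.

    S   : (n, r) sparse basis whose columns approximately span the leading
          eigenspace, with entries scaled so memberships lie roughly in [0, 1].
    tau : membership threshold (an illustrative choice, not SEBA's rule).
    Returns one integer label per row; -1 means 'in no coherent set'.
    """
    S = np.maximum(S, 0.0)                        # keep non-negative support only
    best = np.argmax(S, axis=1)                   # strongest feature per row
    strength = S[np.arange(S.shape[0]), best]
    return np.where(strength >= tau, best, -1)
```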

    Spectral Sparse Representation for Clustering: Evolved from PCA, K-means, Laplacian Eigenmap, and Ratio Cut

    Dimensionality reduction, cluster analysis, and sparse representation are basic components in machine learning. However, their relationships have not yet been fully investigated. In this paper, we find that spectral graph theory underlies a series of these elementary methods and can unify them into a complete framework. The methods include PCA, K-means, Laplacian eigenmap (LE), ratio cut (Rcut), and a new sparse representation method developed by us, called spectral sparse representation (SSR). Further, extended relations to conventional over-complete sparse representations (e.g., method of optimal directions, KSVD), manifold learning (e.g., kernel PCA, multidimensional scaling, Isomap, locally linear embedding), and subspace clustering (e.g., sparse subspace clustering, low-rank representation) are incorporated. We show that, under an ideal condition from spectral graph theory, PCA, K-means, LE, and Rcut are unified; when the condition is relaxed, the unification evolves into SSR, which lies intermediate between PCA/LE and K-means/Rcut. An efficient algorithm, NSCrt, is developed to solve for the sparse codes of SSR. SSR combines the merits of both sides: its sparse codes reduce the dimensionality of the data while revealing cluster structure. Owing to its inherent relation to cluster analysis, the codes of SSR can be used directly for clustering. Scut, a clustering approach derived from SSR, reaches state-of-the-art performance in the spectral clustering family. The one-shot solution obtained by Scut is comparable to the optimal result of K-means run many times. Experiments on various data sets demonstrate the properties and strengths of SSR, NSCrt, and Scut.
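    The claim that the sparse codes can be used directly for clustering admits a one-line illustration, sketched below under the assumption that some sparse-coding step has already produced a code matrix Z; this is not the NSCrt solver or the full Scut procedure.

```python
import numpy as np

def labels_from_sparse_codes(Z):
    """Read cluster labels directly off sparse codes.

    Z : (k, n) array of sparse codes, one column per sample, produced by some
        sparse-coding step (the NSCrt solver itself is not reproduced here).
    Assigning each sample to its largest-magnitude code illustrates the
    'sparse codes reveal cluster structure' idea; it is not the full Scut
    procedure from the paper.
    """
    return np.argmax(np.abs(Z), axis=0)
```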

    A Fast deflation Method for Sparse Principal Component Analysis via Subspace Projections

    The implementation of conventional sparse principal component analysis (SPCA) on high-dimensional data sets has become time consuming. In this paper, a series of subspace projections is constructed efficiently using Householder QR factorization. With the aid of these subspace projections, a fast deflation method, called SPCA-SP, is developed for SPCA. This method keeps a good tradeoff among various criteria, including sparsity, orthogonality, explained variance, balance of sparsity, and computational cost. Comparative experiments on benchmark data sets confirm the effectiveness of the proposed method. Comment: 4 figures, 2 tables
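    To make the deflation idea concrete, the sketch below removes one already-computed (sparse) loading from the data with a Householder reflector; this is one standard way to realise such a projection and is not claimed to be the paper's SPCA-SP routine.

```python
import numpy as np

def householder_deflate(X, w):
    """Project the data onto the orthogonal complement of a unit-norm loading w.

    X : (n, p) data matrix, w : (p,) unit-norm (sparse) loading.
    A Householder reflector H maps w onto a coordinate axis; applying H to the
    features and dropping that coordinate removes the direction already
    explained. One standard way to realise deflation with Householder / QR
    machinery, given only as a sketch.
    """
    v = np.array(w, dtype=float)
    v[0] += np.copysign(np.linalg.norm(w), v[0] if v[0] != 0 else 1.0)
    v /= np.linalg.norm(v)
    H = np.eye(w.shape[0]) - 2.0 * np.outer(v, v)  # reflector with H @ w = -sign(w_0) * ||w|| * e_1
    return (X @ H)[:, 1:]                          # drop the component along w
```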

    Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview

    Substantial progress has been made recently on developing provably accurate and efficient algorithms for low-rank matrix factorization via nonconvex optimization. While conventional wisdom often takes a dim view of nonconvex optimization algorithms due to their susceptibility to spurious local minima, simple iterative methods such as gradient descent have been remarkably successful in practice. The theoretical footings, however, had been largely lacking until recently. In this tutorial-style overview, we highlight the important role of statistical models in enabling efficient nonconvex optimization with performance guarantees. We review two contrasting approaches: (1) two-stage algorithms, which consist of a tailored initialization step followed by successive refinement; and (2) global landscape analysis and initialization-free algorithms. Several canonical matrix factorization problems are discussed, including but not limited to matrix sensing, phase retrieval, matrix completion, blind deconvolution, robust principal component analysis, phase synchronization, and joint alignment. Special care is taken to illustrate the key technical insights underlying their analyses. This article serves as a testament that the integrated consideration of optimization and statistics leads to fruitful research findings. Comment: Invited overview article
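    As a toy instance of the "simple iterative methods" discussed above, the sketch below runs plain gradient descent on the factorized matrix-completion objective; the random initialization and fixed step size are arbitrary choices, not the tuned schemes analysed in the overview.

```python
import numpy as np

def factored_gd_completion(M_obs, mask, r, step=0.01, n_iter=500, seed=0):
    """Gradient descent on f(U, V) = 0.5 * || mask * (U V^T - M_obs) ||_F^2.

    M_obs : (m, n) observed matrix (zeros where unobserved),
    mask  : (m, n) boolean/0-1 array of observed entries,
    r     : target rank.
    A toy sketch of nonconvex matrix completion; no spectral initialization,
    balancing regularization, or step-size theory from the overview is included.
    """
    m, n = M_obs.shape
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((m, r)) / np.sqrt(m)
    V = rng.standard_normal((n, r)) / np.sqrt(n)
    for _ in range(n_iter):
        R = mask * (U @ V.T - M_obs)                          # residual on observed entries
        U, V = U - step * (R @ V), V - step * (R.T @ U)       # simultaneous gradient step
    return U, V
```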

    Why (and How) Avoid Orthogonal Procrustes in Regularized Multivariate Analysis

    Multivariate Analysis (MVA) comprises a family of well-known methods for feature extraction that exploit correlations among the input variables of a data representation. One important property enjoyed by most such methods is uncorrelation among the extracted features. Recently, regularized versions of MVA methods have appeared in the literature, mainly with the goal of gaining interpretability of the solution. In these cases, the solutions can no longer be obtained in closed form, and it is common to resort to iterating two steps, one of them being an orthogonal Procrustes problem. This letter shows that the Procrustes solution is not optimal from the perspective of the overall MVA method, and proposes an alternative approach based on the solution of an eigenvalue problem. Our method ensures the preservation of several properties of the original methods, most notably the uncorrelation of the extracted features, as demonstrated theoretically and through a collection of selected experiments. Comment: 9 pages; added acknowledgment
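    For context, the orthogonal Procrustes step referred to above has the classical SVD closed form sketched below; the letter's point is that plugging this closed form into the two-step iteration is suboptimal for the overall regularized MVA objective.

```python
import numpy as np

def orthogonal_procrustes(A, B):
    """Solve min_Q ||A - B Q||_F over orthogonal Q (classical closed form).

    The minimiser is Q = U V^T, where U S V^T is the SVD of B^T A.
    This is the generic Procrustes step that regularized MVA iterations
    often use; the letter replaces it with an eigenvalue-problem-based update.
    """
    U, _, Vt = np.linalg.svd(B.T @ A)
    return U @ Vt
```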

    Optimal linear estimation under unknown nonlinear transform

    Linear regression studies the problem of estimating a model parameter $\beta^* \in \mathbb{R}^p$ from $n$ observations $\{(y_i,\mathbf{x}_i)\}_{i=1}^n$ drawn from the linear model $y_i = \langle \mathbf{x}_i, \beta^* \rangle + \epsilon_i$. We consider a significant generalization in which the relationship between $\langle \mathbf{x}_i, \beta^* \rangle$ and $y_i$ is noisy, quantized to a single bit, potentially nonlinear, noninvertible, as well as unknown. This model is known as the single-index model in statistics and, among other things, it represents a significant generalization of one-bit compressed sensing. We propose a novel spectral-based estimation procedure and show that we can recover $\beta^*$ in settings (i.e., classes of link function $f$) where previous algorithms fail. In general, our algorithm requires only very mild restrictions on the (unknown) functional relationship between $y_i$ and $\langle \mathbf{x}_i, \beta^* \rangle$. We also consider the high-dimensional setting where $\beta^*$ is sparse, and introduce a two-stage nonconvex framework that addresses estimation challenges in high-dimensional regimes where $p \gg n$. For a broad class of link functions between $\langle \mathbf{x}_i, \beta^* \rangle$ and $y_i$, we establish minimax lower bounds that demonstrate the optimality of our estimators in both the classical and high-dimensional regimes. Comment: 25 pages, 3 figures
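    To illustrate what a spectral estimator for a single-index model can look like, the sketch below takes the leading eigenvector of a y-weighted second-moment matrix; this is a generic recipe shown only to convey the idea, not necessarily the paper's estimator or its sparse two-stage variant.

```python
import numpy as np

def spectral_direction_estimate(y, X):
    """Estimate the direction of beta* in a single-index model y_i ~ f(<x_i, beta*>).

    Forms the weighted second-moment matrix M = (1/n) * sum_i y_i x_i x_i^T and
    returns its leading eigenvector. A generic spectral recipe for single-index
    models, shown only to illustrate the idea of spectral estimation; the sign
    and scale of beta* are not identified by this step.
    """
    n, p = X.shape
    M = (X * y[:, None]).T @ X / n
    eigvals, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, np.argmax(np.abs(eigvals))]   # direction only
```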

    Implementing smooth functions of a Hermitian matrix on a quantum computer

    We review existing methods for implementing smooth functions $f(A)$ of a sparse Hermitian matrix $A$ on a quantum computer, and analyse a further combination of these techniques which offers some advantages in simplicity and resource consumption in certain cases. Our construction uses the linear combination of unitaries method with Chebyshev polynomial approximations. The query complexity we obtain is $O(\log C/\epsilon)$, where $\epsilon$ is the approximation precision and $C>0$ is an upper bound on the magnitudes of the derivatives of the function $f$ over the domain of interest. The success probability depends on the 1-norm of the Taylor series coefficients of $f$, the sparsity $d$ of the matrix, and inversely on the smallest singular value of the target matrix $f(A)$. Comment: 16 pages
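    The Chebyshev ingredient can be illustrated classically: approximate f on [-1, 1] by a truncated Chebyshev series and evaluate the matrix polynomial by the three-term recurrence, as sketched below. This is purely a classical analogue of the approximation step, not the linear-combination-of-unitaries circuit; the degree and the use of NumPy's chebfit are assumptions of the sketch.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def cheb_matrix_function(f, A, degree=30):
    """Approximate f(A) for Hermitian A whose spectrum is scaled into [-1, 1].

    Fits a degree-`degree` Chebyshev approximation of f on [-1, 1], then
    evaluates sum_k c_k T_k(A) via T_{k+1}(A) = 2 A T_k(A) - T_{k-1}(A).
    Classical illustration only; the paper realises the polynomial with a
    linear combination of unitaries on a quantum computer.
    """
    x = np.cos(np.pi * (np.arange(degree + 1) + 0.5) / (degree + 1))  # Chebyshev nodes
    c = C.chebfit(x, f(x), degree)                                     # series coefficients
    n = A.shape[0]
    T_prev, T_curr = np.eye(n), A.copy()
    F = c[0] * T_prev + c[1] * T_curr
    for k in range(2, degree + 1):
        T_prev, T_curr = T_curr, 2.0 * A @ T_curr - T_prev
        F = F + c[k] * T_curr
    return F
```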

    Matrix Equations, Sparse Solvers: M-M.E.S.S.-2.0.1 -- Philosophy, Features and Application for (Parametric) Model Order Reduction

    Matrix equations are omnipresent in (numerical) linear algebra and systems theory. Especially in model order reduction (MOR), they play a key role in many balancing-based reduction methods for linear dynamical systems. When these systems arise from spatial discretizations of evolutionary partial differential equations, their coefficient matrices are typically large and sparse. Moreover, the numbers of inputs and outputs of these systems are typically far smaller than the number of spatial degrees of freedom. Then, in many situations, the solutions of the corresponding large-scale matrix equations are observed to have low (numerical) rank. This feature is exploited by M-M.E.S.S. to find successively larger low-rank factorizations approximating the solutions. This contribution describes the basic philosophy behind the implementation and the features of the package, as well as its application in the model order reduction of large-scale linear time-invariant (LTI) systems and parametric LTI systems. Comment: 18 pages, 4 figures, 5 tables
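    The low-rank phenomenon that M-M.E.S.S. exploits is easy to observe on a toy problem: solve a small Lyapunov equation A X + X A^T + B B^T = 0 for a discretized 1D Laplacian A and a two-column B, then inspect the numerical rank of X. The sketch below uses SciPy's dense solver purely to exhibit the effect; M-M.E.S.S. itself computes low-rank factors directly without ever forming X.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

n = 400
main, off = -2.0 * np.ones(n), np.ones(n - 1)
A = (n + 1) ** 2 * (np.diag(main) + np.diag(off, 1) + np.diag(off, -1))  # stable 1D Laplacian
B = np.zeros((n, 2))
B[0, 0] = B[-1, 1] = 1.0                       # two inputs -> rank-2 right-hand side B B^T

X = solve_continuous_lyapunov(A, -B @ B.T)     # solves A X + X A^T = -B B^T
s = np.linalg.svd(X, compute_uv=False)
num_rank = int(np.sum(s > 1e-10 * s[0]))
print(f"numerical rank of X: {num_rank} out of {n}")   # far smaller than n
```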