Gradient-based Sparse Principal Component Analysis with Extensions to Online Learning
Sparse principal component analysis (PCA) is an important technique for
dimensionality reduction of high-dimensional data. However, most existing
sparse PCA algorithms are based on non-convex optimization, which provides
little guarantee of global convergence. Sparse PCA algorithms based on a
convex formulation, for example the Fantope projection and selection (FPS),
overcome this difficulty, but are computationally expensive. In this work we
study sparse PCA based on the convex FPS formulation, and propose a new
algorithm that is computationally efficient and applicable to large and
high-dimensional data sets. Nonasymptotic and explicit bounds are derived for
both the optimization error and the statistical accuracy, which can be used for
testing and inference problems. We also extend our algorithm to online learning
problems, where data are obtained in a streaming fashion. The proposed
algorithm is applied to high-dimensional gene expression data for the detection
of functional gene groups.
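
For orientation, the convex FPS formulation referred to above is commonly written as maximizing <S, X> - lambda * ||X||_1 over the Fantope {X : 0 <= X <= I, tr(X) = d}, where S is the sample covariance matrix. The sketch below illustrates only the Euclidean projection onto that set, the step that makes FPS-type solvers expensive because it needs a full eigendecomposition. It is not the algorithm proposed in this work; the function name `fantope_projection`, its parameters, and the bisection scheme are assumptions made for illustration.

```python
import numpy as np


def fantope_projection(A, d, tol=1e-10, max_iter=200):
    """Frobenius-norm projection of a symmetric matrix A onto the
    Fantope {X : 0 <= X <= I, trace(X) = d}.

    Illustrative sketch only (hypothetical helper, not from the paper).
    """
    A = (A + A.T) / 2.0                      # enforce symmetry
    eigvals, eigvecs = np.linalg.eigh(A)     # full eigendecomposition

    # Find a shift theta such that sum(clip(eigvals - theta, 0, 1)) == d.
    # The sum is non-increasing in theta, so bisection applies.
    lo = eigvals.min() - 1.0                 # here every clipped value is 1
    hi = eigvals.max()                       # here every clipped value is 0
    for _ in range(max_iter):
        theta = 0.5 * (lo + hi)
        total = np.clip(eigvals - theta, 0.0, 1.0).sum()
        if total > d:
            lo = theta
        else:
            hi = theta
        if hi - lo < tol:
            break

    theta = 0.5 * (lo + hi)
    new_vals = np.clip(eigvals - theta, 0.0, 1.0)
    # Rebuild the matrix with the shifted and clipped eigenvalues.
    return (eigvecs * new_vals) @ eigvecs.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = rng.standard_normal((6, 6))
    X = fantope_projection(M + M.T, d=2)
    print("trace:", np.trace(X))
    print("eigenvalue range:", np.linalg.eigvalsh(X).min(),
          np.linalg.eigvalsh(X).max())
```

Each call costs one eigendecomposition of a p-by-p matrix, which is the kind of per-iteration cost that becomes prohibitive for large and high-dimensional data sets.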