Penalized Orthogonal Iteration for Sparse Estimation of Generalized Eigenvalue Problem
We propose a new algorithm for sparse estimation of eigenvectors in
generalized eigenvalue problems (GEP). The GEP arises in a number of modern
data-analytic situations and statistical methods, including principal component
analysis (PCA), multiclass linear discriminant analysis (LDA), canonical
correlation analysis (CCA), sufficient dimension reduction (SDR) and invariant
co-ordinate selection. We propose to modify the standard generalized orthogonal
iteration with a sparsity-inducing penalty for the eigenvectors. To achieve
this goal, we generalize the equation-solving step of orthogonal iteration to a
penalized convex optimization problem. The resulting algorithm, called
penalized orthogonal iteration, provides accurate estimation of the true
eigenspace, when it is sparse. Also proposed is a computationally more
efficient alternative, which works well for PCA and LDA problems. Numerical
studies reveal that the proposed algorithms are competitive, and that our
tuning procedure works well. We demonstrate applications of the proposed
algorithm to obtain sparse estimates for PCA, multiclass LDA, CCA and SDR.
Supplementary materials are available online.
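The idea of replacing the equation-solving step with a penalized problem can be sketched for the simplest case only: PCA with a single eigenvector. Assuming the second matrix of the GEP is the identity and the penalty is an ℓ1 penalty, the penalized least-squares step has a closed form, namely soft-thresholding of the matrix-vector product. The function names, the penalty value `lam`, and the toy matrix below are illustrative assumptions; this is not the authors' algorithm or their tuning procedure.

```python
import math

def soft_threshold(x, lam):
    """Closed-form minimizer of 0.5 * (z - x)**2 + lam * abs(z)."""
    return math.copysign(max(abs(x) - lam, 0.0), x)

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def sparse_power_iteration(A, lam, iters=200):
    """Power iteration with an l1-penalized update (single vector, B = identity)."""
    n = len(A)
    q = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        z = matvec(A, q)                              # plain iteration step
        z = [soft_threshold(x, lam) for x in z]       # sparsity-inducing step
        norm = math.sqrt(sum(x * x for x in z))
        if norm == 0.0:
            raise ValueError("penalty lam is too large: every entry was zeroed")
        q = [x / norm for x in z]                     # renormalize
    return q

# Toy symmetric matrix whose leading eigenvector is nearly supported on
# the first two coordinates (hypothetical data).
A = [
    [2.5, 2.0, 0.1, 0.0],
    [2.0, 2.5, 0.0, 0.1],
    [0.1, 0.0, 0.5, 0.0],
    [0.0, 0.1, 0.0, 0.5],
]
q = sparse_power_iteration(A, lam=0.2)
print(q)
```

With `lam=0` the routine reduces to ordinary power iteration; a larger `lam` zeroes more coordinates.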
Randomized Riemannian Preconditioning for Orthogonality Constrained Problems
Optimization problems with (generalized) orthogonality constraints are
prevalent across science and engineering. For example, in computational science
they arise in the symmetric (generalized) eigenvalue problem, in nonlinear
eigenvalue problems, and in electronic structure computations, to name a few
problems. In statistics and machine learning, they arise, for example, in
canonical correlation analysis and in linear discriminant analysis. In this
article, we consider using randomized preconditioning in the context of
optimization problems with generalized orthogonality constraints. Our proposed
algorithms are based on Riemannian optimization on the generalized Stiefel
manifold equipped with a non-standard preconditioned geometry, which
requires developing the geometric components needed to build
algorithms on this approach. Furthermore, we perform an asymptotic
convergence analysis of the preconditioned algorithms, which helps to
characterize the quality of a given preconditioner using second-order
information. Finally, for the problems of canonical correlation analysis and
linear discriminant analysis, we develop randomized preconditioners along with
corresponding bounds on the relevant condition number.
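For readers unfamiliar with the term, a generalized orthogonality constraint requires X^T B X = I for a symmetric positive definite matrix B. The sketch below only shows how a set of vectors can be made to satisfy that constraint, using Gram-Schmidt in the B-inner product. It does not implement Riemannian optimization or any preconditioner from the paper; the function names and the small matrix `B` are illustrative assumptions.

```python
import math

def b_inner(u, v, B):
    """Inner product u^T B v for a symmetric positive definite B."""
    Bv = [sum(b * x for b, x in zip(row, v)) for row in B]
    return sum(a * c for a, c in zip(u, Bv))

def b_orthonormalize(cols, B):
    """Gram-Schmidt in the B-inner product; the result satisfies X^T B X = I."""
    basis = []
    for v in cols:
        w = list(v)
        for q in basis:
            coeff = b_inner(q, w, B)
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(b_inner(w, w, B))
        if norm < 1e-12:
            raise ValueError("columns are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis

# Hypothetical 3 x 3 positive definite matrix and two starting vectors.
B = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 3.0]]
X = b_orthonormalize([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]], B)

# Gram matrix X^T B X, which should be the 2 x 2 identity up to rounding.
gram = [[b_inner(x, y, B) for y in X] for x in X]
print(gram)
```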
A D.C. Programming Approach to the Sparse Generalized Eigenvalue Problem
In this paper, we consider the sparse eigenvalue problem wherein the goal is
to obtain a sparse solution to the generalized eigenvalue problem. We achieve
this by constraining the cardinality of the solution to the generalized
eigenvalue problem and obtain sparse principal component analysis (PCA), sparse
canonical correlation analysis (CCA) and sparse Fisher discriminant analysis
(FDA) as special cases. Unlike the ℓ1-norm approximation to the
cardinality constraint, which previous methods have used in the context of
sparse PCA, we propose a tighter approximation that is related to the negative
log-likelihood of a Student's t-distribution. The problem is then framed as a
d.c. (difference of convex functions) program and is solved as a sequence of
convex programs by invoking the majorization-minimization method. The resulting
algorithm is proved to exhibit global convergence behavior, i.e., for
any random initialization, the sequence (subsequence) of iterates generated by
the algorithm converges to a stationary point of the d.c. program. The
performance of the algorithm is empirically demonstrated on both sparse PCA
(finding few relevant genes that explain as much variance as possible in a
high-dimensional gene dataset) and sparse CCA (cross-language document
retrieval and vocabulary selection for music retrieval) applications.
Comment: 40 pages
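To see why a logarithmic surrogate can track cardinality more tightly than the ℓ1 norm, the sketch below compares the two on a small vector. The abstract does not state the exact surrogate, so the normalized log form used here, with smoothing parameter `eps`, is an assumption chosen for illustration, as are the function names and the test vector.

```python
import math

def cardinality(x):
    """Number of nonzero entries."""
    return sum(1 for xi in x if xi != 0.0)

def l1_norm(x):
    return sum(abs(xi) for xi in x)

def log_surrogate(x, eps):
    """sum_i log(1 + |x_i| / eps) / log(1 + 1 / eps); tends to the cardinality as eps -> 0."""
    scale = math.log(1.0 + 1.0 / eps)
    return sum(math.log(1.0 + abs(xi) / eps) for xi in x) / scale

x = [3.0, 0.0, -0.5, 0.0]
for eps in (1e-2, 1e-4, 1e-6):
    print(eps, log_surrogate(x, eps))
print("cardinality:", cardinality(x), "l1 norm:", l1_norm(x))
```

The ℓ1 norm depends on the magnitudes of the entries, whereas the log surrogate becomes nearly insensitive to magnitude as `eps` shrinks. The price is that the surrogate is no longer convex, which is where a d.c. formulation comes in.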
Lecture Notes of Tensor Network Contractions
Tensor network (TN), a young mathematical tool of high vitality and great
potential, has been undergoing extremely rapid developments in the last two
decades, gaining tremendous success in condensed matter physics, atomic
physics, quantum information science, statistical physics, and so on. In these
lecture notes, we focus on the contraction algorithms of TN as well as some of
the applications to the simulations of quantum many-body systems. Starting from
basic concepts and definitions, we first explain the relations between TN and
physical problems, including the TN representations of classical partition
functions, quantum many-body states (by matrix product state, tree TN, and
projected entangled pair state), time evolution simulations, etc. These
problems, which are challenging to solve, can be transformed to TN contraction
problems. We then present several paradigm algorithms based on the ideas of the
numerical renormalization group and/or boundary states, including density
matrix renormalization group, time-evolving block decimation,
coarse-graining/corner tensor renormalization group, and several distinguished
variational algorithms. Finally, we revisit the TN approaches from the
perspective of multi-linear algebra (also known as tensor algebra or tensor
decompositions) and quantum simulation. Despite the apparent differences in the
ideas and strategies of different TN algorithms, we aim at revealing the
underlying relations and resemblances in order to present a systematic picture
to understand the TN contraction approaches.
Comment: 134 pages, 68 figures. In this version, the manuscript has been
changed into the format of a book; new sections about tensor network and
quantum circuits have been added
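The statement that a classical partition function can be written as a TN contraction is easiest to check on the one-dimensional Ising ring, where the network is a closed chain of identical 2 x 2 transfer matrices. The sketch below is a textbook illustration rather than an algorithm from the lecture notes; the function names and parameter values are assumptions.

```python
import math
from itertools import product

def transfer_matrix(beta, J):
    """T[s][t] = exp(beta * J * s * t) for spins s, t in {+1, -1}."""
    spins = (1, -1)
    return [[math.exp(beta * J * s * t) for t in spins] for s in spins]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def partition_by_contraction(n, beta, J):
    """Z = trace(T^n): contract the ring one bond at a time."""
    T = transfer_matrix(beta, J)
    M = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        M = matmul(M, T)
    return M[0][0] + M[1][1]

def partition_by_enumeration(n, beta, J):
    """Brute-force sum over all 2^n spin configurations (periodic boundary)."""
    total = 0.0
    for s in product((1, -1), repeat=n):
        bonds = sum(s[i] * s[(i + 1) % n] for i in range(n))
        total += math.exp(beta * J * bonds)
    return total

n, beta, J = 8, 0.4, 1.0
z_tn = partition_by_contraction(n, beta, J)
z_brute = partition_by_enumeration(n, beta, J)
print(z_tn, z_brute)
```

The contraction costs a number of small matrix products that grows linearly with `n`, while the enumeration grows as 2^n. That gap is the reason for recasting such sums as contractions.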
Tensor Networks for Big Data Analytics and Large-Scale Optimization Problems
In this paper we review basic and emerging models and associated algorithms
for large-scale tensor networks, especially Tensor Train (TT) decompositions
using novel mathematical and graphical representations. We discuss the concept
of tensorization (i.e., creating very high-order tensors from lower-order
original data) and super compression of data achieved via quantized tensor
train (QTT) networks. The purpose of tensorization and quantization is to
achieve, via low-rank tensor approximations, "super" compression and a
meaningful, compact representation of structured data. The main objective of
this paper is to show how tensor networks can be used to solve a wide class of
big data optimization problems (that are far from tractable by classical
numerical methods) by applying tensorization and performing all operations
using relatively small size matrices and tensors and applying iteratively
optimized and approximative tensor contractions.
Keywords: Tensor networks, tensor train (TT) decompositions, matrix product
states (MPS), matrix product operators (MPO), basic tensor operations,
tensorization, distributed representation of data, optimization problems for
very large-scale problems: generalized eigenvalue decomposition (GEVD),
PCA/SVD, canonical correlation analysis (CCA).
Comment: arXiv admin note: text overlap with arXiv:1403.204
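Tensorization and "super" compression can be illustrated with a standard special case: a sampled exponential of length 2^d, reshaped into a d-way 2 x 2 x ... x 2 tensor, is exactly rank one, so 2d numbers reproduce all 2^d entries. The sketch below is a minimal illustration under that assumption, with hypothetical function names; it does not implement a general tensor train decomposition.

```python
def tensorize_index(i, d):
    """Binary digits of i (least significant first): the multi-index in the 2 x ... x 2 tensor."""
    return [(i >> k) & 1 for k in range(d)]

def rank_one_factors(a, d):
    """Factors f_k = [1, a**(2**k)]; their outer product is the tensorized vector a**i."""
    return [[1.0, a ** (2 ** k)] for k in range(d)]

def entry_from_factors(factors, bits):
    """Evaluate one tensor entry as a product of one number per mode."""
    value = 1.0
    for f, b in zip(factors, bits):
        value *= f[b]
    return value

d = 10
a = 0.99
vector = [a ** i for i in range(2 ** d)]            # 1024 stored numbers
factors = rank_one_factors(a, d)                    # 20 stored numbers
rebuilt = [entry_from_factors(factors, tensorize_index(i, d))
           for i in range(2 ** d)]

max_error = max(abs(u - v) for u, v in zip(vector, rebuilt))
stored = sum(len(f) for f in factors)
print(len(vector), stored, max_error)
```

Less structured data generally needs ranks above one, so the storage grows with the ranks but still scales linearly in `d` when the ranks stay bounded.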
Randomized Dimension Reduction on Massive Data
Scalability of statistical estimators is of increasing importance in modern
applications and dimension reduction is often used to extract relevant
information from data. A variety of popular dimension reduction approaches can
be framed as symmetric generalized eigendecomposition problems. In this paper
we outline how taking into account the low rank structure assumption implicit
in these dimension reduction approaches provides both computational and
statistical advantages. We adapt recent randomized low-rank approximation
algorithms to provide efficient solutions to three dimension reduction methods:
Principal Component Analysis (PCA), Sliced Inverse Regression (SIR), and
Localized Sliced Inverse Regression (LSIR). A key observation in this paper is
that randomization serves a dual role, improving both computational and
statistical performance. This point is highlighted in our experiments on real
and simulated data.
Comment: 31 pages, 6 figures, Key Words: dimension reduction, generalized
eigendecomposition, low-rank, supervised, inverse regression, random
projections, randomized algorithms, Krylov subspace method
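The role of the low-rank assumption can be sketched with a generic randomized range finder: multiplying a matrix by a few random vectors and orthonormalizing the results gives a basis that captures the range of a low-rank matrix. This is a generic illustration, not the adapted algorithms of the paper; the function names, the seed, and the rank-2 toy matrix are assumptions.

```python
import math
import random

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def randomized_range(A, k, seed=0):
    """Return k orthonormal vectors that approximately span the range of A."""
    rng = random.Random(seed)
    n = len(A[0])
    basis = []
    for _ in range(k):
        omega = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = matvec(A, omega)                 # one random sample of the range
        for q in basis:                      # Gram-Schmidt against earlier samples
            c = dot(q, y)
            y = [yi - c * qi for yi, qi in zip(y, q)]
        norm = math.sqrt(dot(y, y))
        basis.append([yi / norm for yi in y])
    return basis

def projection_residual(A, basis):
    """Frobenius norm of A - Q Q^T A, where the columns of Q are the basis vectors."""
    total = 0.0
    for j in range(len(A[0])):
        col = [row[j] for row in A]
        proj = [0.0] * len(col)
        for q in basis:
            c = dot(q, col)
            proj = [p + c * qi for p, qi in zip(proj, q)]
        total += sum((a - p) ** 2 for a, p in zip(col, proj))
    return math.sqrt(total)

# Hypothetical rank-2 symmetric matrix A = 3 u u^T + v v^T.
u = [1.0, 2.0, 0.0, -1.0, 1.0]
v = [0.0, 1.0, 1.0, 1.0, -1.0]
A = [[3 * u[i] * u[j] + v[i] * v[j] for j in range(5)] for i in range(5)]

res_k2 = projection_residual(A, randomized_range(A, k=2))
res_k1 = projection_residual(A, randomized_range(A, k=1))
print(res_k1, res_k2)
```

With `k` equal to the true rank, the residual drops to rounding error. With `k` too small, it stays bounded away from zero.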