
    Spectral Condition-Number Estimation of Large Sparse Matrices

    We describe a randomized Krylov-subspace method for estimating the spectral condition number of a real matrix A or indicating that it is numerically rank deficient. The main difficulty in estimating the condition number is the estimation of the smallest singular value \sigma_{\min} of A. Our method estimates this value by solving a consistent linear least-squares problem with a known solution using a specific Krylov-subspace method called LSQR. In this method, the forward error tends to concentrate in the direction of a right singular vector corresponding to \sigma_{\min}. Extensive experiments show that the method estimates the condition number of a wide array of matrices well. It can sometimes estimate the condition number when running a dense SVD would be impractical due to the computational cost or the memory requirements. The method uses very little memory (it inherits this property from LSQR) and it works equally well on square and rectangular matrices.
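    A minimal sketch of the idea, assuming SciPy's LSQR as the Krylov solver (the paper's own stopping rules and safeguards are not reproduced): run LSQR on a consistent system with a known solution, take the direction of the forward error as an approximate right singular vector for \sigma_{\min}, and pair the resulting estimate with a crude power-iteration estimate of \sigma_{\max}.

        # Hedged sketch, not the authors' implementation.
        import numpy as np
        from scipy.sparse.linalg import lsqr

        def estimate_condition_number(A, lsqr_iters=200, power_iters=20, seed=0):
            rng = np.random.default_rng(seed)
            m, n = A.shape
            x_true = rng.standard_normal(n)            # known solution
            b = A @ x_true                             # consistent right-hand side
            x_k = lsqr(A, b, iter_lim=lsqr_iters)[0]   # stop before full convergence
            err = x_k - x_true                         # forward error
            v = err / np.linalg.norm(err)              # tends toward the right singular
                                                       # vector belonging to sigma_min
            sigma_min = np.linalg.norm(A @ v)
            u = rng.standard_normal(n)                 # crude sigma_max via power iteration on A^T A
            for _ in range(power_iters):
                u = A.T @ (A @ u)
                u /= np.linalg.norm(u)
            sigma_max = np.linalg.norm(A @ u)
            return sigma_max / sigma_min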

    Subsampling Algorithms for Semidefinite Programming

    We derive a stochastic gradient algorithm for semidefinite optimization using randomization techniques. The algorithm uses subsampling to reduce the computational cost of each iteration, and the subsampling ratio explicitly controls granularity, i.e. the tradeoff between cost per iteration and total number of iterations. Furthermore, the total computational cost is directly proportional to the complexity (i.e. rank) of the solution. We study numerical performance on some large-scale problems arising in statistical learning.
    Comment: Final version, to appear in Stochastic Systems.
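    Purely as an illustration of the subsampling mechanism described above (this is not the authors' algorithm, and data_terms / grad_term are placeholder names): when the gradient is a sum over n data terms, sampling a fraction rho of the terms gives an unbiased gradient estimate whose per-iteration cost scales with rho, after which the iterate is projected back onto the positive semidefinite cone.

        # Illustrative sketch of a subsampled stochastic gradient step.
        import numpy as np

        def subsampled_psd_step(X, data_terms, grad_term, rho, step, rng):
            n = len(data_terms)
            idx = rng.choice(n, size=max(1, int(rho * n)), replace=False)
            G = sum(grad_term(X, data_terms[i]) for i in idx) * (n / len(idx))   # unbiased estimate
            X = X - step * G                                   # stochastic gradient step
            w, V = np.linalg.eigh((X + X.T) / 2)               # symmetrize, then eigendecompose
            return V @ np.diag(np.clip(w, 0.0, None)) @ V.T    # project onto the PSD cone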

    Approximating leading singular triplets of a matrix function

    Given a large square matrix A and a sufficiently regular function f so that f(A) is well defined, we are interested in the approximation of the leading singular values and corresponding singular vectors of f(A), and in particular of \|f(A)\|, where \|\cdot\| is the matrix norm induced by the Euclidean vector norm. Since neither f(A) nor f(A)v can be computed exactly, we introduce and analyze an inexact Golub-Kahan-Lanczos bidiagonalization procedure, where the inexactness is related to the inaccuracy of the operations f(A)v, f(A)^*v. Particular outer and inner stopping criteria are devised so as to cope with the lack of a true residual. Numerical experiments with the new algorithm on typical application problems are reported.
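    A bare-bones Golub-Kahan-Lanczos bidiagonalization driven by black-box (and possibly inexact) products v -> f(A)v and v -> f(A)^*v; the paper's inexactness analysis and its outer/inner stopping criteria are not reproduced, and the function names below are our own.

        # Hedged sketch: k steps of Golub-Kahan-Lanczos bidiagonalization.
        import numpy as np

        def gkl_norm_estimate(apply_fA, apply_fA_adj, n, k, seed=0):
            rng = np.random.default_rng(seed)
            U = np.zeros((n, k + 1))
            alphas, betas = np.zeros(k), np.zeros(k)
            U[:, 0] = rng.standard_normal(n)
            U[:, 0] /= np.linalg.norm(U[:, 0])
            v_prev, beta = np.zeros(n), 0.0
            for j in range(k):
                v = apply_fA_adj(U[:, j]) - beta * v_prev    # alpha_j v_j = f(A)^* u_j - beta_j v_{j-1}
                alphas[j] = np.linalg.norm(v)
                v /= alphas[j]
                u = apply_fA(v) - alphas[j] * U[:, j]        # beta_{j+1} u_{j+1} = f(A) v_j - alpha_j u_j
                betas[j] = np.linalg.norm(u)
                U[:, j + 1] = u / betas[j]
                v_prev, beta = v, betas[j]
            B = np.zeros((k + 1, k))                         # lower bidiagonal projection
            B[np.arange(k), np.arange(k)] = alphas
            B[np.arange(1, k + 1), np.arange(k)] = betas
            return np.linalg.svd(B, compute_uv=False)[0]     # estimate of ||f(A)||_2

    The largest singular value of the small bidiagonal matrix grows monotonically with k and converges to \|f(A)\| when the products are exact; the paper studies what happens when they are not.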

    Subspace Iteration Randomization and Singular Value Problems

    A classical problem in matrix computations is the efficient and reliable approximation of a given matrix by a matrix of lower rank. The truncated singular value decomposition (SVD) is known to provide the best such approximation for any given fixed rank. However, the SVD is also known to be very costly to compute. Among the different approaches in the literature for computing low-rank approximations, randomized algorithms have attracted researchers' recent attention due to their surprising reliability and computational efficiency in different application areas. Typically, such algorithms are shown to compute, with very high probability, low-rank approximations that are within a constant factor from optimal, and are known to perform even better in many practical situations. In this paper, we present a novel error analysis that considers randomized algorithms within the subspace iteration framework and show, with very high probability, that highly accurate low-rank approximations as well as singular values can indeed be computed quickly for matrices with rapidly decaying singular values. Such matrices appear frequently in diverse application areas such as data analysis, fast structured matrix computations and fast direct methods for large sparse linear systems of equations, and are the driving motivation for randomized methods. Furthermore, we show that the low-rank approximations computed by these randomized algorithms are actually rank-revealing approximations, and the special case of a rank-1 approximation can also be used to correctly estimate matrix 2-norms with very high probability. Our numerical experiments are in full support of our conclusions.
    Comment: 45 pages, 5 figures.
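    The subspace iteration framework analyzed here is the familiar randomized scheme; a textbook-style sketch (the oversampling parameter p and the number of power steps q are our names) looks like:

        # Standard randomized subspace iteration for a rank-k approximation.
        import numpy as np

        def randomized_subspace_iteration(A, k, q=2, p=5, seed=0):
            rng = np.random.default_rng(seed)
            m, n = A.shape
            Q = np.linalg.qr(A @ rng.standard_normal((n, k + p)))[0]    # range sketch
            for _ in range(q):                                           # power iterations
                Q = np.linalg.qr(A.T @ Q)[0]
                Q = np.linalg.qr(A @ Q)[0]
            Uhat, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)    # small SVD
            return (Q @ Uhat)[:, :k], s[:k], Vt[:k]                      # approximate truncated SVD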

    Fast and Simple PCA via Convex Optimization

    The problem of principal component analysis (PCA) is traditionally solved by spectral or algebraic methods. We show how computing the leading principal component can be reduced to solving a small number of well-conditioned convex optimization problems. This gives rise to a new efficient method for PCA based on recent advances in stochastic methods for convex optimization. In particular, we show that given a d \times d matrix X = \frac{1}{n}\sum_{i=1}^n x_i x_i^{\top} with top eigenvector u and top eigenvalue \lambda_1 it is possible to: (i) compute a unit vector w such that (w^{\top}u)^2 \geq 1-\epsilon in \tilde{O}\left(\frac{d}{\delta^2}+N\right) time, where \delta = \lambda_1 - \lambda_2 and N is the total number of non-zero entries in x_1,...,x_n; (ii) compute a unit vector w such that w^{\top}Xw \geq \lambda_1-\epsilon in \tilde{O}(d/\epsilon^2) time. To the best of our knowledge, these bounds are the fastest to date for a wide regime of parameters. These results can be further accelerated when \delta (in the first case) and \epsilon (in the second case) are smaller than \sqrt{d/N}.
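    A rough sketch of the reduction (our simplification, not the paper's exact method): with a shift \lambda slightly above \lambda_1, each inverse-power step w_{t+1} \propto (\lambda I - X)^{-1} w_t can be obtained by minimizing the well-conditioned convex quadratic g(z) = \frac{1}{2} z^{\top}(\lambda I - X)z - w_t^{\top}z. The paper relies on fast stochastic convex solvers for these subproblems; plain gradient descent is used below only as a stand-in.

        # Hedged sketch: leading eigenvector via repeated convex quadratic minimization.
        import numpy as np

        def leading_component(X, lam, outer=10, inner=300, step=0.1, seed=0):
            # assumes lam > lambda_1(X), so lam*I - X is positive definite;
            # step must be smaller than 2/lam for the inner loop to converge
            rng = np.random.default_rng(seed)
            d = X.shape[0]
            w = rng.standard_normal(d)
            w /= np.linalg.norm(w)
            for _ in range(outer):
                z = np.zeros(d)
                for _ in range(inner):                 # gradient descent on g(z)
                    z -= step * (lam * z - X @ z - w)
                w = z / np.linalg.norm(z)              # normalized inverse-power step
            return w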

    Robust Shift-and-Invert Preconditioning: Faster and More Sample Efficient Algorithms for Eigenvector Computation

    We provide faster algorithms and improved sample complexities for approximating the top eigenvector of a matrix. Offline Setting: Given an n \times d matrix A, we show how to compute an \epsilon approximate top eigenvector in time \tilde O([nnz(A) + \frac{d \cdot sr(A)}{gap^2}]\cdot \log 1/\epsilon) and \tilde O([\frac{nnz(A)^{3/4} (d \cdot sr(A))^{1/4}}{\sqrt{gap}}]\cdot \log 1/\epsilon). Here sr(A) is the stable rank and gap is the multiplicative eigenvalue gap. By separating the gap dependence from nnz(A) we improve on the classic power and Lanczos methods. We also improve prior work using fast subspace embeddings and stochastic optimization, giving significantly improved dependencies on sr(A) and \epsilon. Our second running time improves this further when nnz(A) \le \frac{d \cdot sr(A)}{gap^2}. Online Setting: Given a distribution D with covariance matrix \Sigma and a vector x_0 which is an O(gap) approximate top eigenvector for \Sigma, we show how to refine to an \epsilon approximation using \tilde O(\frac{v(D)}{gap^2} + \frac{v(D)}{gap \cdot \epsilon}) samples from D. Here v(D) is a natural variance measure. Combining our algorithm with previous work to initialize x_0, we obtain a number of improved sample complexity and runtime results. For general distributions, we achieve asymptotically optimal accuracy as a function of sample size as the number of samples grows large. Our results center around a robust analysis of the classic method of shift-and-invert preconditioning to reduce eigenvector computation to approximately solving a sequence of linear systems. We then apply fast SVRG based approximate system solvers to achieve our claims. We believe our results suggest the general effectiveness of shift-and-invert based approaches and imply that further computational gains may be reaped in practice.
    Comment: Manuscript outdated. Updated version at arXiv:1605.0875.

    Perspectives on information-based complexity

    The authors discuss information-based complexity theory, which is a model of finite-precision computations with real numbers, and its applications to numerical analysis.
    Comment: 24 pages. Abstract added in migration.

    Implementing regularization implicitly via approximate eigenvector computation

    Regularization is a powerful technique for extracting useful information from noisy data. Typically, it is implemented by adding some sort of norm constraint to an objective function and then exactly optimizing the modified objective function. This procedure often leads to optimization problems that are computationally more expensive than the original problem, a fact that is clearly problematic if one is interested in large-scale applications. On the other hand, a large body of empirical work has demonstrated that heuristics, and in some cases approximation algorithms, developed to speed up computations sometimes have the side-effect of performing regularization implicitly. Thus, we consider the question: What is the regularized optimization objective that an approximation algorithm is exactly optimizing? We address this question in the context of computing approximations to the smallest nontrivial eigenvector of a graph Laplacian; and we consider three random-walk-based procedures: one based on the heat kernel of the graph, one based on computing the PageRank vector associated with the graph, and one based on a truncated lazy random walk. In each case, we provide a precise characterization of the manner in which the approximation method can be viewed as implicitly computing the exact solution to a regularized problem. Interestingly, the regularization is not on the usual vector form of the optimization problem, but instead it is on a related semidefinite program.
    Comment: 11 pages; a few clarifications.
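    For concreteness, the three approximation procedures mentioned above can each be written as a few lines of linear algebra on a small graph; a hedged sketch with our own parameter names (t, alpha, steps), not the paper's notation:

        # Heat kernel, personalized PageRank, and truncated lazy random walk on a graph.
        import numpy as np
        from scipy.linalg import expm

        def three_procedures(W, s, t=5.0, alpha=0.85, steps=10):
            # W: symmetric adjacency matrix, s: seed distribution (column vector)
            deg = W.sum(axis=1)
            P = W / deg[:, None]                        # row-stochastic random-walk matrix
            L = np.diag(deg) - W                        # combinatorial graph Laplacian
            heat = expm(-t * L) @ s                     # heat-kernel smoothing of s
            ppr = np.linalg.solve(np.eye(len(s)) - alpha * P.T, (1 - alpha) * s)   # PageRank vector
            M = 0.5 * (np.eye(len(s)) + P.T)            # lazy walk operator acting on distributions
            lazy = s.copy()
            for _ in range(steps):                      # truncated lazy random walk
                lazy = M @ lazy
            return heat, ppr, lazy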

    Incremental Method for Spectral Clustering of Increasing Orders

    The smallest eigenvalues and the associated eigenvectors (i.e., eigenpairs) of a graph Laplacian matrix have been widely used for spectral clustering and community detection. However, in real-life applications the number of clusters or communities (say, K) is generally unknown a priori. Consequently, the majority of the existing methods either choose K heuristically or repeat the clustering method with different choices of K and accept the best clustering result. The first option more often than not yields a suboptimal result, while the second option is computationally expensive. In this work, we propose an incremental method for constructing the eigenspectrum of the graph Laplacian matrix. This method leverages the eigenstructure of the graph Laplacian matrix to obtain the K-th eigenpair of the Laplacian matrix given a collection of all the K-1 smallest eigenpairs. Our proposed method adapts the Laplacian matrix such that the batch eigenvalue decomposition problem transforms into an efficient sequential leading eigenpair computation problem. As a practical application, we consider user-guided spectral clustering. Specifically, we demonstrate that users can utilize the proposed incremental method for effective eigenpair computation and for determining the desired number of clusters based on multiple clustering metrics.
    Comment: In the KDD Workshop on Mining and Learning with Graphs, 2016. http://www.mlgworkshop.org/2016
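    A hedged sketch of the sequential idea (our own simplification, not the paper's exact adaptation of the Laplacian): deflate the K-1 known eigenpairs out of L and flip the spectrum with a shift c \ge \lambda_{\max}(L), so that the K-th smallest eigenpair of L becomes the leading eigenpair of the adapted operator.

        # Illustrative sketch: obtain the K-th smallest eigenpair from the K-1 known ones.
        import numpy as np
        from scipy.sparse.linalg import eigsh, LinearOperator

        def next_smallest_eigenpair(L, known_vecs):
            n = L.shape[0]
            V = np.asarray(known_vecs)                     # n x (K-1), orthonormal columns
            c = float(abs(L).sum(axis=1).max())            # Gershgorin bound on lambda_max(L)
            def matvec(x):
                # c*I - L - c*V V^T: unknown eigenpairs map to c - lambda, known ones to -lambda
                return c * x - L @ x - c * (V @ (V.T @ x))
            op = LinearOperator((n, n), matvec=matvec, dtype=float)
            val, vec = eigsh(op, k=1, which='LA')          # leading eigenpair of the adapted operator
            return c - val[0], vec[:, 0]                   # K-th smallest eigenvalue of L and its vector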

    Faster Eigenvector Computation via Shift-and-Invert Preconditioning

    We give faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix \Sigma -- i.e. computing a unit vector x such that x^T \Sigma x \ge (1-\epsilon)\lambda_1(\Sigma): Offline Eigenvector Estimation: Given an explicit A \in \mathbb{R}^{n \times d} with \Sigma = A^T A, we show how to compute an \epsilon approximate top eigenvector in time \tilde O([nnz(A) + \frac{d \cdot sr(A)}{gap^2}] \cdot \log 1/\epsilon) and \tilde O([\frac{nnz(A)^{3/4} (d \cdot sr(A))^{1/4}}{\sqrt{gap}}] \cdot \log 1/\epsilon). Here nnz(A) is the number of nonzeros in A, sr(A) is the stable rank, gap is the relative eigengap. By separating the gap dependence from the nnz(A) term, our first runtime improves upon the classical power and Lanczos methods. It also improves prior work using fast subspace embeddings [AC09, CW13] and stochastic optimization [Sha15c], giving significantly better dependencies on sr(A) and \epsilon. Our second running time improves these further when nnz(A) \le \frac{d \cdot sr(A)}{gap^2}. Online Eigenvector Estimation: Given a distribution D with covariance matrix \Sigma and a vector x_0 which is an O(gap) approximate top eigenvector for \Sigma, we show how to refine to an \epsilon approximation using O(\frac{var(D)}{gap \cdot \epsilon}) samples from D. Here var(D) is a natural notion of variance. Combining our algorithm with previous work to initialize x_0, we obtain improved sample complexity and runtime results under a variety of assumptions on D. We achieve our results using a general framework that we believe is of independent interest. We give a robust analysis of the classic method of shift-and-invert preconditioning to reduce eigenvector computation to approximately solving a sequence of linear systems. We then apply fast stochastic variance reduced gradient (SVRG) based system solvers to achieve our claims.
    Comment: Appearing in ICML 2016. Combination of work in arXiv:1509.05647 and arXiv:1510.0889.
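    The reduction at the heart of both settings is easy to state in code: with a shift \lambda slightly larger than \lambda_1(\Sigma), run power iteration on (\lambda I - \Sigma)^{-1}, solving each linear system only approximately. The paper solves these systems with SVRG; the sketch below (our simplification) substitutes conjugate gradients.

        # Hedged sketch of shift-and-invert power iteration with approximate solves.
        import numpy as np
        from scipy.sparse.linalg import cg, LinearOperator

        def shift_invert_top_eigenvector(Sigma, lam, outer=10, inner=200, seed=0):
            # assumes lam > lambda_1(Sigma), so lam*I - Sigma is positive definite
            d = Sigma.shape[0]
            M = LinearOperator((d, d), matvec=lambda v: lam * v - Sigma @ v, dtype=float)
            x = np.random.default_rng(seed).standard_normal(d)
            x /= np.linalg.norm(x)
            for _ in range(outer):
                y, _ = cg(M, x, maxiter=inner)       # approximate solve of (lam I - Sigma) y = x
                x = y / np.linalg.norm(y)            # power step on the shifted inverse
            return x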