
    Dimensionality Reduction for Stationary Time Series via Stochastic Nonconvex Optimization

    Stochastic optimization arises naturally in machine learning. However, efficient algorithms with provable guarantees are still largely missing when the objective function is nonconvex and the data points are dependent. This paper studies this fundamental challenge through a streaming PCA problem for stationary time series data. Specifically, our goal is to estimate the principal component of the time series with respect to the covariance matrix of the stationary distribution. Computationally, we propose a variant of Oja's algorithm combined with downsampling to control the bias of the stochastic gradient caused by the data dependency. Theoretically, we quantify the uncertainty of the proposed stochastic algorithm using diffusion approximations. This allows us to prove the asymptotic rate of convergence, which further implies a near-optimal asymptotic sample complexity. Numerical experiments are provided to support our analysis.
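    A minimal sketch of the downsampling idea on synthetic data follows: only every tau-th observation of a stationary VAR(1) stream is fed to an Oja-style update, so successive stochastic gradients are nearly independent. The step size eta, the skip length tau, and the VAR(1) test process are illustrative assumptions, not the schedule analyzed in the paper.

```python
import numpy as np

def var1_stream(A, noise_cov, rng):
    """Stationary VAR(1) process x_{t+1} = A x_t + eps_t (illustrative data)."""
    d = A.shape[0]
    x = np.zeros(d)
    chol = np.linalg.cholesky(noise_cov)
    while True:
        x = A @ x + chol @ rng.normal(size=d)
        yield x

def downsampled_oja(stream, dim, eta=0.05, tau=10, n_updates=2000, rng=None):
    """Oja-style streaming PCA that only uses every tau-th sample, so that
    successive stochastic gradients are nearly independent (a sketch, not
    the paper's exact schedule)."""
    rng = rng or np.random.default_rng(0)
    w = rng.normal(size=dim)
    w /= np.linalg.norm(w)
    for _ in range(n_updates):
        for _ in range(tau - 1):          # discard tau - 1 dependent samples
            next(stream)
        x = next(stream)
        w += eta * x * (x @ w)            # Oja / projected SGD step
        w /= np.linalg.norm(w)            # stay on the unit sphere
    return w

rng = np.random.default_rng(1)
d = 5
A = 0.5 * np.eye(d)                                   # mild temporal dependence
noise_cov = np.diag([5.0, 1.0, 1.0, 1.0, 1.0])
# Stationary covariance is noise_cov / (1 - 0.25), so the top eigenvector is e_1.
w_hat = downsampled_oja(var1_stream(A, noise_cov, rng), d)
print("alignment with e_1:", abs(w_hat[0]))
```

    Increasing tau weakens the temporal dependence between the samples actually used, at the cost of discarding data; this is the bias versus sample-efficiency trade-off that the downsampling is meant to control.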

    Faster Eigenvector Computation via Shift-and-Invert Preconditioning

    We give faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix $\Sigma$, i.e., computing a unit vector $x$ such that $x^T \Sigma x \ge (1-\epsilon)\lambda_1(\Sigma)$.
    Offline Eigenvector Estimation: Given an explicit $A \in \mathbb{R}^{n \times d}$ with $\Sigma = A^T A$, we show how to compute an $\epsilon$-approximate top eigenvector in time $\tilde O\!\left(\left[\mathrm{nnz}(A) + \frac{d \cdot \mathrm{sr}(A)}{\mathrm{gap}^2}\right] \log 1/\epsilon\right)$ and $\tilde O\!\left(\frac{\mathrm{nnz}(A)^{3/4} (d \cdot \mathrm{sr}(A))^{1/4}}{\sqrt{\mathrm{gap}}} \log 1/\epsilon\right)$. Here $\mathrm{nnz}(A)$ is the number of nonzeros in $A$, $\mathrm{sr}(A)$ is the stable rank, and $\mathrm{gap}$ is the relative eigengap. By separating the $\mathrm{gap}$ dependence from the $\mathrm{nnz}(A)$ term, our first runtime improves upon the classical power and Lanczos methods. It also improves prior work using fast subspace embeddings [AC09, CW13] and stochastic optimization [Sha15c], giving significantly better dependencies on $\mathrm{sr}(A)$ and $\epsilon$. Our second running time improves these further when $\mathrm{nnz}(A) \le \frac{d \cdot \mathrm{sr}(A)}{\mathrm{gap}^2}$.
    Online Eigenvector Estimation: Given a distribution $D$ with covariance matrix $\Sigma$ and a vector $x_0$ that is an $O(\mathrm{gap})$-approximate top eigenvector for $\Sigma$, we show how to refine it to an $\epsilon$-approximation using $O\!\left(\frac{\mathrm{var}(D)}{\mathrm{gap} \cdot \epsilon}\right)$ samples from $D$. Here $\mathrm{var}(D)$ is a natural notion of variance. Combining our algorithm with previous work to initialize $x_0$, we obtain improved sample complexity and runtime results under a variety of assumptions on $D$.
    We achieve our results using a general framework that we believe is of independent interest. We give a robust analysis of the classic method of shift-and-invert preconditioning to reduce eigenvector computation to approximately solving a sequence of linear systems. We then apply fast stochastic variance reduced gradient (SVRG) based system solvers to achieve our claims. Comment: Appearing in ICML 2016; combination of work in arXiv:1509.05647 and arXiv:1510.0889.
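    The shift-and-invert idea itself is easy to illustrate: run power iteration on $(\lambda I - \Sigma)^{-1}$ for a shift $\lambda$ slightly above $\lambda_1(\Sigma)$, so each iteration amounts to solving a linear system in a matrix with a much larger relative eigengap than $\Sigma$. The sketch below uses numpy's exact dense solver and a hypothetical shift of $1.01\,\lambda_1$ purely for illustration; the paper's contribution is replacing the exact solve with fast SVRG-based stochastic solvers and choosing the shift robustly.

```python
import numpy as np

def shift_invert_top_eigvec(Sigma, lam_shift, n_iters=20, rng=None):
    """Power iteration on (lam_shift * I - Sigma)^{-1}.

    For a shift slightly above lambda_1(Sigma), this matrix has a far larger
    relative eigengap than Sigma, so a handful of iterations suffice.  Each
    iteration is one linear-system solve; the paper replaces the exact solve
    below with a fast SVRG-based stochastic solver.
    """
    rng = rng or np.random.default_rng(0)
    d = Sigma.shape[0]
    M = lam_shift * np.eye(d) - Sigma
    x = rng.normal(size=d)
    x /= np.linalg.norm(x)
    for _ in range(n_iters):
        x = np.linalg.solve(M, x)   # inverse power step (one system solve)
        x /= np.linalg.norm(x)
    return x

# Toy check on a random covariance-style matrix.
rng = np.random.default_rng(1)
A = rng.normal(size=(200, 50))
Sigma = A.T @ A / 200
lam1 = np.linalg.eigvalsh(Sigma)[-1]
x = shift_invert_top_eigvec(Sigma, lam_shift=1.01 * lam1, rng=rng)
print("Rayleigh quotient / lambda_1:", (x @ Sigma @ x) / lam1)   # close to 1
```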

    A trust-region method for stochastic variational inference with applications to streaming data

    Stochastic variational inference allows for fast posterior inference in complex Bayesian models. However, the algorithm is prone to local optima, which can make the quality of the posterior approximation sensitive to the choice of hyperparameters and initialization. We address this problem by replacing the natural gradient step of stochastic variational inference with a trust-region update. We show that this leads to generally better results and reduced sensitivity to hyperparameters. We also describe a new strategy for variational inference on streaming data and show that here our trust-region method is crucial for good performance. Comment: in Proceedings of the 32nd International Conference on Machine Learning, 2015.
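    To make the contrast concrete, the sketch below runs mean-field SVI on a toy two-component Gaussian mixture with known unit variances. The standard natural-gradient step blends the current global natural parameters with a minibatch estimate; the trust-region-style variant instead iterates a KL-damped fixed point in which the minibatch estimate is recomputed at the candidate parameters. The toy model, the damping constant xi, and the fixed-point solver are illustrative assumptions and need not match the exact subproblem and solver used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two-component Gaussian mixture, unit variances, equal weights.
N = 2000
true_means = np.array([-2.0, 2.0])
x = true_means[rng.integers(0, 2, size=N)] + rng.normal(size=N)

PRIOR_PREC = 0.01            # mu_k ~ N(0, 1 / PRIOR_PREC)
K, BATCH = 2, 50

def responsibilities(xb, lam):
    """Local step: cluster responsibilities under q(mu_k) = N(m_k, 1/prec_k)."""
    prec, mean_times_prec = lam
    m, v = mean_times_prec / prec, 1.0 / prec
    logp = -0.5 * (xb[:, None] - m[None, :]) ** 2 - 0.5 * v[None, :]
    logp -= logp.max(axis=1, keepdims=True)
    r = np.exp(logp)
    return r / r.sum(axis=1, keepdims=True)

def intermediate_global(xb, lam):
    """Minibatch ('intermediate') natural parameters lambda-hat."""
    r = responsibilities(xb, lam)
    scale = N / len(xb)
    prec_hat = PRIOR_PREC + scale * r.sum(axis=0)
    mtp_hat = scale * (r * xb[:, None]).sum(axis=0)       # prior mean is 0
    return np.stack([prec_hat, mtp_hat])

def svi_step(xb, lam, rho):
    """Standard SVI: a single natural-gradient step with learning rate rho."""
    return (1 - rho) * lam + rho * intermediate_global(xb, lam)

def trust_region_step(xb, lam, xi, n_inner=20):
    """KL-damped fixed point: one trust-region-style update (illustrative sketch)."""
    lam_new = lam.copy()
    for _ in range(n_inner):
        lam_new = (intermediate_global(xb, lam_new) + xi * lam) / (1 + xi)
    return lam_new

lam = np.stack([np.full(K, PRIOR_PREC + 1.0), np.array([-0.1, 0.1])])
for t in range(200):
    xb = rng.choice(x, size=BATCH, replace=False)
    lam = trust_region_step(xb, lam, xi=10.0)
    # Standard SVI alternative: lam = svi_step(xb, lam, rho=(t + 1) ** -0.7)

print("estimated component means:", lam[1] / lam[0])
```

    Recomputing the responsibilities inside the inner loop is what distinguishes this update from a plain damped natural-gradient step: the local and global parameters are re-coupled at the candidate point, which is the mechanism the abstract credits with avoiding poor local optima.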