On the Convergence of (Stochastic) Gradient Descent with Extrapolation for Non-Convex Optimization
Extrapolation is a well-known technique for solving convex optimization problems and
variational inequalities, and has recently attracted attention for non-convex
optimization. Several recent works have empirically demonstrated its success on
machine learning tasks. However, it has not been analyzed for non-convex
minimization, so a gap remains between theory and practice.
In this paper, we analyze gradient descent and stochastic gradient descent with
extrapolation for finding an approximate first-order stationary point of smooth
non-convex optimization problems. Our convergence upper bounds show that the
algorithms with extrapolation converge faster than their counterparts without extrapolation.
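The update scheme the abstract refers to can be sketched as an extragradient-style step: take a look-ahead (extrapolation) step, then update from the gradient at the look-ahead point. Below is a minimal illustration on a hypothetical smooth non-convex test function (the function, step sizes, and iteration count are assumptions for illustration, not the paper's setup).

```python
import numpy as np

def grad(x):
    # Gradient of the illustrative non-convex function f(x) = sum(x_i^2 + 0.5*sin(3*x_i))
    return 2.0 * x + 1.5 * np.cos(3.0 * x)

def extragradient_descent(x0, eta=0.05, gamma=0.05, iters=500):
    """Gradient descent with extrapolation: first a look-ahead step with
    step size gamma, then the actual update using the look-ahead gradient."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        y = x - gamma * grad(x)   # extrapolation (look-ahead) step
        x = x - eta * grad(y)     # update from the extrapolated point's gradient
    return x

x_final = extragradient_descent(np.array([2.0, -1.7, 0.9]))
print(np.linalg.norm(grad(x_final)))  # small gradient norm: approximate stationary point
```

Finding an approximate first-order stationary point here means driving the gradient norm below a tolerance; the paper's bounds concern how fast that norm shrinks.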
Efficient Statistics, in High Dimensions, from Truncated Samples
We provide an efficient algorithm for the classical problem, going back to
Galton, Pearson, and Fisher, of estimating, with arbitrary accuracy, the
parameters of a multivariate normal distribution from truncated samples.
Truncated samples from a d-variate normal means a sample is only revealed if it falls
in some subset S; otherwise the samples are hidden and
their count in proportion to the revealed samples is also hidden. We show that
the mean and covariance matrix can be
estimated with arbitrary accuracy in polynomial time, as long as we have oracle
access to S, and S has non-trivial measure under the unknown d-variate
normal distribution. Additionally, we show that without oracle access to S,
any non-trivial estimation is impossible.
Comment: to appear at the 59th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2018
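The estimation problem can be illustrated in one dimension: samples are revealed only when they land in the truncation set, and the likelihood of each revealed sample must be renormalized by the probability mass the normal assigns to that set. The sketch below is not the paper's algorithm; it is a minimal 1-D maximum-likelihood illustration using an off-the-shelf optimizer, with all numeric values (true parameters, truncation threshold, sample size) chosen arbitrarily for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical ground truth and truncation set S = [a, inf);
# the "membership oracle" for S is simply the test x >= a.
true_mu, true_sigma, a = 1.0, 2.0, 0.5

# Simulate truncation: draw from the normal, keep only samples falling in S.
raw = rng.normal(true_mu, true_sigma, size=20000)
samples = raw[raw >= a]

def neg_log_lik(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)  # parameterize by log(sigma) so sigma stays positive
    # Truncated-normal log-likelihood: normal log-density minus log P(X in S),
    # i.e. the density renormalized by the mass of the truncation set.
    log_mass = norm.logsf(a, loc=mu, scale=sigma)  # log P(X >= a)
    return -(norm.logpdf(samples, loc=mu, scale=sigma) - log_mass).sum()

res = minimize(neg_log_lik, x0=[0.0, 0.0], method="Nelder-Mead")
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print(mu_hat, sigma_hat)  # estimates close to the true (mu, sigma)
```

Note how the renormalization term uses the probability of S under the *candidate* parameters: this is exactly where oracle access to S enters, and why the abstract's impossibility result holds without it.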