Accelerated Stochastic Quasi-Newton Optimization on Riemann Manifolds
We propose an L-BFGS optimization algorithm on Riemannian manifolds using
minibatched stochastic variance reduction techniques for fast convergence with
constant step sizes, without resorting to line-search methods designed to
satisfy Wolfe conditions. We provide a new convergence proof for strongly
convex functions without using curvature conditions on the manifold, as well as
a convergence discussion for nonconvex functions. We discuss a couple of ways
to obtain the correction pairs used to calculate the product of the gradient
with the inverse Hessian, and empirically demonstrate their use in synthetic
experiments on computation of Karcher means for symmetric positive definite
matrices and leading eigenvalues of large scale data matrices. We compare our
method to VR-PCA for the latter experiment, along with Riemannian SVRG for both
cases, and show strong convergence results for a range of datasets.
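As a concrete illustration of the variance-reduction idea behind this line of work, the following is a minimal sketch of a variance-reduced Riemannian gradient loop on the unit sphere for the leading-eigenvector problem mentioned above. It is not the paper's accelerated quasi-Newton method: the quasi-Newton correction is omitted, tangent-space projection stands in for vector transport, and all names (rsvrg_leading_eigvec, sphere_proj, eta, ...) are mine.

```python
import numpy as np

def sphere_proj(x, v):
    """Project v onto the tangent space of the unit sphere at x."""
    return v - (x @ v) * x

def retract(x, v):
    """Retraction on the sphere: step, then renormalize."""
    y = x + v
    return y / np.linalg.norm(y)

def rsvrg_leading_eigvec(X, x0, eta=0.01, epochs=10, inner=100, seed=0):
    """Variance-reduced Riemannian gradient descent on the unit sphere for
    the leading eigenvector of C = (1/n) X^T X, i.e. minimizing
    f(x) = -x^T C x.  Tangent-space projection stands in for the
    vector transport of snapshot quantities."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    x = x0 / np.linalg.norm(x0)
    for _ in range(epochs):
        x_snap = x.copy()
        # Full Riemannian gradient at the snapshot point.
        g_full = sphere_proj(x_snap, -2.0 * (X.T @ (X @ x_snap)) / n)
        for _ in range(inner):
            a = X[rng.integers(n)]
            g_cur = sphere_proj(x, -2.0 * (a @ x) * a)
            g_old = sphere_proj(x_snap, -2.0 * (a @ x_snap) * a)
            # Variance-reduced direction: transport the snapshot
            # correction to the tangent space at x by projection.
            v = g_cur - sphere_proj(x, g_old - g_full)
            x = retract(x, -eta * v)
    return x
```

For a data matrix X of shape (n, d), rsvrg_leading_eigvec(X, np.random.randn(d)) approximates the top eigenvector of the sample covariance matrix; the constant step size eta is the tuning knob, as in the constant step-size regime the abstract describes.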
Riemannian optimization on tensor products of Grassmann manifolds: Applications to generalized Rayleigh-quotients
We introduce a generalized Rayleigh-quotient on the tensor product of
Grassmannians enabling a unified approach to well-known optimization tasks from
different areas of numerical linear algebra, such as best low-rank
approximations of tensors (data compression), geometric measures of
entanglement (quantum computing) and subspace clustering (image processing). We
briefly discuss the geometry of the constraint set, compute the Riemannian
gradient of the generalized Rayleigh-quotient, characterize its critical
points, and prove that they are generically non-degenerate. Moreover, we derive
an explicit necessary condition for the non-degeneracy of the Hessian. Finally,
we present two intrinsic methods for optimizing the generalized
Rayleigh-quotient, a Newton-like method and a conjugate gradient method, and
compare our algorithms, tailored to the above-mentioned applications, with
established ones from the literature.
Comment: 29 pages, 8 figures, submitted
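For orientation, here is a minimal sketch of the classical single-Grassmannian case: Riemannian gradient ascent for the Rayleigh quotient f(U) = tr(U^T A U) on Gr(n, p), where projecting the Euclidean gradient onto the tangent space gives the Riemannian gradient and a QR factorization serves as the retraction. The paper's tensor-product setting and its Newton-like and conjugate gradient methods are more elaborate; this shows only the gradient ingredient, and the function names are mine.

```python
import numpy as np

def grassmann_grad_ascent(A, p, steps=200, eta=0.1, seed=0):
    """Riemannian gradient ascent for the Rayleigh quotient
    f(U) = tr(U^T A U) on the Grassmannian Gr(n, p), A symmetric.
    The Riemannian gradient is the projection of the Euclidean
    gradient 2 A U onto the tangent space at U; a QR factorization
    serves as the retraction."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    U, _ = np.linalg.qr(rng.standard_normal((n, p)))
    for _ in range(steps):
        AU = A @ U
        G = 2.0 * (AU - U @ (U.T @ AU))   # grad f(U) = 2 (I - U U^T) A U
        U, _ = np.linalg.qr(U + eta * G)  # retract back onto the manifold
    return U  # basis of an (approximately) dominant p-dimensional subspace
```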
Stochastic Quasi-Newton Langevin Monte Carlo
Recently, Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods have
been proposed for scaling up Monte Carlo computations to large data problems.
Whilst these approaches have proven useful in many applications, vanilla
SG-MCMC might suffer from poor mixing rates when random variables exhibit
strong couplings under the target densities or large differences in scale. In this
study, we propose a novel SG-MCMC method that takes the local geometry into
account by using ideas from Quasi-Newton optimization methods. These second
order methods directly approximate the inverse Hessian by using a limited
history of samples and their gradients. Our method uses dense approximations of
the inverse Hessian while keeping the time and memory complexities linear with
the dimension of the problem. We provide a formal theoretical analysis where we
show that the proposed method is asymptotically unbiased and consistent with
the posterior expectations. We illustrate the effectiveness of the approach on
both synthetic and real datasets. Our experiments on two challenging
applications show that our method achieves fast convergence rates similar to
Riemannian approaches while at the same time having low computational
requirements similar to diagonal preconditioning approaches.
Comment: Published in ICML 2016, International Conference on Machine Learning 2016, New York, NY, USA
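To make the Langevin side concrete, below is a minimal sketch of one preconditioned SGLD step with a fixed preconditioner H, assuming a user-supplied stochastic gradient of the log-posterior. The paper's method instead builds a dense inverse-Hessian approximation from L-BFGS correction pairs at linear cost; moreover, a state-dependent H would require an additional correction term, which this sketch avoids by holding H constant. Names are mine.

```python
import numpy as np

def precond_sgld_step(theta, stoch_grad_logpost, H, eps, rng):
    """One preconditioned SGLD step with a fixed preconditioner H:
        theta <- theta + eps * H g + sqrt(2 eps) * L z,
    where g is a stochastic gradient of the log-posterior, H = L L^T,
    and z ~ N(0, I), so the injected noise has covariance 2 eps H.
    A theta-dependent H (as with an L-BFGS approximation) would need
    an extra correction term, omitted here since H is constant."""
    g = stoch_grad_logpost(theta)
    L = np.linalg.cholesky(H)
    z = rng.standard_normal(theta.shape)
    return theta + eps * (H @ g) + np.sqrt(2.0 * eps) * (L @ z)
```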
Conic geometric optimisation on the manifold of positive definite matrices
We develop \emph{geometric optimisation} on the manifold of Hermitian
positive definite (HPD) matrices. In particular, we consider optimising two
types of cost functions: (i) geodesically convex (g-convex); and (ii)
log-nonexpansive (LN). G-convex functions are nonconvex in the usual Euclidean
sense, but convex along the manifold and thus allow global optimisation. LN
functions may fail to be even g-convex, but still remain globally optimisable
due to their special structure. We develop theoretical tools to recognise and
generate g-convex functions as well as cone-theoretic fixed-point optimisation
algorithms. We illustrate our techniques by applying them to maximum-likelihood
parameter estimation for elliptically contoured distributions (a rich class
that substantially generalises the multivariate normal distribution). We
compare our fixed-point algorithms with sophisticated manifold optimisation
methods and obtain notable speedups.
Comment: 27 pages; updated version with simplified presentation; 7 figures
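As an example of the fixed-point flavour of such algorithms, the following sketches the classical fixed-point iteration for the scatter-matrix MLE of a centred multivariate t distribution, one member of the elliptically contoured family. The paper's algorithms and their convergence theory are considerably more general; this is only an illustrative special case, with names of my choosing.

```python
import numpy as np

def t_scatter_fixed_point(X, nu=3.0, iters=100):
    """Fixed-point iteration for the scatter-matrix MLE of a centred
    multivariate t distribution with nu degrees of freedom:
        Sigma <- (1/n) sum_i w_i x_i x_i^T,
        w_i   =  (d + nu) / (nu + x_i^T Sigma^{-1} x_i)."""
    n, d = X.shape
    Sigma = np.cov(X, rowvar=False)  # sample covariance as starting point
    for _ in range(iters):
        q = np.einsum('ij,jk,ik->i', X, np.linalg.inv(Sigma), X)
        w = (d + nu) / (nu + q)
        Sigma = (X.T * w) @ X / n    # scales each row x_i by w_i
    return Sigma
```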
Quasi-Newton methods on Grassmannians and multilinear approximations of tensors
In this paper we propose quasi-Newton and limited memory quasi-Newton
methods for objective functions defined on Grassmannians or a product of
Grassmannians. Specifically, we define BFGS and L-BFGS updates in local and
global coordinates on Grassmannians or a product of these. We prove that, when
local coordinates are used, our BFGS updates on Grassmannians share the same
optimality property as the usual BFGS updates on Euclidean spaces. When applied
to the best multilinear rank approximation problem for general and symmetric
tensors, our approach yields fast, robust, and accurate algorithms that exploit
the special Grassmannian structure of the respective problems, and which work
on tensors of large dimensions and arbitrarily high order. Extensive numerical
experiments are included to substantiate our claims.
Comment: 42 pages; 11 figures
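For comparison, a common baseline for the same best multilinear rank approximation problem is higher-order orthogonal iteration (HOOI), which alternately refines one orthonormal factor at a time; a minimal numpy sketch follows. This is not the paper's quasi-Newton method, and the helper names (unfold, hooi) are mine.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move the given mode to the front and flatten."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hooi(T, ranks, iters=20):
    """Best multilinear rank approximation of a tensor T by higher-order
    orthogonal iteration: initialize the factors by truncated HOSVD,
    then cyclically refine one orthonormal factor at a time."""
    U = [np.linalg.svd(unfold(T, m), full_matrices=False)[0][:, :r]
         for m, r in enumerate(ranks)]
    for _ in range(iters):
        for m in range(T.ndim):
            # Contract T with all factors except the one in mode m.
            G = T
            for k in range(T.ndim):
                if k != m:
                    G = np.moveaxis(np.tensordot(U[k].T, G, axes=(1, k)), 0, k)
            U[m] = np.linalg.svd(unfold(G, m),
                                 full_matrices=False)[0][:, :ranks[m]]
    return U  # orthonormal factors spanning the optimal subspaces
```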
Suitable Spaces for Shape Optimization
The differential-geometric structure of certain shape spaces is investigated
and applied to the theory of shape optimization problems constrained by partial
differential equations and variational inequalities. Furthermore, we define a
diffeological structure on a new space of so-called $H^{1/2}$-shapes. This can
be seen as a first step towards the formulation of optimization techniques on
diffeological spaces. The $H^{1/2}$-shapes are a generalization of smooth
shapes and arise naturally in shape optimization problems.
Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds
We study optimization of finite sums of geodesically smooth functions on
Riemannian manifolds. Although variance reduction techniques for optimizing
finite-sums have witnessed tremendous attention in recent years, existing
work is limited to vector space problems. We introduce Riemannian SVRG (RSVRG),
a new variance reduced Riemannian optimization method. We analyze RSVRG for
both geodesically convex and nonconvex (smooth) functions. Our analysis reveals
that RSVRG inherits advantages of the usual SVRG method, but with factors
depending on curvature of the manifold that influence its convergence. To our
knowledge, RSVRG is the first provably fast stochastic Riemannian method.
Moreover, our paper presents the first non-asymptotic complexity analysis
(novel even for the batch setting) for nonconvex Riemannian optimization. Our
results have several implications; for instance, they offer a Riemannian
perspective on variance reduced PCA, which promises a short, transparent
convergence analysis.
Comment: This is the final version that appeared in NIPS 2016. Our proof of Lemma 2 was incorrect in the previous arXiv version. (9 pages paper + 6 pages appendix)
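In the notation standard for this setting, with Exp the exponential map and Γ parallel transport from the snapshot point x̃ to the current iterate, the variance-reduced inner-loop update described above can be written as follows; this is a sketch of the update rule only, not of the paper's epoch structure or step-size analysis.

```latex
% One inner-loop RSVRG update at iterate x_t with snapshot \tilde{x}:
x_{t+1} \;=\; \operatorname{Exp}_{x_t}\!\Bigl(
    -\eta \bigl[\operatorname{grad} f_{i_t}(x_t)
    \;-\; \Gamma_{\tilde{x}}^{\,x_t}\bigl(\operatorname{grad} f_{i_t}(\tilde{x})
    \;-\; \operatorname{grad} f(\tilde{x})\bigr)\bigr]\Bigr)
% Exp: exponential map; \Gamma: parallel transport from \tilde{x} to x_t.
```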
A Globally and Quadratically Convergent Algorithm with Efficient Implementation for Unconstrained Optimization
In this paper, an efficient modified Newton-type algorithm is proposed for
nonlinear unconstrained optimization problems. The modified Hessian is a convex
combination of the identity matrix (for steepest descent algorithm) and the
Hessian matrix (for Newton algorithm). The coefficients of the convex
combination are dynamically chosen in every iteration. The algorithm is proved
to be globally and quadratically convergent for (convex and nonconvex)
nonlinear functions. Efficient implementation is described. Numerical tests on
the widely used CUTE test problems are conducted for the new algorithm. The test
results are compared with those obtained by MATLAB optimization toolbox
function {\tt fminunc}. The test results are also compared with those obtained
by some established and state-of-the-art algorithms, such as a limited memory
BFGS, a descent and conjugate gradient algorithm, and a limited memory and
descent conjugate gradient algorithm. The comparisons show that the new
algorithm is promising.
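A minimal sketch of the step computation such an algorithm needs is given below: form B = lam*H + (1-lam)*I and solve B d = -g via Cholesky, shrinking lam when B fails to be positive definite. The shrink-until-Cholesky rule here is a placeholder of mine; the paper chooses the convex-combination coefficients dynamically by its own criterion.

```python
import numpy as np

def modified_newton_step(grad, hess, lam=0.9, shrink=0.5):
    """Search direction from the modified Hessian
        B = lam * hess + (1 - lam) * I,
    shrinking lam geometrically until B is positive definite
    (detected by a successful Cholesky factorization), then
    solving B d = -grad with the Cholesky factor."""
    n = grad.shape[0]
    while True:
        B = lam * hess + (1.0 - lam) * np.eye(n)
        try:
            L = np.linalg.cholesky(B)
            break
        except np.linalg.LinAlgError:
            lam *= shrink  # lean further toward steepest descent
    y = np.linalg.solve(L, -grad)
    return np.linalg.solve(L.T, y)
```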
Efficient PDE constrained shape optimization based on Steklov-Poincaré type metrics
Recent progress in PDE constrained optimization on shape manifolds is based
on the Hadamard form of shape derivatives, i.e., in the form of integrals at
the boundary of the shape under investigation, as well as on intrinsic shape
metrics. From a numerical point of view, domain integral forms of shape
derivatives seem promising; these instead require an outer metric on the domain
surrounding the shape boundary. This paper aims to harmonize both points of
view by employing a Steklov-Poincaré type intrinsic metric, which is derived
from an outer metric. Based on this metric, efficient shape optimization
algorithms are proposed, which also reduce the analytical labor so far
involved in the derivation of shape derivatives.
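Schematically (in my notation, not the paper's), the construction pairs the domain integral form of the shape derivative dJ(Ω)[V] with an outer metric induced by a bilinear form a(·,·) on the surrounding domain: a gradient representative U is obtained by solving

```latex
% Gradient representative U of the shape derivative with respect to an
% outer metric induced by a bilinear form a(.,.) on the hold-all domain:
a(U, V) \;=\; dJ(\Omega)[V]
    \qquad \text{for all admissible displacement fields } V
```

and its trace on the shape boundary is the descent direction measured in the induced Steklov-Poincaré type metric, so no boundary (Hadamard) form of the derivative ever needs to be assembled.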
Asynchronous Stochastic Quasi-Newton MCMC for Non-Convex Optimization
Recent studies have illustrated that stochastic gradient Markov Chain Monte
Carlo techniques have strong potential in non-convex optimization, where
local and global convergence guarantees can be shown under certain conditions.
By building up on this recent theory, in this study, we develop an
asynchronous-parallel stochastic L-BFGS algorithm for non-convex optimization.
The proposed algorithm is suitable for both distributed and shared-memory
settings. We provide formal theoretical analysis and show that the proposed
method achieves an ergodic convergence rate of $\mathcal{O}(1/\sqrt{N})$ ($N$
being the total number of iterations) and it can achieve a linear speedup under
certain conditions. We perform several experiments on both synthetic and real
datasets. The results support our theory and show that the proposed algorithm
provides a significant speedup over the recently proposed synchronous
distributed L-BFGS algorithm.
Comment: Published in the International Conference on Machine Learning (ICML 2018)
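The workhorse inside any such L-BFGS-based method is the two-loop recursion, which applies the inverse-Hessian approximation built from correction pairs (s_k, y_k) to a (stochastic) gradient in O(Md) time for memory M. A minimal sketch follows (function name mine; the paper's asynchronous bookkeeping around it is omitted).

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Two-loop recursion: multiply grad by the L-BFGS inverse-Hessian
    approximation built from correction pairs (s_k, y_k), oldest first,
    in O(M d) time for memory M."""
    q = np.array(grad, dtype=float)
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        q -= a * y
        alphas.append((rho, a))
    if s_list:
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)  # standard initial scaling H0
    for (rho, a), s, y in zip(reversed(alphas), s_list, y_list):
        b = rho * (y @ q)
        q += (a - b) * s
    return q
```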