
    Accelerated Stochastic Quasi-Newton Optimization on Riemann Manifolds

    Full text link
    We propose an L-BFGS optimization algorithm on Riemannian manifolds using minibatched stochastic variance reduction techniques for fast convergence with constant step sizes, without resorting to line-search methods designed to satisfy Wolfe conditions. We provide a new convergence proof for strongly convex functions without using curvature conditions on the manifold, as well as a convergence discussion for nonconvex functions. We discuss a couple of ways to obtain the correction pairs used to calculate the product of the gradient with the inverse Hessian, and empirically demonstrate their use in synthetic experiments on the computation of Karcher means for symmetric positive definite matrices and leading eigenvalues of large-scale data matrices. We compare our method to VR-PCA for the latter experiment, along with Riemannian SVRG for both cases, and show strong convergence results for a range of datasets.
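
    As a minimal point of reference for the Karcher-mean experiment, the sketch below runs plain Riemannian gradient descent for the Karcher mean of SPD matrices under the affine-invariant metric. It is not the paper's variance-reduced L-BFGS method; the function name, step size eta, and tolerance are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import expm, inv, logm, sqrtm

def karcher_mean_spd(mats, eta=0.5, tol=1e-8, max_iter=200):
    """Riemannian gradient descent for the Karcher mean of SPD matrices under
    the affine-invariant metric (a plain baseline, not the paper's method)."""
    X = np.mean(mats, axis=0)                  # Euclidean mean as starting point
    for _ in range(max_iter):
        Xs = np.real(sqrtm(X))                 # X^{1/2}
        Xs_inv = inv(Xs)                       # X^{-1/2}
        # Mean of the logarithmic maps: (1/n) * sum_i logm(X^{-1/2} A_i X^{-1/2})
        tangent = sum(np.real(logm(Xs_inv @ A @ Xs_inv)) for A in mats) / len(mats)
        if np.linalg.norm(tangent) < tol:
            break
        # Geodesic step: X <- X^{1/2} expm(eta * tangent) X^{1/2}
        X = Xs @ expm(eta * tangent) @ Xs
    return X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mats = []
    for _ in range(5):
        B = rng.standard_normal((4, 4))
        mats.append(B @ B.T + 4 * np.eye(4))   # random SPD test matrices
    print(karcher_mean_spd(np.array(mats)))
```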

    Riemannian optimization on tensor products of Grassmann manifolds: Applications to generalized Rayleigh-quotients

    Full text link
    We introduce a generalized Rayleigh quotient on the tensor product of Grassmannians, enabling a unified approach to well-known optimization tasks from different areas of numerical linear algebra, such as best low-rank approximations of tensors (data compression), geometric measures of entanglement (quantum computing) and subspace clustering (image processing). We briefly discuss the geometry of the constraint set, compute the Riemannian gradient of the generalized Rayleigh quotient, characterize its critical points and prove that they are generically non-degenerate. Moreover, we derive an explicit necessary condition for the non-degeneracy of the Hessian. Finally, we present two intrinsic methods for optimizing the generalized Rayleigh quotient - a Newton-like method and a conjugate gradient method - and compare our algorithms, tailored to the above-mentioned applications, with established ones from the literature. Comment: 29 pages, 8 figures, submitted.
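
    To make the geometry concrete, here is a sketch of the single-Grassmannian special case only: Riemannian gradient ascent on the classical Rayleigh quotient tr(X^T A X) with a QR retraction. It is not the authors' Newton-like or conjugate gradient method, and the function name, step size and iteration count are illustrative.

```python
import numpy as np

def rayleigh_grassmann_ascent(A, k, step=0.1, iters=500, seed=0):
    """Maximize tr(X^T A X) over n-by-k orthonormal frames (Grassmannian Gr(n, k))
    by Riemannian gradient ascent with a QR retraction."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    X, _ = np.linalg.qr(rng.standard_normal((n, k)))     # random starting frame
    for _ in range(iters):
        euclid_grad = 2.0 * A @ X
        # Project onto the horizontal space at X: (I - X X^T) * Euclidean gradient
        riem_grad = euclid_grad - X @ (X.T @ euclid_grad)
        # Step in the tangent direction, then retract via QR
        X, _ = np.linalg.qr(X + step * riem_grad)
    return X

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    M = rng.standard_normal((8, 8))
    A = M + M.T                                          # symmetric test matrix
    X = rayleigh_grassmann_ascent(A, k=2)
    print("Rayleigh quotient:", np.trace(X.T @ A @ X))
    print("Top-2 eigenvalue sum:", np.sort(np.linalg.eigvalsh(A))[-2:].sum())
```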

    Stochastic Quasi-Newton Langevin Monte Carlo

    Full text link
    Recently, Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods have been proposed for scaling up Monte Carlo computations to large data problems. Whilst these approaches have proven useful in many applications, vanilla SG-MCMC might suffer from poor mixing rates when random variables exhibit strong couplings under the target densities or large scale differences. In this study, we propose a novel SG-MCMC method that takes the local geometry into account by using ideas from quasi-Newton optimization methods. These second-order methods directly approximate the inverse Hessian by using a limited history of samples and their gradients. Our method uses dense approximations of the inverse Hessian while keeping the time and memory complexities linear with the dimension of the problem. We provide a formal theoretical analysis where we show that the proposed method is asymptotically unbiased and consistent with the posterior expectations. We illustrate the effectiveness of the approach on both synthetic and real datasets. Our experiments on two challenging applications show that our method achieves fast convergence rates similar to Riemannian approaches while at the same time having low computational requirements similar to diagonal preconditioning approaches. Comment: Published in ICML 2016, International Conference on Machine Learning 2016, New York, NY, USA.
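
    For orientation, one canonical vanilla SG-MCMC method is stochastic gradient Langevin dynamics (SGLD), sketched below without any preconditioning; it is not the paper's quasi-Newton sampler, and the toy Gaussian target, step size and minibatch size are illustrative assumptions.

```python
import numpy as np

def sgld(grad_log_post, theta0, n_iter=5000, step=1e-3, rng=None):
    """Vanilla SGLD: theta <- theta + (step/2) * stochastic gradient + N(0, step) noise.
    Un-preconditioned baseline, not the quasi-Newton sampler of the paper."""
    rng = rng or np.random.default_rng(0)
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    samples = []
    for _ in range(n_iter):
        g = grad_log_post(theta, rng)          # noisy gradient of the log posterior
        theta = theta + 0.5 * step * g + np.sqrt(step) * rng.standard_normal(theta.shape)
        samples.append(theta.copy())
    return np.array(samples)

if __name__ == "__main__":
    # Toy target: posterior over the mean of a unit-variance Gaussian, flat prior.
    rng = np.random.default_rng(42)
    data = rng.normal(loc=2.0, scale=1.0, size=1000)

    def grad_log_post(theta, rng, batch=50):
        idx = rng.integers(0, len(data), size=batch)
        # Unbiased minibatch estimate of sum_i (x_i - theta), rescaled by N / batch
        return (len(data) / batch) * np.sum(data[idx] - theta)

    samples = sgld(grad_log_post, theta0=0.0)
    print("Posterior mean estimate:", samples[1000:].mean())
```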

    Conic geometric optimisation on the manifold of positive definite matrices

    Full text link
    We develop \emph{geometric optimisation} on the manifold of Hermitian positive definite (HPD) matrices. In particular, we consider optimising two types of cost functions: (i) geodesically convex (g-convex); and (ii) log-nonexpansive (LN). G-convex functions are nonconvex in the usual Euclidean sense, but convex along the manifold and thus allow global optimisation. LN functions may fail to be even g-convex, but still remain globally optimisable due to their special structure. We develop theoretical tools to recognise and generate g-convex functions as well as cone-theoretic fixed-point optimisation algorithms. We illustrate our techniques by applying them to maximum-likelihood parameter estimation for elliptically contoured distributions (a rich class that substantially generalises the multivariate normal distribution). We compare our fixed-point algorithms with sophisticated manifold optimisation methods and obtain notable speedups. Comment: 27 pages; updated version with simplified presentation; 7 figures.
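
    As one concrete instance of a fixed-point scheme for elliptically contoured distributions, the sketch below runs the classical fixed-point iteration for the scatter matrix of a zero-mean multivariate Student-t distribution. It is a textbook iteration given for illustration and is not necessarily the algorithm developed in the paper; the function name and parameters are assumptions.

```python
import numpy as np

def student_t_scatter(X, nu=5.0, iters=100, tol=1e-9):
    """Fixed-point iteration for the MLE of the scatter matrix of a zero-mean
    multivariate Student-t distribution with nu degrees of freedom."""
    n, p = X.shape
    Sigma = np.cov(X, rowvar=False)                    # sample covariance as a start
    for _ in range(iters):
        # Weights w_i = (p + nu) / (nu + x_i^T Sigma^{-1} x_i)
        quad = np.einsum("ij,jk,ik->i", X, np.linalg.inv(Sigma), X)
        w = (p + nu) / (nu + quad)
        Sigma_new = (X * w[:, None]).T @ X / n
        if np.linalg.norm(Sigma_new - Sigma) < tol:
            return Sigma_new
        Sigma = Sigma_new
    return Sigma

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    true_scatter = np.array([[2.0, 0.7], [0.7, 1.0]])
    L = np.linalg.cholesky(true_scatter)
    # Sample a multivariate t: Gaussian draws divided by sqrt(chi^2_nu / nu)
    z = rng.standard_normal((5000, 2)) @ L.T
    u = rng.chisquare(5.0, size=5000) / 5.0
    X = z / np.sqrt(u)[:, None]
    print(student_t_scatter(X, nu=5.0))
```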

    Quasi-Newton methods on Grassmannians and multilinear approximations of tensors

    Full text link
    In this paper we propose quasi-Newton and limited memory quasi-Newton methods for objective functions defined on Grassmannians or a product of Grassmannians. Specifically, we define BFGS and L-BFGS updates in local and global coordinates on Grassmannians or a product of these. We prove that, when local coordinates are used, our BFGS updates on Grassmannians share the same optimality property as the usual BFGS updates on Euclidean spaces. When applied to the best multilinear rank approximation problem for general and symmetric tensors, our approach yields fast, robust, and accurate algorithms that exploit the special Grassmannian structure of the respective problems and work on tensors of large dimensions and arbitrarily high order. Extensive numerical experiments are included to substantiate our claims. Comment: 42 pages; 11 figures.
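
    For comparison, a standard baseline for the best multilinear rank approximation problem is higher-order orthogonal iteration (HOOI); the sketch below implements it for a 3-way tensor. It is not the quasi-Newton method of the paper, and the tensor sizes, rank tuple and helper names are illustrative.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding of a 3-way tensor."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_product(T, M, mode):
    """Multiply tensor T along the given mode by the matrix M."""
    return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)

def hooi(T, ranks, iters=30):
    """Higher-order orthogonal iteration for a best multilinear rank (r1, r2, r3)
    approximation of a 3-way tensor."""
    # Initialize factors with truncated SVDs of the unfoldings (HOSVD).
    U = [np.linalg.svd(unfold(T, n), full_matrices=False)[0][:, :r]
         for n, r in enumerate(ranks)]
    for _ in range(iters):
        for n in range(3):
            # Contract T with the other two factors, then update U[n] with the
            # leading left singular vectors of the resulting mode-n unfolding.
            Y = T
            for m in range(3):
                if m != n:
                    Y = mode_product(Y, U[m].T, m)
            U[n] = np.linalg.svd(unfold(Y, n), full_matrices=False)[0][:, :ranks[n]]
    # Core tensor and low-multilinear-rank reconstruction.
    G = T
    for m in range(3):
        G = mode_product(G, U[m].T, m)
    approx = G
    for m in range(3):
        approx = mode_product(approx, U[m], m)
    return U, G, approx

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    T = rng.standard_normal((10, 12, 14))
    U, G, approx = hooi(T, ranks=(3, 3, 3))
    print("Relative error:", np.linalg.norm(T - approx) / np.linalg.norm(T))
```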

    Suitable Spaces for Shape Optimization

    Full text link
    The differential-geometric structure of certain shape spaces is investigated and applied to the theory of shape optimization problems constrained by partial differential equations and variational inequalities. Furthermore, we define a diffeological structure on a new space of so-called H^{1/2}-shapes. This can be seen as a first step towards the formulation of optimization techniques on diffeological spaces. The H^{1/2}-shapes are a generalization of smooth shapes and arise naturally in shape optimization problems.

    Riemannian SVRG: Fast Stochastic Optimization on Riemannian Manifolds

    Full text link
    We study optimization of finite sums of geodesically smooth functions on Riemannian manifolds. Although variance reduction techniques for optimizing finite sums have witnessed tremendous attention in recent years, existing work is limited to vector space problems. We introduce Riemannian SVRG (RSVRG), a new variance-reduced Riemannian optimization method. We analyze RSVRG for both geodesically convex and nonconvex (smooth) functions. Our analysis reveals that RSVRG inherits advantages of the usual SVRG method, but with factors depending on the curvature of the manifold that influence its convergence. To our knowledge, RSVRG is the first provably fast stochastic Riemannian method. Moreover, our paper presents the first non-asymptotic complexity analysis (novel even for the batch setting) for nonconvex Riemannian optimization. Our results have several implications; for instance, they offer a Riemannian perspective on variance-reduced PCA, which promises a short, transparent convergence analysis. Comment: This is the final version that appeared in NIPS 2016. Our proof of Lemma 2 was incorrect in the previous arXiv version. (9 pages paper + 6 pages appendix.)
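
    RSVRG transfers the familiar SVRG template to manifolds. For reference, here is a minimal Euclidean SVRG loop for a finite-sum least-squares problem; the Riemannian ingredients (exponential map, parallel transport) are deliberately omitted, and the step size, epoch length and function names are illustrative assumptions.

```python
import numpy as np

def svrg(grad_i, full_grad, w0, n, step=0.01, epochs=30, inner=None, rng=None):
    """Euclidean SVRG: each epoch stores a snapshot w_snap and its full gradient mu,
    then takes inner steps with the variance-reduced direction
    grad_i(w) - grad_i(w_snap) + mu."""
    rng = rng or np.random.default_rng(0)
    inner = inner or n
    w = np.array(w0, dtype=float)
    for _ in range(epochs):
        w_snap = w.copy()
        mu = full_grad(w_snap)                 # full gradient at the snapshot
        for _ in range(inner):
            i = rng.integers(n)
            v = grad_i(w, i) - grad_i(w_snap, i) + mu
            w = w - step * v
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    n, d = 500, 10
    A = rng.standard_normal((n, d))
    b = A @ rng.standard_normal(d) + 0.01 * rng.standard_normal(n)

    grad_i = lambda w, i: (A[i] @ w - b[i]) * A[i]     # per-sample least-squares gradient
    full_grad = lambda w: A.T @ (A @ w - b) / n        # average gradient

    w = svrg(grad_i, full_grad, np.zeros(d), n)
    print("Residual norm:", np.linalg.norm(A @ w - b))
```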

    A Globally and Quadratically Convergent Algorithm with Efficient Implementation for Unconstrained Optimization

    Full text link
    In this paper, an efficient modified Newton-type algorithm is proposed for nonlinear unconstrained optimization problems. The modified Hessian is a convex combination of the identity matrix (as in the steepest descent algorithm) and the Hessian matrix (as in the Newton algorithm). The coefficients of the convex combination are chosen dynamically in every iteration. The algorithm is proved to be globally and quadratically convergent for (convex and nonconvex) nonlinear functions. An efficient implementation is described. Numerical tests on the widely used CUTE test problems are conducted for the new algorithm. The results are compared with those obtained by the MATLAB optimization toolbox function {\tt fminunc}, as well as with those obtained by some established and state-of-the-art algorithms, such as a limited memory BFGS, a descent and conjugate gradient algorithm, and a limited memory and descent conjugate gradient algorithm. The comparisons show that the new algorithm is promising.
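
    The core update can be pictured as forming M_k = theta_k * I + (1 - theta_k) * Hessian and taking the step -M_k^{-1} * gradient. The sketch below uses a simple heuristic theta schedule and a plain Armijo backtracking safeguard; both are illustrative stand-ins, not the paper's dynamic coefficient rule or its convergence mechanism.

```python
import numpy as np

def blended_newton(f, grad, hess, x0, iters=100, tol=1e-10):
    """Modified Newton iteration with M = theta * I + (1 - theta) * Hessian.
    The theta schedule and the Armijo backtracking are illustrative heuristics."""
    x = np.array(x0, dtype=float)
    theta = 0.5
    for _ in range(iters):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        H = hess(x)
        M = theta * np.eye(len(x)) + (1.0 - theta) * H
        # If the blend is not positive definite, push theta back toward steepest descent.
        while np.min(np.linalg.eigvalsh(M)) <= 1e-12:
            theta = 0.5 * (1.0 + theta)
            M = theta * np.eye(len(x)) + (1.0 - theta) * H
        d = np.linalg.solve(M, g)
        # Simple Armijo backtracking to guarantee descent.
        t = 1.0
        while f(x - t * d) > f(x) - 1e-4 * t * (g @ d) and t > 1e-12:
            t *= 0.5
        x = x - t * d
        theta = max(0.1 * theta, 1e-8)         # trust the Hessian more as iterations proceed
    return x

if __name__ == "__main__":
    # Rosenbrock test function.
    f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
    grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                               200 * (x[1] - x[0]**2)])
    hess = lambda x: np.array([[2 - 400 * (x[1] - 3 * x[0]**2), -400 * x[0]],
                               [-400 * x[0], 200.0]])
    print(blended_newton(f, grad, hess, np.array([-1.2, 1.0])))
```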

    Efficient PDE constrained shape optimization based on Steklov-Poincaré type metrics

    Full text link
    Recent progress in PDE constrained optimization on shape manifolds is based on the Hadamard form of shape derivatives, i.e., on integrals over the boundary of the shape under investigation, as well as on intrinsic shape metrics. From a numerical point of view, domain integral forms of shape derivatives seem promising; these instead require an outer metric on the domain surrounding the shape boundary. This paper aims to harmonize both points of view by employing a Steklov-Poincaré type intrinsic metric, which is derived from an outer metric. Based on this metric, efficient shape optimization algorithms are proposed, which also reduce the analytical labor involved so far in the derivation of shape derivatives.

    Asynchronous Stochastic Quasi-Newton MCMC for Non-Convex Optimization

    Full text link
    Recent studies have illustrated that stochastic gradient Markov Chain Monte Carlo techniques have strong potential in non-convex optimization, where local and global convergence guarantees can be shown under certain conditions. Building on this recent theory, in this study we develop an asynchronous-parallel stochastic L-BFGS algorithm for non-convex optimization. The proposed algorithm is suitable for both distributed and shared-memory settings. We provide a formal theoretical analysis and show that the proposed method achieves an ergodic convergence rate of {\cal O}(1/\sqrt{N}) (N being the total number of iterations) and can achieve a linear speedup under certain conditions. We perform several experiments on both synthetic and real datasets. The results support our theory and show that the proposed algorithm provides a significant speedup over the recently proposed synchronous distributed L-BFGS algorithm. Comment: Published in the International Conference on Machine Learning (ICML 2018).
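
    The shared building block of such stochastic L-BFGS methods is the two-loop recursion that applies the implicit inverse-Hessian approximation to a gradient. The sketch below shows it inside a plain serial, deterministic L-BFGS loop with Armijo backtracking; it is only a reference point, not the asynchronous stochastic MCMC sampler of the paper, and the function names and parameters are illustrative.

```python
import numpy as np
from collections import deque

def two_loop(grad, s_list, y_list):
    """L-BFGS two-loop recursion: apply the implicit inverse-Hessian approximation,
    built from (s, y) correction pairs, to the gradient vector."""
    q = grad.copy()
    cache = []
    for s, y in zip(reversed(s_list), reversed(y_list)):   # newest pair first
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        q -= a * y
        cache.append((a, rho, s, y))
    if s_list:
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)                 # initial scaling H_0 = gamma * I
    for a, rho, s, y in reversed(cache):       # oldest pair first
        b = rho * (y @ q)
        q += (a - b) * s
    return q

def lbfgs(f, grad, x0, mem=10, iters=200, tol=1e-8):
    """Plain serial, deterministic L-BFGS with Armijo backtracking."""
    x = np.array(x0, dtype=float)
    s_hist, y_hist = deque(maxlen=mem), deque(maxlen=mem)
    g = grad(x)
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        d = two_loop(g, list(s_hist), list(y_hist))
        t = 1.0
        while f(x - t * d) > f(x) - 1e-4 * t * (g @ d) and t > 1e-12:
            t *= 0.5
        x_new = x - t * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        if s @ y > 1e-10:                      # keep the pair only if curvature is positive
            s_hist.append(s)
            y_hist.append(y)
        x, g = x_new, g_new
    return x

if __name__ == "__main__":
    # Strongly convex quadratic: f(x) = 0.5 * x^T A x - b^T x.
    rng = np.random.default_rng(9)
    Q = rng.standard_normal((20, 20))
    A = Q @ Q.T + np.eye(20)
    b = rng.standard_normal(20)
    x = lbfgs(lambda x: 0.5 * x @ A @ x - b @ x, lambda x: A @ x - b, np.zeros(20))
    print("Gradient norm at solution:", np.linalg.norm(A @ x - b))
```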