
    Faster First-Order Methods for Stochastic Non-Convex Optimization on Riemannian Manifolds

    SPIDER (Stochastic Path Integrated Differential EstimatoR) is an efficient gradient estimation technique developed for non-convex stochastic optimization. Although SPIDER-type methods have been shown to attain nearly optimal computational complexity bounds, they are limited to linear metric spaces. In this paper, we introduce the Riemannian SPIDER (R-SPIDER) method as a novel nonlinear-metric extension of SPIDER for efficient non-convex optimization on Riemannian manifolds. We prove that for finite-sum problems with $n$ components, R-SPIDER converges to an $\epsilon$-accuracy stationary point within $\mathcal{O}\big(\min\big(n+\frac{\sqrt{n}}{\epsilon^2},\frac{1}{\epsilon^3}\big)\big)$ stochastic gradient evaluations, which is sharper in magnitude than the prior Riemannian first-order methods. For online optimization, R-SPIDER is shown to converge with $\mathcal{O}\big(\frac{1}{\epsilon^3}\big)$ complexity, which is, to the best of our knowledge, the first non-asymptotic result for online Riemannian optimization. In particular, for gradient dominated functions, we further develop a variant of R-SPIDER and prove its linear convergence rate. Numerical results demonstrate the computational efficiency of the proposed methods.
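The recursive estimator described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the manifold is the unit sphere, the objective is the finite sum of f_i(x) = -0.5 (a_i . x)^2, a normalization retraction and tangent-space projection stand in for the exponential map and parallel transport, and the names `r_spider`, `rgrad`, `retract`, `project`, along with the step size, period, and batch size, are hypothetical.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def project(x, g):
    """Project g onto the tangent space of the unit sphere at x."""
    c = dot(x, g)
    return [gi - c * xi for gi, xi in zip(g, x)]

def retract(x, v):
    """Step from x along tangent vector v, then renormalize (a retraction)."""
    y = [xi + vi for xi, vi in zip(x, v)]
    s = norm(y)
    return [yi / s for yi in y]

def rgrad(data, idx, x):
    """Riemannian gradient of the batch average of f_i(x) = -0.5 * (a_i . x)^2."""
    g = [0.0] * len(x)
    for i in idx:
        c = dot(data[i], x)
        for j, aij in enumerate(data[i]):
            g[j] -= c * aij / len(idx)
    return project(x, g)

def r_spider(data, x0, eta=0.05, period=10, batch=4, iters=300, seed=0):
    """SPIDER-style recursion on the sphere (illustrative assumptions throughout)."""
    rng = random.Random(seed)
    n = len(data)
    x, x_prev, v = list(x0), None, None
    for k in range(iters):
        if k % period == 0:
            # Periodic reset: recompute the full Riemannian gradient.
            v = rgrad(data, range(n), x)
        else:
            idx = [rng.randrange(n) for _ in range(batch)]
            g_new = rgrad(data, idx, x)
            # Move the old batch gradient and old estimate into the tangent
            # space at x (projection used here in place of parallel transport).
            g_old = project(x, rgrad(data, idx, x_prev))
            v_old = project(x, v)
            v = [a - b + c for a, b, c in zip(g_new, g_old, v_old)]
        vn = norm(v)
        if vn < 1e-12:
            break
        x_prev = x
        # Normalized step of length eta along -v.
        x = retract(x, [-eta * vi / vn for vi in v])
    return x
```

Calling `r_spider(data, x0)` with `data` as a list of equal-length vectors and `x0` a unit vector returns a unit vector. Only one full gradient is computed per `period` iterations; the steps in between reuse the transported estimate and correct it with a small batch, which is where the savings in gradient evaluations come from.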

    Riemannian Optimization via Frank-Wolfe Methods

    We study projection-free methods for constrained Riemannian optimization. In particular, we propose the Riemannian Frank-Wolfe (RFW) method. We analyze non-asymptotic convergence rates of RFW to an optimum for (geodesically) convex problems, and to a critical point for nonconvex objectives. We also present a practical setting under which RFW can attain a linear convergence rate. As a concrete example, we specialize RFW to the manifold of positive definite matrices and apply it to two tasks: (i) computing the matrix geometric mean (Riemannian centroid); and (ii) computing the Bures-Wasserstein barycenter. Both tasks involve geodesically convex interval constraints, for which we show that the Riemannian "linear oracle" required by RFW admits a closed-form solution; this result may be of independent interest. We further specialize RFW to the special orthogonal group and show that here too, the Riemannian "linear oracle" can be solved in closed form. Here, we describe an application to the synchronization of data matrices (Procrustes problem). We complement our theoretical results with an empirical comparison of RFW against state-of-the-art Riemannian optimization methods and observe that RFW performs competitively on the task of computing Riemannian centroids.
    Comment: Under Review. Largely revised version, including an extended experimental section and an application to the special orthogonal group and the Procrustes problem.
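To make the Frank-Wolfe iteration with a closed-form oracle concrete, here is a toy sketch on the simplest positive definite case: 1x1 matrices, that is, positive real numbers with the metric dx/x. This scalar reduction, the function name `rfw_geometric_mean`, the step size 2/(k+2), and the harmonic-to-arithmetic-mean interval as used here are assumptions for illustration. It does not reproduce the paper's matrix oracle.

```python
import math

def rfw_geometric_mean(values, iters=10000):
    """Toy Riemannian Frank-Wolfe on positive reals (1x1 positive definite case)."""
    n = len(values)
    lower = n / sum(1.0 / a for a in values)  # harmonic mean
    upper = sum(values) / n                   # arithmetic mean
    x = upper                                 # feasible starting point
    for k in range(iters):
        # Sign of the Riemannian gradient of
        # phi(x) = (1/n) * sum_i log(x / a_i)^2.
        slope = sum(math.log(x / a) for a in values)
        # "Linear oracle" on the interval [lower, upper]: the minimizer of
        # slope * log(z / x) over z is always one of the two endpoints.
        z = lower if slope > 0 else upper
        s = 2.0 / (k + 2)
        # Move along the geodesic from x to z (geometric interpolation).
        x = x ** (1.0 - s) * z ** s
    return x
```

In this scalar setting the iterate stays inside the interval without any projection, since each step is a geodesic combination of two feasible points. The result approaches the ordinary geometric mean at the usual sublinear Frank-Wolfe rate.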