
    Beyond scalar quasi-arithmetic means: Quasi-arithmetic averages and quasi-arithmetic mixtures in information geometry

    We generalize quasi-arithmetic means beyond scalars by considering the gradient map of a Legendre type real-valued function. The gradient map of a Legendre type function is proven strictly comonotone with a global inverse. It thus yields a generalization of the strictly monotone and differentiable functions generating scalar quasi-arithmetic means. Furthermore, the Legendre transformation gives rise to pairs of dual quasi-arithmetic averages via convex duality. We study the invariance and equivariance properties of quasi-arithmetic averages under affine transformations through the lens of the dually flat spaces of information geometry. We show how these quasi-arithmetic averages are used to express points on dual geodesics and sided barycenters in the dual affine coordinate systems. We then consider quasi-arithmetic mixtures and describe several parametric and non-parametric statistical models which are closed under the quasi-arithmetic mixture operation. Comment: 20 pages
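    The scalar case that this paper generalizes can be sketched in a few lines. Below is a minimal illustration (function names are ours, not the paper's): a strictly monotone generator f with inverse f_inv defines the quasi-arithmetic mean M_f(x) = f^{-1}(sum_i w_i f(x_i)); choosing f as the identity, log, or reciprocal recovers the arithmetic, geometric, and harmonic means.

```python
import math

def quasi_arithmetic_mean(f, f_inv, xs, ws=None):
    """Scalar quasi-arithmetic mean M_f(x) = f^{-1}(sum_i w_i f(x_i))
    for a strictly monotone generator f with inverse f_inv."""
    ws = ws or [1.0 / len(xs)] * len(xs)
    return f_inv(sum(w * f(x) for w, x in zip(ws, xs)))

# f(x) = x recovers the arithmetic mean, f(x) = log(x) the geometric mean,
# and f(x) = 1/x the harmonic mean.
arith = quasi_arithmetic_mean(lambda x: x, lambda y: y, [2.0, 8.0])
geom = quasi_arithmetic_mean(math.log, math.exp, [2.0, 8.0])
harm = quasi_arithmetic_mean(lambda x: 1.0 / x, lambda y: 1.0 / y, [2.0, 8.0])
```

    The paper's contribution is to replace the scalar generator f by the gradient map of a Legendre type function, which plays the same role for vector-valued arguments.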

    Adaptive Preconditioned Gradient Descent with Energy

    We propose an adaptive time step with energy for a large class of preconditioned gradient descent methods, mainly applied to constrained optimization problems. Our strategy relies on representing the usual descent direction as the product of an energy variable and a transformed gradient, with a preconditioning matrix chosen, for example, to reflect the natural gradient induced by the underlying metric in parameter space or to act as a projection operator when linear equality constraints are present. We present theoretical results on both unconditional stability and convergence rates for three respective classes of objective functions. In addition, our numerical results shed light on the excellent performance of the proposed method on several benchmark optimization problems. Comment: 32 pages, 3 figures
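    The energy-variable idea can be sketched in one dimension with the identity preconditioner, loosely following the AEGD-style update (the constants, step size, and function names below are illustrative assumptions, not the paper's exact scheme): the energy r is divided by a factor that is always at least 1, so r is non-increasing for any step size, which is the source of the unconditional stability.

```python
import math

def energy_gd(grad_f, f, x0, eta=0.1, c=1.0, iters=200):
    """Sketch of gradient descent with an auxiliary energy variable r
    (1-D, identity preconditioner). With v(x) = sqrt(f(x) + c), the
    transformed gradient is g = f'(x) / (2 v(x)); dividing r by
    (1 + 2*eta*g^2) >= 1 makes the energy unconditionally non-increasing."""
    x = x0
    r = math.sqrt(f(x) + c)                    # initial energy
    for _ in range(iters):
        g = grad_f(x) / (2.0 * math.sqrt(f(x) + c))
        r = r / (1.0 + 2.0 * eta * g * g)      # energy update, never increases
        x = x - 2.0 * eta * r * g              # descent step scaled by energy
    return x, r

# Minimize f(x) = x^2 from x0 = 2; the iterate approaches 0 while the
# energy stays positive and bounded by its initial value.
x_min, r_final = energy_gd(lambda x: 2.0 * x, lambda x: x * x, x0=2.0)
```

    A preconditioned variant would replace the scalar product by P g for a positive-definite matrix P (e.g. an approximate natural-gradient metric), which is the extension the paper analyzes.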

    A numerical approximation method for the Fisher-Rao distance between multivariate normal distributions

    We present a simple method to approximate Rao's distance between multivariate normal distributions, based on discretizing curves joining normal distributions and approximating Rao's distances between successive nearby normal distributions on the curves by the square root of the Jeffreys divergence, the symmetrized Kullback-Leibler divergence. We consider experimentally the linear interpolation curves in the ordinary, natural, and expectation parameterizations of the normal distributions, and compare these curves with a curve derived from Calvo and Oller's isometric embedding of the Fisher-Rao d-variate normal manifold into the cone of (d+1)×(d+1) symmetric positive-definite matrices [Journal of Multivariate Analysis 35.2 (1990): 223-242]. We report on our experiments and assess the quality of our approximation technique by comparing the numerical approximations with both lower and upper bounds. Finally, we present several information-geometric properties of Calvo and Oller's isometric embedding. Comment: 46 pages, 19 figures, 3 tables
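    The core approximation is easy to state in the univariate special case (the paper works with d-variate normals; this 1-D sketch with illustrative function names is only meant to show the mechanism): discretize a curve in parameter space and sum sqrt(Jeffreys divergence) over successive nearby normals, since the Jeffreys divergence between infinitesimally close distributions agrees with the squared Fisher-Rao line element.

```python
import math

def kl_normal(m1, s1, m2, s2):
    """KL(N(m1, s1^2) || N(m2, s2^2)) for univariate normals."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2.0 * s2 ** 2) - 0.5

def rao_distance_approx(m1, s1, m2, s2, n=1000):
    """Approximate the Fisher-Rao distance by discretizing the linear
    interpolation curve in (mu, sigma) and summing sqrt(Jeffreys divergence)
    between successive nearby normals along the curve."""
    total = 0.0
    for i in range(n):
        t0, t1 = i / n, (i + 1) / n
        a = (m1 + t0 * (m2 - m1), s1 + t0 * (s2 - s1))
        b = (m1 + t1 * (m2 - m1), s1 + t1 * (s2 - s1))
        jeffreys = kl_normal(*a, *b) + kl_normal(*b, *a)
        total += math.sqrt(jeffreys)
    return total

# For a fixed mean, the vertical line in (mu, sigma) is a Fisher-Rao
# geodesic with exact length sqrt(2) * |log(s2/s1)|, which the
# discretized sum should recover.
d = rao_distance_approx(0.0, 1.0, 0.0, 2.0)
```

    In the multivariate case the same summation is applied along interpolation curves in the various parameterizations, including the curve pulled back from the Calvo-Oller embedding.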

    New Directions for Contact Integrators

    Contact integrators are a family of geometric numerical schemes which guarantee the conservation of the contact structure. In this work we review the construction of both the variational and Hamiltonian versions of these methods. We illustrate some of the advantages of geometric integration in the dissipative setting by focusing on models inspired by recent studies in celestial mechanics and cosmology. Comment: To appear as Chapter 24 in GSI 2021, Springer LNCS 1282
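    To give a flavor of geometric integration in the dissipative setting, here is a minimal first-order scheme for a damped harmonic oscillator, the prototypical contact Hamiltonian system; this is only a sketch in the spirit of such integrators (the schemes constructed in the chapter may differ), with the friction term treated implicitly so the damping is built into the update rather than bolted on.

```python
def damped_oscillator_step(q, p, h, alpha):
    """One step of a first-order integrator for qdot = p,
    pdot = -q - alpha*p (damped harmonic oscillator, a contact
    Hamiltonian system). The friction term is handled implicitly."""
    p_new = (p - h * q) / (1.0 + h * alpha)   # implicit in the damping
    q_new = q + h * p_new
    return q_new, p_new

# Integrate to t = 10 with damping alpha = 0.1; the mechanical energy
# should decay as in the continuous system rather than blow up.
q, p, h, alpha = 1.0, 0.0, 0.01, 0.1
e0 = 0.5 * (p * p + q * q)
for _ in range(1000):
    q, p = damped_oscillator_step(q, p, h, alpha)
e1 = 0.5 * (p * p + q * q)
```

    The point of the contact-geometric construction is that this qualitative behavior, monotone dissipation consistent with the contact structure, is guaranteed by design rather than checked after the fact.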

    Parameterized Wasserstein Hamiltonian Flow

    In this work, we propose a numerical method to compute the Wasserstein Hamiltonian flow (WHF), a Hamiltonian system on the probability density manifold. Many well-known PDE systems can be reformulated as WHFs. We use a parameterized function as a push-forward map to characterize the solution of the WHF, converting the PDE into a finite-dimensional ODE system, which is a Hamiltonian system in the phase space of the parameter manifold. We establish error analysis results for the continuous-time approximation scheme in the Wasserstein metric. For the numerical implementation, we use neural networks as push-forward maps. We apply an effective symplectic scheme to solve the derived Hamiltonian ODE system so that the method preserves important quantities such as the total energy. The computation is done by a fully deterministic symplectic integrator without any neural network training. Thus, our method does not involve direct optimization over network parameters and hence avoids the error introduced by stochastic gradient descent (SGD) methods, which is usually hard to quantify and measure. The proposed algorithm is a sampling-based approach that scales well to higher-dimensional problems. In addition, the method also provides an alternative connection between the Lagrangian and Eulerian perspectives on the original WHF through the parameterized ODE dynamics. Comment: We welcome any comments and suggestions
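    The last ingredient, a symplectic integrator for the Hamiltonian ODE on parameters, can be illustrated with the Stormer-Verlet (leapfrog) scheme on a 1-D toy Hamiltonian; this is a generic stand-in, not the paper's specific scheme or its neural-network parameterization. The characteristic property to observe is that the discrete energy stays close to its initial value over long time horizons instead of drifting.

```python
def leapfrog(theta, p, grad_U, h, steps):
    """Stormer-Verlet (leapfrog) integration of H(theta, p) = p^2/2 + U(theta).
    Being symplectic, it keeps the discrete energy near its initial value
    over long integration times."""
    p = p - 0.5 * h * grad_U(theta)       # initial half kick
    for _ in range(steps - 1):
        theta = theta + h * p             # drift
        p = p - h * grad_U(theta)         # full kick
    theta = theta + h * p                 # final drift
    p = p - 0.5 * h * grad_U(theta)       # final half kick
    return theta, p

# Harmonic potential U(theta) = theta^2 / 2 as a toy Hamiltonian system.
grad_U = lambda t: t
theta0, p0 = 1.0, 0.0
h0 = 0.5 * (p0 * p0 + theta0 * theta0)
theta, p = leapfrog(theta0, p0, grad_U, h=0.01, steps=10000)
h1 = 0.5 * (p * p + theta * theta)
```

    A non-symplectic method of the same order (e.g. explicit Euler) would show a systematic energy drift on the same problem, which is why the choice of integrator matters for preserving the WHF's conserved quantities.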

    On the convergence of orthogonalization-free conjugate gradient method for extreme eigenvalues of Hermitian matrices: a Riemannian optimization interpretation

    In many applications, it is desirable to obtain extreme eigenvalues and eigenvectors of large Hermitian matrices by efficient and compact algorithms. In particular, orthogonalization-free methods are preferred for large-scale problems, since they find eigenspaces of extreme eigenvalues without explicitly computing orthogonal vectors in each iteration. For the top p eigenvalues, the simplest orthogonalization-free method is to find the best rank-p approximation to a positive semi-definite Hermitian matrix by algorithms solving the unconstrained Burer-Monteiro formulation. We show that the nonlinear conjugate gradient method for the unconstrained Burer-Monteiro formulation is equivalent to a Riemannian conjugate gradient method on a quotient manifold with the Bures-Wasserstein metric; thus its global convergence to a stationary point can be proven. Numerical tests suggest that it is efficient for computing the largest k eigenvalues of large-scale matrices if those eigenvalues are distributed nearly uniformly.
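    The unconstrained Burer-Monteiro formulation is concrete enough to sketch for the rank-1 case (p = 1) with plain gradient descent; the paper analyzes the nonlinear conjugate gradient variant and its Riemannian interpretation, so this toy example only illustrates the objective and the absence of any orthogonalization step. Minimizing f(x) = ||A - x x^T||_F^2 over unnormalized vectors x drives ||x||^2 to the largest eigenvalue of the PSD matrix A, with x a matching eigenvector.

```python
def matvec(A, x):
    """Dense matrix-vector product using plain lists."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def top_eig_burer_monteiro(A, x, eta=0.01, iters=2000):
    """Gradient descent on the rank-1 Burer-Monteiro objective
    f(x) = ||A - x x^T||_F^2 with grad f = 4 (x x^T - A) x
    = 4 (||x||^2 x - A x). No normalization or orthogonalization
    is performed at any iteration."""
    for _ in range(iters):
        s = sum(xi * xi for xi in x)               # ||x||^2
        Ax = matvec(A, x)
        x = [xi - 4.0 * eta * (s * xi - Axi) for xi, Axi in zip(x, Ax)]
    return x

# Diagonal PSD test matrix with eigenvalues 3, 2, 1; the iterate should
# align with the top eigenvector and satisfy ||x||^2 ~ 3.
A = [[3.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
x = top_eig_burer_monteiro(A, [1.0, 0.5, 0.3])
lam = sum(xi * xi for xi in x)   # estimate of the largest eigenvalue
```

    For the top p eigenvalues one would take X with p columns and the matrix objective ||A - X X^T||_F^2; the paper's point is that conjugate gradient on this objective is secretly Riemannian CG under the Bures-Wasserstein metric.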