Natural gradient flow in the mixture geometry of a discrete exponential family
In this paper, we study Amari's natural gradient flows of real functions defined on the densities belonging to an exponential family on a finite sample space. Our main example is the minimization of the expected value of a real function defined on the sample space. In such a case, the natural gradient flow converges to densities with reduced support that belong to the border of the exponential family. In previous work, we suggested using the natural gradient evaluated in the mixture geometry. Here, we show that in some cases, the differential equation can be extended to a bigger domain in such a way that the densities at the border of the exponential family are actually internal points in the extended problem. The extension is based on the algebraic concept of an exponential variety. We study a toy example in full detail and obtain positive partial results in the important case of a binary sample space.
Cramer-Rao Lower Bound and Information Geometry
This article focuses on an important piece of work by the world-renowned
Indian statistician Calyampudi Radhakrishna Rao. In 1945, C. R. Rao (then 25
years old) published a pathbreaking paper, which had a profound impact on
subsequent statistical research.
Comment: To appear in Connected at Infinity II: On the work of Indian
mathematicians (R. Bhatia and C.S. Rajan, Eds.), special volume of Texts and
Readings In Mathematics (TRIM), Hindustan Book Agency, 201
Nonparametric Information Geometry
The differential-geometric structure of the set of positive densities on a
given measure space has raised the interest of many mathematicians after the
discovery by C.R. Rao of the geometric meaning of the Fisher information. Most
of the research is focused on parametric statistical models. In a series of
papers by the author and coworkers, a particular version of the nonparametric
case has been discussed. It consists of a minimalistic structure modeled
according to the theory of exponential families: given a reference density,
other densities are represented by the centered log-likelihood, which is an
element of an Orlicz space. These mappings give a system of charts of a Banach
manifold. It has been observed that, while the construction is natural, the
practical applicability is limited by the technical difficulty of dealing with
such a class of Banach
spaces. It has been suggested recently to replace the exponential function with
other functions with similar behavior but polynomial growth at infinity in
order to obtain more tractable Banach spaces, e.g. Hilbert spaces. We first
give a review of our theory, with special emphasis on the specific issues of
the infinite dimensional setting. In a second part we discuss two specific
topics, differential equations and the metric connection. The position of this
line of research with respect to other approaches is briefly discussed.
Comment: Submitted for publication in the Proceedings of GSI2013 Aug 28-30
2013 Pari
Exponential families, Kahler geometry and quantum mechanics
Exponential families are a class of statistical manifolds that are
particularly important in statistical inference and appear very
frequently in statistics. For example, the set of normal distributions with
mean $\mu$ and standard deviation $\sigma$ forms a 2-dimensional exponential family.
In this paper, we show that the tangent bundle of an exponential family is
naturally a Kahler manifold. This simple but crucial observation leads to the
formalism of quantum mechanics in its geometrical form, i.e. based on the
Kahler structure of the complex projective space, but generalizes also to more
general Kahler manifolds, providing a natural geometric framework for the
description of quantum systems. Many questions related to this "statistical
Kahler geometry" are discussed, and a close connection with representation
theory is observed. Examples of physical relevance are treated in detail. For
example, it is shown that the spin of a particle can be entirely understood by
means of the usual binomial distribution. This paper centers on the
mathematical foundations of quantum mechanics, and on the question of its
potential generalization through its geometrical formulation.
Online Natural Gradient as a Kalman Filter
We cast Amari's natural gradient in statistical learning as a specific case
of Kalman filtering. Namely, applying an extended Kalman filter to estimate a
fixed unknown parameter of a probabilistic model from a series of observations
is rigorously equivalent to estimating this parameter via an online stochastic
natural gradient descent on the log-likelihood of the observations.
In the i.i.d. case, this relation is a consequence of the "information
filter" phrasing of the extended Kalman filter. In the recurrent (state space,
non-i.i.d.) case, we prove that the joint Kalman filter over states and
parameters is a natural gradient on top of real-time recurrent learning (RTRL),
a classical algorithm to train recurrent models.
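The i.i.d. correspondence can be checked by hand in the simplest linear-Gaussian case, where the extended Kalman filter reduces to the ordinary one. The sketch below is an illustration under assumptions not taken from the paper: the model is a Gaussian with unknown mean and known variance `sigma2`, the Kalman filter treats the mean as a static state with prior variance `p0`, and the natural gradient uses the learning rate 1/(n0 + t) with n0 = sigma2 / p0. All function and variable names are hypothetical.

```python
import random


def kalman_static_mean(ys, theta0, p0, sigma2):
    """Kalman filter for a constant scalar state observed with noise variance sigma2."""
    theta, p = theta0, p0
    estimates = []
    for y in ys:
        # Information-filter form: posterior precision = prior precision + 1/sigma2.
        p = 1.0 / (1.0 / p + 1.0 / sigma2)
        gain = p / sigma2
        theta = theta + gain * (y - theta)
        estimates.append(theta)
    return estimates


def online_natural_gradient_mean(ys, theta0, n0, sigma2):
    """Online natural gradient ascent on the Gaussian log-likelihood in the mean."""
    theta = theta0
    fisher = 1.0 / sigma2  # Fisher information of one observation about the mean
    estimates = []
    for t, y in enumerate(ys, start=1):
        grad = (y - theta) / sigma2  # gradient of log p(y | theta)
        rate = 1.0 / (n0 + t)        # assumed learning-rate schedule
        theta = theta + rate * grad / fisher
        estimates.append(theta)
    return estimates


random.seed(0)
sigma2 = 4.0
p0 = 10.0
ys = [random.gauss(1.5, sigma2 ** 0.5) for _ in range(200)]
kf = kalman_static_mean(ys, 0.0, p0, sigma2)
ng = online_natural_gradient_mean(ys, 0.0, sigma2 / p0, sigma2)
max_gap = max(abs(a - b) for a, b in zip(kf, ng))
print(max_gap)
```

In this toy setting the Kalman gain equals 1/(n0 + t), so the two recursions coincide up to floating-point rounding. The prior variance `p0` plays the role of an initialization of the Fisher information, and the 1/t decay plays the role of the learning rate.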
This exact algebraic correspondence provides relevant interpretations for
natural gradient hyperparameters such as learning rates or initialization and
regularization of the Fisher information matrix.
Comment: 3rd version: expanded intr