
    Dual Connections in Nonparametric Classical Information Geometry

    We construct an infinite-dimensional information manifold based on exponential Orlicz spaces without using the notion of exponential convergence. We then show that convex mixtures of probability densities lie on the same connected component of this manifold, and characterize the class of densities for which this mixture can be extended to an open segment containing the extreme points. For this class, we define an infinite-dimensional analogue of the mixture parallel transport and prove that it is dual to the exponential parallel transport with respect to the Fisher information. We also define $\alpha$-derivatives and prove that they are convex mixtures of the extremal $(\pm 1)$-derivatives.
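    For orientation, the finite-dimensional counterpart of this convex-mixture property is standard in Amari's theory (quoted here as background, not as the paper's infinite-dimensional construction): the $\alpha$-connections are affine combinations of the extremal exponential ($\alpha = 1$) and mixture ($\alpha = -1$) connections, mutually dual with respect to the Fisher information,

    \[
    \nabla^{(\alpha)} \;=\; \frac{1+\alpha}{2}\,\nabla^{(1)} \;+\; \frac{1-\alpha}{2}\,\nabla^{(-1)},
    \qquad
    g_{ij}(\theta) \;=\; \mathbb{E}_\theta\!\left[\partial_i \log p_\theta \;\partial_j \log p_\theta\right].
    \]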

    Quantum Statistical Manifolds

    Quantum information geometry studies families of quantum states by means of differential geometry. A new approach is followed, with the intention of facilitating the introduction of a more general theory in subsequent work. To this purpose, the emphasis is shifted from a manifold of strictly positive density matrices to a manifold of faithful quantum states on the C*-algebra of bounded linear operators. In addition, ideas from the parameter-free approach to information geometry are adopted. The underlying Hilbert space is assumed to be finite-dimensional. In this way technicalities are avoided, so that strong results are obtained which one can hope to prove later on in a more general context. Two different atlases are introduced: one in which it is straightforward to show that the quantum states form a Banach manifold, and one which is compatible with the inner product of Bogoliubov and which yields affine coordinates for the exponential connection. Comment: submitted to the proceedings of Entropy 201
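    For reference, the inner product of Bogoliubov (also known as the Kubo-Mori inner product) mentioned above has the standard form, quoted here as background, at a faithful state with density matrix $\rho$:

    \[
    \langle A, B \rangle_\rho \;=\; \int_0^1 \operatorname{Tr}\!\left( \rho^{s} A^{\dagger} \rho^{1-s} B \right) ds .
    \]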

    A Class of Non-Parametric Statistical Manifolds modelled on Sobolev Space

    We construct a family of non-parametric (infinite-dimensional) manifolds of finite measures on $\mathbb{R}^d$. The manifolds are modelled on a variety of weighted Sobolev spaces, including Hilbert-Sobolev spaces and mixed-norm spaces. Each supports the Fisher-Rao metric as a weak Riemannian metric. Densities are expressed in terms of a deformed exponential function having linear growth. Unusually for the Sobolev context, and as a consequence of its linear growth, the deformed exponential "lifts" to a nonlinear superposition (Nemytskii) operator that acts continuously on a particular class of mixed-norm model spaces, and on the fixed-norm space $W^{2,1}$; i.e., it maps each of these spaces continuously into itself. It also maps continuously between other fixed-norm spaces, with a loss of Lebesgue exponent that increases with the number of derivatives. Some of the results make essential use of a log-Sobolev embedding theorem. Each manifold contains a smoothly embedded submanifold of probability measures. Applications to the stochastic partial differential equations of nonlinear filtering (and hence to the Fokker-Planck equation) are outlined.
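    A minimal example of a deformed exponential with linear growth (an illustration only; the paper's specific choice may differ) is the $C^1$ function

    \[
    \psi(x) \;=\;
    \begin{cases}
    e^{x}, & x \le 0, \\
    1 + x, & x \ge 0,
    \end{cases}
    \]

    which agrees with the ordinary exponential on the negative half-line and is increasing and convex with bounded derivative; roughly, it is this bounded derivative that allows the superposition operator $u \mapsto \psi(u)$ to act continuously between Sobolev-type spaces.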

    Information geometric measurements of generalisation

    Neural networks can be regarded as statistical models, and can be analysed in a Bayesian framework. Generalisation is measured by the performance on independent test data drawn from the same distribution as the training data. Such performance can be quantified by the posterior average of the information divergence between the true and the model distributions. Averaging over the Bayesian posterior guarantees internal coherence; using information divergence guarantees invariance with respect to representation. The theory generalises the least mean squares theory for linear Gaussian models to general problems of statistical estimation. The main results are: (1) the ideal optimal estimate is always given by the average over the posterior; (2) the optimal estimate within a computational model is given by the projection of the ideal estimate onto the model. This incidentally shows that some currently popular methods dealing with hyperpriors are in general unnecessary and misleading. The extension of information divergence to positive normalisable measures reveals a remarkable relation between the $\delta$ dual affine geometry of statistical manifolds and the geometry of the dual pair of Banach spaces $L_\delta$ and $L_{\delta^*}$. It therefore offers conceptual simplification to information geometry. The general conclusion on the issue of evaluating neural network learning rules and other statistical inference methods is that such evaluations are only meaningful under three assumptions: the prior $P(p)$, describing the environment of all the problems; the divergence $D_\delta$, specifying the requirement of the task; and the model $Q$, specifying available computing resources.
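    Result (1) has a familiar concrete instance in the Kullback-Leibler case (a standard fact, given here as an illustration rather than the paper's general divergence): the estimate minimising the posterior-averaged divergence is the posterior mean,

    \[
    \hat{q} \;=\; \arg\min_{q} \; \mathbb{E}_{p \sim P(p \mid \mathrm{data})}\big[ D_{\mathrm{KL}}(p \,\|\, q) \big] \;=\; \mathbb{E}\big[ p \mid \mathrm{data} \big],
    \]

    since only the cross-entropy term $-\int \mathbb{E}[p] \log q$ depends on $q$, and it is minimised at $q = \mathbb{E}[p]$.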

    Manifolds of Differentiable Densities

    We develop a family of infinite-dimensional (non-parametric) manifolds of probability measures. The latter are defined on underlying Banach spaces, and have densities of class $C_b^k$ with respect to appropriate reference measures. The case $k=\infty$, in which the manifolds are modelled on Fréchet spaces, is included. The manifolds admit the Fisher-Rao metric and, unusually for the non-parametric setting, Amari's $\alpha$-covariant derivatives for all $\alpha\in\mathbb{R}$. By construction, they are $C^\infty$-embedded submanifolds of particular manifolds of finite measures. The statistical manifolds are dually ($\alpha=\pm 1$) flat, and admit mixture and exponential representations as charts. Their curvatures with respect to the $\alpha$-covariant derivatives are derived. The likelihood function associated with a finite sample is a continuous function on each of the manifolds, and the $\alpha$-divergences are of class $C^\infty$.
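    For reference, Amari's $\alpha$-divergence in its standard form (quoted as background; the paper constructs its non-parametric version) is

    \[
    D^{(\alpha)}(p \,\|\, q) \;=\; \frac{4}{1-\alpha^{2}} \left( 1 - \int p^{\frac{1-\alpha}{2}}\, q^{\frac{1+\alpha}{2}} \, d\mu \right), \qquad \alpha \neq \pm 1,
    \]

    with the limits $\alpha \to \pm 1$ recovering the Kullback-Leibler divergence in its two orientations.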

    Long-range interactions, doubling measures and Tsallis entropy

    We present a path toward determining the statistical origin of the thermodynamic limit for systems with long-range interactions. We assume throughout that the systems under consideration have thermodynamic properties given by the Tsallis entropy. We rely on the composition property of the Tsallis entropy for determining effective metrics and measures on their configuration/phase spaces. We point out the significance of Muckenhoupt weights, of doubling measures, and of the deformations of the metric that doubling measures induce. We comment on the volume deformations induced by the Tsallis entropy composition and on the significance of functional spaces for these constructions. Comment: 26 pages, no figures, standard LaTeX. Revised version: addition of a paragraph on a contentious issue (Sect. 3). To be published by Eur. Phys. J.
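    The two standard definitions at work here (quoted for orientation) are the Tsallis composition rule for independent subsystems $A$ and $B$, and the doubling condition on a Borel measure $\mu$ over a metric space:

    \[
    S_q(A+B) \;=\; S_q(A) + S_q(B) + (1-q)\, S_q(A)\, S_q(B),
    \qquad
    \mu\big(B(x,2r)\big) \;\le\; C\, \mu\big(B(x,r)\big) .
    \]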

    Entropies from coarse-graining: convex polytopes vs. ellipsoids

    We examine the Boltzmann/Gibbs/Shannon $\mathcal{S}_{BGS}$, the non-additive Havrda-Charvát/Daróczy/Cressie-Read/Tsallis $\mathcal{S}_q$, and the Kaniadakis $\kappa$-entropy $\mathcal{S}_\kappa$ from the viewpoint of coarse-graining, symplectic capacities and convexity. We argue that the functional form of such entropies can be ascribed to a discordance in phase-space coarse-graining between two generally different approaches: the Euclidean/Riemannian metric one, which reflects independence and picks cubes as the fundamental cells, and the symplectic/canonical one, which picks spheres/ellipsoids for this role. Our discussion is motivated by, and confined to, the behaviour of Hamiltonian systems of many degrees of freedom. We see that Dvoretzky's theorem provides asymptotic estimates for the minimal dimension beyond which these two approaches are close to each other. We state and speculate about the role that dualities may play in this viewpoint. Comment: 63 pages, no figures, standard LaTeX.
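    In standard discrete form (quoted for orientation; normalisation conventions vary), the three entropies compared are

    \[
    \mathcal{S}_{BGS} = -\sum_i p_i \ln p_i, \qquad
    \mathcal{S}_q = \frac{1 - \sum_i p_i^{\,q}}{q-1}, \qquad
    \mathcal{S}_\kappa = -\sum_i p_i \ln_\kappa p_i, \quad
    \ln_\kappa x = \frac{x^{\kappa} - x^{-\kappa}}{2\kappa},
    \]

    each reducing to $\mathcal{S}_{BGS}$ in the limit $q \to 1$ or $\kappa \to 0$.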

    A Geometric Variational Approach to Bayesian Inference

    We propose a novel Riemannian geometric framework for variational inference in Bayesian models based on the nonparametric Fisher-Rao metric on the manifold of probability density functions. Under the square-root density representation, the manifold can be identified with the positive orthant of the unit hypersphere in $L^2$, and the Fisher-Rao metric reduces to the standard $L^2$ metric. Exploiting this Riemannian structure, we formulate the task of approximating the posterior distribution as a variational problem on the hypersphere based on the $\alpha$-divergence. This provides a tighter lower bound on the marginal distribution than approaches based on the Kullback-Leibler divergence, together with a corresponding upper bound that such approaches do not provide. We propose a novel gradient-based algorithm for the variational problem based on Fréchet derivative operators motivated by the geometry of the Hilbert sphere, and examine its properties. Through simulations and real-data applications, we demonstrate the utility of the proposed geometric framework and algorithm on several Bayesian models.
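    The square-root identification is easy to see numerically. A minimal sketch (assuming densities discretized on a uniform grid; an illustration, not the paper's algorithm): mapping $p \mapsto \sqrt{p}$ sends each density to a unit vector, and the Fisher-Rao distance becomes the great-circle distance on the sphere.

        import numpy as np

        def sqrt_rep(p, dx):
            """Map a discretized density p (uniform grid, spacing dx) to its
            square-root representation, a unit vector in discretized L2."""
            q = np.sqrt(p)
            return q / np.sqrt(np.sum(q**2) * dx)  # renormalize against discretization error

        def fisher_rao_distance(p1, p2, dx):
            """Great-circle (arc-length) distance on the Hilbert sphere, which
            the Fisher-Rao metric induces under the square-root map."""
            q1, q2 = sqrt_rep(p1, dx), sqrt_rep(p2, dx)
            inner = np.clip(np.sum(q1 * q2) * dx, -1.0, 1.0)
            return np.arccos(inner)

        # Example: distance between two Gaussian densities on a grid.
        x = np.linspace(-10, 10, 2001)
        dx = x[1] - x[0]
        gauss = lambda m, s: np.exp(-(x - m)**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))
        print(fisher_rao_distance(gauss(0.0, 1.0), gauss(1.0, 1.0), dx))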

    Quasi-Arithmetic Mixtures, Divergence Minimization, and Bregman Information

    Markov Chain Monte Carlo methods for sampling from complex distributions and estimating normalization constants often simulate samples from a sequence of intermediate distributions along an annealing path, which bridges between a tractable initial distribution and a target density of interest. Prior work has constructed annealing paths using quasi-arithmetic means, and interpreted the resulting intermediate densities as minimizing an expected divergence to the endpoints. We provide a comprehensive analysis of this 'centroid' property using Bregman divergences under a monotonic embedding of the density function, thereby associating common divergences such as Amari's and Rényi's $\alpha$-divergences, the $(\alpha,\beta)$-divergences, and the Jensen-Shannon divergence with intermediate densities along an annealing path. Our analysis highlights the interplay between parametric families, quasi-arithmetic means, and divergence functions using the rho-tau Bregman divergence framework of Zhang (2004, 2013). Comment: 19 pages + appendix (rewritten + changed title in revision)
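    The simplest instance of the centroid property is the geometric path, i.e. the quasi-arithmetic mean under the logarithmic embedding: the intermediate density $p_\beta \propto p_0^{1-\beta} p_1^{\beta}$ minimises $(1-\beta)\, D_{\mathrm{KL}}(q \,\|\, p_0) + \beta\, D_{\mathrm{KL}}(q \,\|\, p_1)$ over $q$. A minimal numerical sketch, assuming densities discretized on a grid (an illustration, not the paper's method):

        import numpy as np

        def geometric_path(p0, p1, beta, dx):
            """Normalized geometric mixture p_beta ~ p0^(1-beta) * p1^beta --
            the quasi-arithmetic mean of the endpoints under u -> log u."""
            log_p = (1 - beta) * np.log(p0) + beta * np.log(p1)
            p = np.exp(log_p - log_p.max())  # stabilize before normalizing
            return p / (p.sum() * dx)

        # Example: annealing between two Gaussians on a grid.
        x = np.linspace(-12, 12, 2401)
        dx = x[1] - x[0]
        gauss = lambda m, s: np.exp(-(x - m)**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))
        p0, p1 = gauss(-3.0, 1.0), gauss(3.0, 0.5)
        for beta in (0.0, 0.25, 0.5, 0.75, 1.0):
            pb = geometric_path(p0, p1, beta, dx)
            print(f"beta={beta:.2f}, mean={np.sum(x * pb) * dx:+.3f}")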