Extension of information geometry for modelling non-statistical systems
In this dissertation, an abstract formalism extending information geometry is
introduced. This framework encompasses a broad range of modelling problems,
including possible applications in machine learning and in the information
theoretical foundations of quantum theory. Its foundations are purely
geometrical: no use is made of probability theory, and very few assumptions
are made about the data or the models. Starting only from a divergence
function, a Riemannian geometrical structure consisting of a metric tensor and
an affine connection is constructed, and its properties are investigated. The
relation to information geometry, and in particular to the geometry of
exponential families of probability distributions, is also elucidated. It
turns out that this geometrical
framework offers a straightforward way to determine whether or not a
parametrised family of distributions can be written in exponential form. Apart
from the main theoretical chapter, the dissertation also contains a chapter of
examples illustrating the application of the formalism and its geometric
properties, a brief introduction to differential geometry and a historical
overview of the development of information geometry.
Comment: PhD thesis, University of Antwerp, Advisors: Prof. dr. Jan Naudts and
Prof. dr. Jacques Tempere, December 2014, 108 pages
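The construction the abstract refers to, deriving a Riemannian structure from a divergence alone, follows the standard recipe of classical information geometry (due to Eguchi); the following sketch shows the form it takes there, with $D(\theta \| \theta')$ a smooth divergence on a parametrised family:

```latex
% Metric tensor induced by a divergence D(theta || theta'):
% second mixed derivatives, evaluated on the diagonal theta' = theta.
g_{ij}(\theta)
  = -\left. \frac{\partial}{\partial \theta^i}
            \frac{\partial}{\partial \theta'^j}
            D(\theta \,\|\, \theta') \right|_{\theta' = \theta}

% Affine connection induced by the same divergence:
\Gamma_{ij,k}(\theta)
  = -\left. \frac{\partial}{\partial \theta^i}
            \frac{\partial}{\partial \theta^j}
            \frac{\partial}{\partial \theta'^k}
            D(\theta \,\|\, \theta') \right|_{\theta' = \theta}
```

When $D$ is the Kullback-Leibler divergence of a statistical family, $g_{ij}$ reduces to the Fisher information metric; the dissertation's point is that the same construction goes through without any probabilistic interpretation of $D$.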
Optimal measures and Markov transition kernels
We study optimal solutions to an abstract optimization problem for measures,
which is a generalization of classical variational problems in information
theory and statistical physics. In the classical problems, information and
relative entropy are defined using the Kullback-Leibler divergence, and for
this reason optimal measures belong to a one-parameter exponential family.
Measures within such a family have the property of mutual absolute continuity.
Here we show that this property characterizes other families of optimal
positive measures if a functional representing information has a strictly
convex dual. Mutual absolute continuity of optimal probability measures allows
us to strictly separate deterministic and non-deterministic Markov transition
kernels, which play an important role in theories of decisions, estimation,
control, communication and computation. We show that deterministic transitions
are strictly sub-optimal unless the information resource with a strictly
convex dual is unconstrained. For illustration, we construct an example in
which, unlike non-deterministic kernels, any deterministic kernel either has
infinitely negative expected utility (unbounded expected error) or
communicates infinite information.
Comment: Replaced with a final and accepted draft; Journal of Global
Optimization, Springer, Jan 1, 201
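In the classical KL-based case the abstract mentions, the optimal measure is the well-known Gibbs solution: maximising expected utility minus a KL-divergence penalty against a prior yields a one-parameter exponential family. A minimal sketch on a finite space (the utility and prior values below are hypothetical, chosen only for illustration):

```python
import numpy as np

def optimal_gibbs_measure(utility, prior, beta):
    """Solution of the classical KL-regularized problem:
    maximize E_p[u] - (1/beta) * KL(p || prior) over probability
    measures p. The optimum is the exponential-family (Gibbs) tilt
    of the prior by the utility."""
    weights = prior * np.exp(beta * utility)
    return weights / weights.sum()

# Hypothetical three-point example.
u = np.array([1.0, 0.0, -1.0])      # utility of each outcome
q = np.array([0.25, 0.5, 0.25])     # prior measure
p = optimal_gibbs_measure(u, q, beta=1.0)
```

Note the mutual absolute continuity highlighted in the abstract: `p` is strictly positive wherever the prior `q` is, so the optimal measure never places zero mass on a prior-supported outcome; a deterministic kernel (a point mass) can never arise this way.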
On a generalization of the Jensen-Shannon divergence and the JS-symmetrization of distances relying on abstract means
The Jensen-Shannon divergence is a renowned bounded symmetrization of the
unbounded Kullback-Leibler divergence which measures the total
Kullback-Leibler divergence to the average mixture distribution. However, the
Jensen-Shannon
divergence between Gaussian distributions is not available in closed-form. To
bypass this problem, we present a generalization of the Jensen-Shannon (JS)
divergence using abstract means which yields closed-form expressions when the
mean is chosen according to the parametric family of distributions. More
generally, we define the JS-symmetrizations of any distance using generalized
statistical mixtures derived from abstract means. In particular, we first show
that the geometric mean is well-suited for exponential families, and report
two closed-form formulas for (i) the geometric Jensen-Shannon divergence
between probability densities of the same exponential family, and (ii) the
geometric JS-symmetrization of the reverse Kullback-Leibler divergence. As a
second illustrative example, we show that the harmonic mean is well-suited for
scale Cauchy distributions, and report a closed-form formula for the harmonic
Jensen-Shannon divergence between scale Cauchy distributions. We also define
generalized Jensen-Shannon divergences between matrices (e.g., quantum
Jensen-Shannon divergences) and consider clustering with respect to these
novel Jensen-Shannon divergences.
Comment: 30 pages
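For reference, the classical (arithmetic-mean) Jensen-Shannon divergence that this paper generalizes is straightforward on discrete distributions: it is the average Kullback-Leibler divergence of each distribution to their mixture. A minimal sketch, assuming strictly positive probability vectors:

```python
import numpy as np

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats, for strictly
    positive discrete distributions."""
    return float(np.sum(p * np.log(p / q)))

def js(p, q):
    """Classical Jensen-Shannon divergence: the total KL divergence
    to the arithmetic-mean mixture m = (p + q) / 2."""
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = np.array([0.4, 0.6])
q = np.array([0.6, 0.4])
d = js(p, q)  # symmetric, and bounded above by log(2)
```

The paper's generalization replaces the arithmetic mean `m` by an abstract mean (geometric, harmonic, ...), chosen so that the mixture stays inside the parametric family and the divergence becomes available in closed form.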
Horizon-Independent Optimal Prediction with Log-Loss in Exponential Families
We study online learning under logarithmic loss with regular parametric
models. Hedayati and Bartlett (2012b) showed that a Bayesian prediction
strategy with Jeffreys prior and sequential normalized maximum likelihood
(SNML) coincide and are optimal if and only if the latter is exchangeable, and
if and only if the optimal strategy can be calculated without knowing the time
horizon in advance. They posed the question of which families have
exchangeable SNML strategies. This paper fully answers this open problem for
one-dimensional exponential families: exchangeability can occur only for three
classes of natural exponential family distributions, namely the Gaussian, the
Gamma, and the Tweedie exponential family of order 3/2.
Keywords: SNML Exchangeability, Exponential Family, Online Learning,
Logarithmic Loss, Bayesian Strategy, Jeffreys Prior, Fisher Information
Comment: 23 pages
Feature Extraction for Universal Hypothesis Testing via Rank-constrained Optimization
This paper concerns the construction of tests for universal hypothesis
testing problems, in which the alternative hypothesis is poorly modeled and
the observation space is large. The mismatched universal test is a
feature-based
technique for this purpose. In prior work it is shown that its
finite-observation performance can be much better than the (optimal) Hoeffding
test, and good performance depends crucially on the choice of features. The
contributions of this paper include: 1) We obtain bounds on the number of
\epsilon-distinguishable distributions in an exponential family. 2) This
motivates a new framework for feature extraction, cast as a rank-constrained
optimization problem. 3) We obtain a gradient-based algorithm to solve the
rank-constrained optimization problem and prove its local convergence.
Comment: 5 pages, 4 figures, submitted to ISIT 201
Support Sets in Exponential Families and Oriented Matroid Theory
The closure of a discrete exponential family is described by a finite set of
equations corresponding to the circuits of an underlying oriented matroid.
These equations are similar to the equations used in algebraic statistics,
although they need not be polynomial in the general case. This description
allows for a combinatorial study of the possible support sets in the closure of
an exponential family. If two exponential families induce the same oriented
matroid, then their closures have the same support sets. Furthermore, the
positive cocircuits give a parameterization of the closure of the exponential
family.
Comment: 27 pages, extended version published in IJA