Information geometric complexity of a trivariate Gaussian statistical model
We evaluate the information geometric complexity of entropic motion on
low-dimensional Gaussian statistical manifolds in order to quantify how
difficult it is to make macroscopic predictions about a system in the
presence of limited information. Specifically, we observe that the
complexity of such entropic inferences depends not only on the amount of
available information but also on the manner in which the individual
pieces of information are correlated. Finally, we uncover that for certain
correlational structures, the impossibility of reaching the most favorable
configuration from an entropic inference viewpoint seems to lead to an
information geometric analog of the well-known frustration effect that
occurs in statistical physics.
Comment: 16 pages, 1 figure
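For a Gaussian model with unknown mean vector and fixed covariance Σ, the
Fisher information metric on the mean coordinates is g = Σ⁻¹, so the volume
element of the statistical manifold scales as √det g = det(Σ)^{-1/2}. The
minimal sketch below shows how this one ingredient of the information
geometric volume responds to different correlational structures of a
trivariate Gaussian; the correlation patterns are illustrative choices, not
the configurations analyzed in the paper:

```python
import numpy as np

def fisher_volume_element(cov):
    """sqrt(det g) for the mean coordinates of N(mu, cov), where the
    Fisher metric with the covariance held fixed is g = inv(cov)."""
    return 1.0 / np.sqrt(np.linalg.det(cov))

def trivariate_cov(r12, r13, r23):
    """Covariance of a trivariate Gaussian with unit variances and
    pairwise correlations r_ij."""
    return np.array([[1.0, r12, r13],
                     [r12, 1.0, r23],
                     [r13, r23, 1.0]])

# Illustrative correlation patterns (hypothetical choices).
patterns = {
    "uncorrelated":        (0.0, 0.0, 0.0),
    "one correlated pair": (0.8, 0.0, 0.0),
    "uniform positive":    (0.5, 0.5, 0.5),
    "uniform negative":    (-0.5, -0.5, -0.5),
}

for name, corr in patterns.items():
    cov = trivariate_cov(*corr)
    # Equal pairwise correlations of -0.5 put the covariance on the
    # boundary of positive definiteness, so check before inverting.
    if np.all(np.linalg.eigvalsh(cov) > 1e-12):
        print(f"{name:20s} sqrt(det g) = {fisher_volume_element(cov):.3f}")
    else:
        print(f"{name:20s} covariance is singular")
```

The uniform negative pattern is instructive: three pairwise correlations of
-0.5 place the covariance exactly on the boundary of positive definiteness,
since the three mutually negative constraints cannot all be satisfied more
strongly, which loosely echoes the frustration analogy drawn above.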
Intrinsic Universal Measurements of Non-linear Embeddings
A basic problem in machine learning is to find a mapping from a
low-dimensional latent space to a high-dimensional observation space.
Equipped with the representation power of non-linearity, a learner can
easily find a mapping which perfectly fits all the observations. However,
such a mapping is often not considered good, because it is not simple
enough and over-fits. How, then, should simplicity be defined? This paper
proposes a formal definition of the amount of information imposed by a
non-linear mapping. The definition is based on information geometry and is
independent of both the observations and any specific parametrization. We
prove its basic properties and discuss relationships with parametric and
non-parametric embeddings.
Comment: work in progress
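The over-fitting concern that motivates this work is easy to demonstrate:
a flexible enough mapping can interpolate any finite set of observations
while generalizing poorly. A minimal sketch (the polynomial setup is an
illustrative stand-in, not the paper's information geometric measure):

```python
import numpy as np

rng = np.random.default_rng(0)

# One-dimensional latent variable z, noisy observations x = f(z) + noise.
z = np.linspace(-1.0, 1.0, 10)
x = np.sin(np.pi * z) + 0.1 * rng.standard_normal(z.shape)

z_test = np.linspace(-1.0, 1.0, 200)
x_test = np.sin(np.pi * z_test)

for degree in (3, 9):
    # A degree-9 polynomial through 10 points interpolates the training
    # data exactly, yet oscillates between the points: a perfect fit
    # that is "not simple enough".
    coeffs = np.polyfit(z, x, degree)
    train_mse = np.mean((np.polyval(coeffs, z) - x) ** 2)
    test_mse = np.mean((np.polyval(coeffs, z_test) - x_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.2e}, "
          f"test MSE {test_mse:.2e}")
```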
Exact heat kernel on a hypersphere and its applications in kernel SVM
Many contemporary statistical learning methods assume a Euclidean feature
space. This paper presents a method for defining similarity based on
hyperspherical geometry and shows that it often improves the performance
of support vector machines compared to other competing similarity
measures. Specifically, the idea of using heat diffusion on a hypersphere
to measure similarity has been proposed previously, with promising results
based on a heuristic heat kernel obtained from the zeroth-order parametrix
expansion; however, how well this heuristic kernel agrees with the exact
hyperspherical heat kernel remains unknown. This paper presents a
higher-order parametrix expansion of the heat kernel on a unit hypersphere
and discusses several problems associated with this expansion method. We
then compare the heuristic kernel with an exact form of the heat kernel
expressed in terms of a uniformly and absolutely convergent series in
high-dimensional angular momentum eigenmodes. Being a natural measure of
similarity between sample points dwelling on a hypersphere, the exact
kernel often shows superior performance in kernel SVM classifications
applied to text mining, tumor somatic mutation imputation, and stock
market analysis.
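On the unit hypersphere S^{d-1}, the heat kernel's eigenmode series can be
written with Gegenbauer polynomials via the addition theorem for spherical
harmonics. The sketch below truncates one standard form of that series,
K_t(x, y) ∝ Σ_l e^{-l(l+d-2)t} ((2l+d-2)/(d-2)) C_l^{(d-2)/2}(x·y), and
feeds the resulting Gram matrix to a precomputed-kernel SVM; the
normalization convention and the toy data are assumptions here, not details
taken from the paper:

```python
import numpy as np
from scipy.special import eval_gegenbauer
from sklearn.svm import SVC

def hypersphere_heat_kernel(X, Y, t, n_terms=30):
    """Truncated Gegenbauer series for the heat kernel on S^{d-1}.
    X, Y: arrays of unit vectors, shapes (n, d) and (m, d), with d >= 3.
    The overall normalization is dropped; it does not affect the SVM."""
    d = X.shape[1]
    alpha = (d - 2) / 2.0
    cos = np.clip(X @ Y.T, -1.0, 1.0)   # cosine of geodesic distance
    K = np.zeros_like(cos)
    for l in range(n_terms):
        # exp(-l(l+d-2)t) damps high angular-momentum eigenmodes, so
        # the truncation error decays rapidly with l.
        weight = np.exp(-l * (l + d - 2) * t) * (2 * l + d - 2) / (d - 2)
        K += weight * eval_gegenbauer(l, alpha, cos)
    return K

# Toy usage: random points on S^2 (d = 3), labeled by hemisphere.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # project onto the sphere
y = (X[:, 2] > 0).astype(int)

G = hypersphere_heat_kernel(X, X, t=0.1)
clf = SVC(kernel="precomputed").fit(G, y)
print("train accuracy:", clf.score(G, y))
```

Each series term is positive semidefinite with a positive weight, so the
truncated Gram matrix remains a valid kernel for the SVM.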
Optimal projection of observations in a Bayesian setting
Optimal dimensionality reduction methods are proposed for the Bayesian
inference of a Gaussian linear model with additive noise in the presence
of overabundant data. Three different optimal projections of the
observations are proposed based on information theory: the projection that
minimizes the Kullback-Leibler divergence between the posterior
distributions of the original and the projected models, the one that
minimizes the expected Kullback-Leibler divergence between the same
distributions, and the one that maximizes the mutual information between
the parameter of interest and the projected observations. The first two
optimization problems are formulated as the determination of an optimal
subspace, and the solution is therefore computed using Riemannian
optimization algorithms on the Grassmann manifold. Regarding the
maximization of the mutual information, it is shown that there exists an
optimal subspace that minimizes the entropy of the posterior distribution
of the reduced model; that a basis of this subspace can be computed as the
solution to a generalized eigenvalue problem; that an a priori error
estimate on the mutual information is available for this particular
solution; and that the dimensionality of the subspace needed to exactly
conserve the mutual information between the input and the output of the
models is less than the number of parameters to be inferred. Numerical
applications to linear and nonlinear models are used to assess the
efficiency of the proposed approaches and to highlight their advantages
compared to standard approaches based on principal component analysis of
the observations.
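For a Gaussian linear model y = Gx + e with prior x ~ N(0, Σ_x) and noise
e ~ N(0, Σ_e), one standard way to obtain such a mutual-information-optimal
subspace is from the generalized eigenvalue problem G Σ_x Gᵀ w = λ Σ_e w,
keeping the leading eigenvectors. The sketch below follows that
construction and checks that a subspace of dimension no larger than the
number of parameters conserves the mutual information; the exact
formulation in the paper may differ in its details:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)

n_obs, n_par, k = 50, 4, 4          # overabundant data: n_obs >> n_par
G = rng.standard_normal((n_obs, n_par))
Sigma_x = np.eye(n_par)             # prior covariance of the parameters
Sigma_e = 0.1 * np.eye(n_obs)       # observation-noise covariance

# Signal covariance of the data, S = G Sigma_x G^T (rank <= n_par).
S = G @ Sigma_x @ G.T

# Generalized symmetric eigenproblem S w = lambda Sigma_e w;
# eigh returns eigenvalues in ascending order, so take the last k.
vals, vecs = eigh(S, Sigma_e)
W = vecs[:, -k:]                    # basis of the k-dimensional subspace

def posterior_cov(A, Sx, Se):
    """Posterior covariance of x for observations z = A x + noise."""
    return np.linalg.inv(np.linalg.inv(Sx) + A.T @ np.linalg.inv(Se) @ A)

def mutual_info(A, Sx, Se):
    """I(x; z) = 0.5 * (log det Sx - log det posterior covariance)."""
    P = posterior_cov(A, Sx, Se)
    return 0.5 * (np.linalg.slogdet(Sx)[1] - np.linalg.slogdet(P)[1])

# Projected model: z = W^T y = (W^T G) x + W^T e.
G_proj = W.T @ G
Se_proj = W.T @ Sigma_e @ W

print("MI, full data     :", mutual_info(G, Sigma_x, Sigma_e))
print("MI, k-dim subspace:", mutual_info(G_proj, Sigma_x, Se_proj))
```

With k equal to the number of parameters, the two printed values agree,
since the signal covariance S has rank at most n_par and the retained
eigenvectors span its whole range.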