
    Information geometric complexity of a trivariate Gaussian statistical model

    We evaluate the information geometric complexity of entropic motion on low-dimensional Gaussian statistical manifolds in order to quantify how difficult it is to make macroscopic predictions about a system in the presence of limited information. Specifically, we observe that the complexity of such entropic inferences depends not only on the amount of available information but also on the manner in which the pieces of information are correlated. Finally, we uncover that, for certain correlational structures, the impossibility of reaching the most favorable configuration from an entropic inference viewpoint seems to lead to an information geometric analog of the well-known frustration effect that occurs in statistical physics. Comment: 16 pages, 1 figure
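    A minimal worked form of the quantities involved (the uniform correlational structure below is an illustrative assumption, not necessarily the one studied in the paper): for a trivariate Gaussian family with mean parameters \theta = (\theta_1, \theta_2, \theta_3), unit variances, and a common correlation coefficient r, the Fisher-Rao metric and the associated information geometric complexity read

        g_{ij}(\theta) = \mathbb{E}\!\left[\partial_i \log p(x\mid\theta)\,\partial_j \log p(x\mid\theta)\right] = \bigl(C^{-1}\bigr)_{ij},
        \qquad
        C = \begin{pmatrix} 1 & r & r \\ r & 1 & r \\ r & r & 1 \end{pmatrix},

        \mathcal{C} \;\propto\; \int_{\mathcal{D}} \sqrt{\det g(\theta)}\; d^3\theta ,

    so the correlation r rescales the statistical volume explored by the entropic motion.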

    Intrinsic Universal Measurements of Non-linear Embeddings

    A basic problem in machine learning is to find a mapping f from a low-dimensional latent space to a high-dimensional observation space. Equipped with the representational power of non-linearity, a learner can easily find a mapping that perfectly fits all the observations. However, such a mapping is often not considered good, because it is not simple enough and over-fits. How should simplicity be defined? This paper attempts a formal definition of the amount of information imposed by a non-linear mapping. The definition is based on information geometry and is independent of the observations and of any specific parametrization. We prove its basic properties and discuss relationships with parametric and non-parametric embeddings. Comment: work in progress
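    As a rough illustration of a parametrization-independent notion of embedding complexity (an assumed proxy in the same spirit, not the measure defined in the paper), the Riemannian volume induced by the mapping's Jacobian can be estimated numerically; all function names below are hypothetical.

    # Illustrative proxy only, not the paper's measure: one reparametrization-
    # invariant quantity attached to a smooth embedding f: R^k -> R^D is the
    # Riemannian volume of its image, obtained by averaging sqrt(det(J^T J))
    # of the finite-difference Jacobian J over the latent unit cube.
    import numpy as np

    def embedding_volume(f, k, n_samples=5000, eps=1e-4, seed=0):
        """Monte Carlo estimate of the Riemannian volume of f([0,1]^k)."""
        rng = np.random.default_rng(seed)
        total = 0.0
        for zi in rng.uniform(size=(n_samples, k)):
            # finite-difference Jacobian, shape (D, k)
            J = np.stack([(f(zi + eps * e) - f(zi - eps * e)) / (2 * eps)
                          for e in np.eye(k)], axis=1)
            total += np.sqrt(np.linalg.det(J.T @ J))
        return total / n_samples

    # A wiggly embedding of a 2-d latent space into R^3 sweeps out more volume
    # (is "less simple" under this proxy) than a flat one.
    flat = lambda z: np.array([z[0], z[1], 0.0])
    wiggly = lambda z: np.array([z[0], z[1], 0.3 * np.sin(20 * z[0])])
    print(embedding_volume(flat, 2), embedding_volume(wiggly, 2))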

    Exact heat kernel on a hypersphere and its applications in kernel SVM

    Many contemporary statistical learning methods assume a Euclidean feature space. This paper presents a method for defining similarity based on hyperspherical geometry and shows that it often improves the performance of the support vector machine compared with competing similarity measures. Specifically, the idea of using heat diffusion on a hypersphere to measure similarity has been proposed previously, with promising results based on a heuristic heat kernel obtained from the zeroth-order parametrix expansion; however, how well this heuristic kernel agrees with the exact hyperspherical heat kernel remains unknown. This paper presents a higher-order parametrix expansion of the heat kernel on a unit hypersphere and discusses several problems associated with this expansion method. We then compare the heuristic kernel with an exact form of the heat kernel expressed as a uniformly and absolutely convergent series in high-dimensional angular momentum eigenmodes. Being a natural measure of similarity between sample points dwelling on a hypersphere, the exact kernel often shows superior performance in kernel SVM classifications applied to text mining, tumor somatic mutation imputation, and stock market analysis.
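    A minimal sketch (not the authors' code) of the exact-kernel idea: truncate a standard Gegenbauer-series form of the heat kernel on the unit hypersphere S^{d-1} and feed the resulting Gram matrix to a precomputed-kernel SVM. The truncation level L, the diffusion time t, and the normalization used below are assumptions of the sketch.

    # Truncated heat-kernel series on S^{d-1} (Gegenbauer / angular-momentum
    # eigenmodes), used as a precomputed kernel in scikit-learn's SVC.
    import numpy as np
    from scipy.special import eval_gegenbauer, gamma
    from sklearn.svm import SVC

    def hypersphere_heat_kernel(X, Y, t=0.5, L=30):
        """Gram matrix between rows of X and Y, assumed to be unit vectors."""
        d = X.shape[1]                                 # ambient dimension, sphere is S^{d-1}
        alpha = (d - 2) / 2.0                          # Gegenbauer index (assumes d >= 3)
        area = 2 * np.pi ** (d / 2) / gamma(d / 2)     # surface area of S^{d-1}
        cos_theta = np.clip(X @ Y.T, -1.0, 1.0)        # cosine of the geodesic angle
        K = np.zeros_like(cos_theta)
        for l in range(L + 1):
            weight = np.exp(-l * (l + d - 2) * t) * (2 * l + d - 2) / (d - 2)
            K += weight * eval_gegenbauer(l, alpha, cos_theta)
        return K / area

    # Usage: L2-normalize the features so every sample lies on the hypersphere.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    y = (X[:, 0] > 0).astype(int)
    clf = SVC(kernel="precomputed").fit(hypersphere_heat_kernel(X, X), y)
    print(clf.score(hypersphere_heat_kernel(X, X), y))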

    Optimal projection of observations in a Bayesian setting

    Optimal dimensionality reduction methods are proposed for the Bayesian inference of a Gaussian linear model with additive noise in the presence of overabundant data. Three different optimal projections of the observations are proposed based on information theory: the projection that minimizes the Kullback-Leibler divergence between the posterior distributions of the original and the projected models, the one that minimizes the expected Kullback-Leibler divergence between the same distributions, and the one that maximizes the mutual information between the parameter of interest and the projected observations. The first two optimization problems are formulated as the determination of an optimal subspace, and the solution is therefore computed using Riemannian optimization algorithms on the Grassmann manifold. Regarding the maximization of the mutual information, it is shown that there exists an optimal subspace that minimizes the entropy of the posterior distribution of the reduced model; that a basis of this subspace can be computed as the solution to a generalized eigenvalue problem; that an a priori error estimate on the mutual information is available for this particular solution; and that the dimensionality of the subspace needed to exactly conserve the mutual information between the input and the output of the models is less than the number of parameters to be inferred. Numerical applications to linear and nonlinear models are used to assess the efficiency of the proposed approaches and to highlight their advantages compared to standard approaches based on the principal component analysis of the observations.
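    For the mutual-information route, a minimal sketch (not the paper's implementation) of a projection obtained from a generalized eigenvalue problem is given below. The particular signal/noise pencil (G Sx G^T, Se) and the names G, Sx, Se, k are assumptions of the sketch, not necessarily the formulation derived in the paper.

    # Gaussian linear model y = G x + eps with x ~ N(0, Sx) and eps ~ N(0, Se):
    # keep the leading generalized eigenvectors of (G Sx G^T, Se) as a
    # k-dimensional projection of the overabundant observations y.
    import numpy as np
    from scipy.linalg import eigh

    def projection_basis(G, Sx, Se, k):
        """Rows of the returned matrix define the map y -> P @ y."""
        signal = G @ Sx @ G.T                  # covariance of the noise-free observations
        vals, vecs = eigh(signal, Se)          # generalized eigenproblem A v = w B v
        order = np.argsort(vals)[::-1][:k]     # directions with the highest signal-to-noise
        return vecs[:, order].T                # shape (k, n_obs)

    # Usage on a toy overabundant model: 50 noisy observations of a 3-dim parameter.
    rng = np.random.default_rng(1)
    G = rng.normal(size=(50, 3))
    Sx = np.eye(3)
    Se = 0.1 * np.eye(50)
    P = projection_basis(G, Sx, Se, k=3)
    y = G @ rng.normal(size=3) + rng.multivariate_normal(np.zeros(50), Se)
    y_reduced = P @ y                          # reduced data fed to the Bayesian update
    print(y_reduced.shape)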