
    On some entropy functionals derived from Rényi information divergence

    We consider the maximum entropy problems associated with the Rényi Q-entropy, subject to two kinds of constraints on expected values. The constraints considered are a constraint on the standard expectation, and a constraint on the generalized expectation as encountered in nonextensive statistics. The optimum maximum entropy probability distributions, which can exhibit a power-law behaviour, are derived and characterized. The Rényi entropy of the optimum distributions can be viewed as a function of the constraint. This defines two families of entropy functionals in the space of possible expected values. General properties of these functionals, including nonnegativity, minimum, and convexity, are documented. Their relationships as well as numerical aspects are also discussed. Finally, we work out some specific cases for the reference measure Q(x) and recover in a limit case some well-known entropies.
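
    To make the setup concrete, here is the textbook form of this kind of problem (a hedged sketch under standard Tsallis/Rényi maximum-entropy assumptions, not necessarily the exact formulation or notation of the paper): maximizing the Rényi entropy of order q under a normalization and a mean constraint,

\[
\max_{p}\ \frac{1}{1-q}\log\!\int p(x)^{q}\,dx
\quad\text{s.t.}\quad \int p(x)\,dx = 1,\qquad \int x\,p(x)\,dx = m,
\]

    yields stationary distributions of q-exponential (power-law) form,

\[
p^{*}(x)\ \propto\ \bigl[\,1-(1-q)\,\beta\,(x-m)\,\bigr]_{+}^{\,1/(1-q)},
\]

    with the Lagrange parameter β fixed by the constraint; this is the power-law behaviour referred to above.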

    Direct Estimation of Information Divergence Using Nearest Neighbor Ratios

    We propose a direct estimation method for Rényi and f-divergence measures based on a new graph-theoretical interpretation. Suppose that we are given two sample sets X and Y, respectively with N and M samples, where η := M/N is a constant. Considering the k-nearest neighbor (k-NN) graph of Y in the joint data set (X, Y), we show that the average powered ratio of the number of X points to the number of Y points among all k-NN points is proportional to the Rényi divergence of the X and Y densities. A similar method can also be used to estimate f-divergence measures. We derive bias and variance rates, and show that for the class of γ-Hölder smooth functions, the estimator achieves the MSE rate of O(N^{-2γ/(γ+d)}). Furthermore, by using a weighted ensemble estimation technique, for density functions with continuous and bounded derivatives of up to order d, and some extra conditions at the support set boundary, we derive an ensemble estimator that achieves the parametric MSE rate of O(1/N). Our estimators are more computationally tractable than other competing estimators, which makes them appealing in many practical applications. Comment: 2017 IEEE International Symposium on Information Theory (ISIT).
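
    A rough sketch of the neighbor-counting statistic described above is shown below; the k-NN graph construction follows the abstract, but the powering, normalization, and bias correction are simplified placeholders rather than the paper's exact estimator.

```python
# Rough sketch of the k-NN ratio statistic described above.
# The exact normalization, powering, and bias correction used in the
# paper are NOT reproduced here; this only illustrates the graph
# construction step (k-NN of the Y points inside the pooled sample).
import numpy as np
from scipy.spatial import cKDTree

def knn_ratio_statistic(X, Y, k=5, alpha=0.5):
    """Average powered ratio of X-neighbors to Y-neighbors around each Y point."""
    N, M = len(X), len(Y)
    eta = M / N
    Z = np.vstack([X, Y])                     # pooled sample (X first, then Y)
    labels = np.r_[np.zeros(N), np.ones(M)]   # 0 = X point, 1 = Y point
    tree = cKDTree(Z)
    # query k+1 neighbors for each Y point and drop the point itself
    _, idx = tree.query(Y, k=k + 1)
    idx = idx[:, 1:]
    ratios = []
    for neighbors in idx:
        n_x = np.sum(labels[neighbors] == 0)  # neighbors coming from X
        n_y = k - n_x                         # neighbors coming from Y
        ratios.append((eta * n_x / max(n_y, 1)) ** alpha)  # guard against n_y = 0
    return float(np.mean(ratios))
```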

    Finite-Sample Analysis of Fixed-k Nearest Neighbor Density Functional Estimators

    We provide finite-sample analysis of a general framework for using k-nearest neighbor statistics to estimate functionals of a nonparametric continuous probability density, including entropies and divergences. Rather than plugging a consistent density estimate (which requires k → ∞ as the sample size n → ∞) into the functional of interest, the estimators we consider fix k and perform a bias correction. This is more efficient computationally, and, as we show in certain cases, statistically, leading to faster convergence rates. Our framework unifies several previous estimators, for most of which ours are the first finite-sample guarantees. Comment: 16 pages, 0 figures.
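
    A classical instance of this fixed-k idea is the Kozachenko-Leonenko differential entropy estimator, sketched below: k stays fixed and the digamma terms supply the bias correction. This is offered only as an illustration of the kind of estimator the framework analyzes, not as the paper's general construction.

```python
# Kozachenko-Leonenko differential entropy estimator: a fixed-k nearest
# neighbor statistic with a digamma bias correction, shown as one
# well-known member of the family of estimators analyzed above.
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def kl_entropy(X, k=3):
    n, d = X.shape
    tree = cKDTree(X)
    # distance to the k-th nearest neighbor (excluding the point itself)
    dist, _ = tree.query(X, k=k + 1)
    eps = dist[:, -1]
    # log volume of the unit d-ball: pi^{d/2} / Gamma(d/2 + 1)
    log_c_d = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)
    return digamma(n) - digamma(k) + log_c_d + d * np.mean(np.log(eps))
```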

    Information Theoretic Structure Learning with Confidence

    Information-theoretic measures (e.g. the Kullback-Leibler divergence and Shannon mutual information) have been used for exploring possibly nonlinear multivariate dependencies in high dimensions. If these dependencies are assumed to follow a Markov factor graph model, this exploration process is called structure discovery. For discrete-valued samples, estimates of the information divergence over the parametric class of multinomial models lead to structure discovery methods whose mean squared error achieves parametric convergence rates as the sample size grows. However, a naive application of this method to continuous nonparametric multivariate models converges much more slowly. In this paper we introduce a new method for nonparametric structure discovery that uses weighted ensemble divergence estimators that achieve parametric convergence rates and obey an asymptotic central limit theorem that facilitates hypothesis testing and other types of statistical validation. Comment: 10 pages, 3 figures.
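
    As a hedged illustration of how such a central limit theorem enables statistical validation, the sketch below runs a one-sided z-test on a dependence estimate before keeping an edge; the estimate and its standard error are placeholders, not the paper's estimator.

```python
# Sketch only: treat an asymptotically normal mutual-information (or
# divergence) estimate as Gaussian around its mean and test
# H0: "no dependence" against a positive alternative.  How mi_hat and
# se_hat are obtained is outside this sketch.
from scipy.stats import norm

def keep_edge(mi_hat, se_hat, level=0.05):
    """One-sided z-test of H0: MI = 0 against MI > 0."""
    z = mi_hat / se_hat
    p_value = 1.0 - norm.cdf(z)
    return p_value < level
```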

    Ensemble estimation of multivariate f-divergence

    f-divergence estimation is an important problem in the fields of information theory, machine learning, and statistics. While several divergence estimators exist, relatively few of their convergence rates are known. We derive the MSE convergence rate for a density plug-in estimator of f-divergence. Then, by applying the theory of optimally weighted ensemble estimation, we derive a divergence estimator with a convergence rate of O(1/T) that is simple to implement and performs well in high dimensions. We validate our theoretical results with experiments. Comment: 14 pages, 6 figures; a condensed version of this paper was accepted to ISIT 2014. Version 2: moved the proofs of the theorems from the main body to appendices at the end.
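
    The weighting step behind "optimally weighted ensemble estimation" can be pictured as a small linear-algebra problem, sketched below under illustrative assumptions: the bias basis functions h^{j/d} are stand-ins, and the paper's exact constraint set is not reproduced.

```python
# Illustrative sketch: combine plug-in estimates computed at several
# bandwidths with weights that (i) sum to one, (ii) cancel the leading
# bias terms, and (iii) have small norm so the variance is not inflated.
# The bias basis h**(j/d) is an assumption made for illustration.
import numpy as np

def ensemble_weights(bandwidths, d, n_bias_terms):
    L = len(bandwidths)
    A = np.vstack([np.ones(L)] +
                  [np.asarray(bandwidths) ** (j / d) for j in range(1, n_bias_terms + 1)])
    b = np.zeros(n_bias_terms + 1)
    b[0] = 1.0                      # weights sum to one; bias rows forced to zero
    return np.linalg.pinv(A) @ b    # minimum-norm solution keeps the variance small

def ensemble_estimate(point_estimates, weights):
    return float(np.dot(weights, point_estimates))
```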

    A simple probabilistic construction yielding generalized entropies and divergences, escort distributions and q-Gaussians

    We give a simple probabilistic description of a transition between two states which leads to a generalized escort distribution. When the parameter of the distribution varies, it defines a parametric curve that we call an escort-path. The Rényi divergence appears as a natural by-product of the setting. We study the dynamics of the Fisher information on this path, and show in particular that the thermodynamic divergence is proportional to Jeffreys' divergence. Next, we consider the problem of inferring a distribution on the escort-path, subject to generalized moment constraints. We show that our setting naturally induces a rationale for the minimization of the Rényi information divergence. Then, we derive the optimum distribution as a generalized q-Gaussian distribution.
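
    For orientation, the standard definitions behind these objects (a sketch of the usual conventions; the paper's generalized escort-path may differ in detail): the escort distribution of order q attached to a density p, and the q-Gaussian family, are

\[
P_q(x) \;=\; \frac{p(x)^{q}}{\displaystyle\int p(u)^{q}\,du},
\qquad
G_q(x) \;\propto\; \bigl[\,1-(1-q)\,\beta x^{2}\,\bigr]_{+}^{\,1/(1-q)},
\]

    which reduce to p itself and to the Gaussian, respectively, as q → 1.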

    Ensemble Estimation of Information Divergence

    Recent work has focused on the problem of nonparametric estimation of information divergence functionals between two continuous random variables. Many existing approaches require either restrictive assumptions about the density support set or difficult calculations at the support set boundary, which must be known a priori. The mean squared error (MSE) convergence rate of a leave-one-out kernel density plug-in divergence functional estimator is derived for general bounded density support sets, where knowledge of the support boundary, and therefore boundary correction, is not required. The theory of optimally weighted ensemble estimation is generalized to derive a divergence estimator that achieves the parametric rate when the densities are sufficiently smooth. Guidelines for tuning parameter selection and the asymptotic distribution of this estimator are provided. Based on the theory, an empirical estimator of Rényi-α divergence is proposed that greatly outperforms the standard kernel density plug-in estimator in terms of mean squared error, especially in high dimensions. The estimator is shown to be robust to the choice of tuning parameters. We show extensive simulation results that verify the theoretical results of our paper. Finally, we apply the proposed estimator to estimate bounds on the Bayes error rate of a cell classification problem.
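
    The baseline being improved upon can be sketched as follows: a naive Gaussian-kernel plug-in for D_α(f||g) = (1/(α−1)) log ∫ f^α g^{1−α}, with a fixed bandwidth and a leave-one-out evaluation at the g-samples. The boundary handling and ensemble weighting described above are deliberately omitted; bandwidth and α are illustrative choices.

```python
# Naive kernel density "plug-in" estimate of the Renyi-alpha divergence,
# using the identity  integral f^a g^{1-a} = E_{Y~g}[(f(Y)/g(Y))^a]
# and a leave-one-out estimate of g at its own samples.
import numpy as np

def gaussian_kde_at(points, data, h):
    """Evaluate a Gaussian KDE built on `data` at `points`."""
    d = data.shape[1]
    sq = np.sum((points[:, None, :] - data[None, :, :]) ** 2, axis=-1)
    kernel = np.exp(-sq / (2 * h ** 2)) / ((2 * np.pi * h ** 2) ** (d / 2))
    return kernel.mean(axis=1)

def renyi_divergence_plugin(X, Y, alpha=0.8, h=0.3):
    M, d = Y.shape
    f_at_Y = gaussian_kde_at(Y, X, h)          # density of X, evaluated at Y samples
    # leave-one-out density of Y at each Y_j: average kernel over the other M-1 points
    sq = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    kernel = np.exp(-sq / (2 * h ** 2)) / ((2 * np.pi * h ** 2) ** (d / 2))
    g_at_Y = (kernel.sum(axis=1) - kernel.diagonal()) / (M - 1)
    return np.log(np.mean((f_at_Y / g_at_Y) ** alpha)) / (alpha - 1)
```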

    A family of generalized quantum entropies: definition and properties

    We present a quantum version of the generalized (h, φ)-entropies, introduced by Salicrú et al. for the study of classical probability distributions. We establish their basic properties and show that already known quantum entropies such as the von Neumann entropy, and quantum versions of the Rényi, Tsallis, and unified entropies, constitute particular classes of the present general quantum Salicrú form. We show that majorization plays a key role in explaining most of their common features. We give a characterization of the quantum (h, φ)-entropies under the action of quantum operations and study their properties for composite systems. We apply these generalized entropies to the problem of detection of quantum entanglement and introduce a discussion on possible generalized conditional entropies as well.
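
    A minimal numerical sketch of the Salicrú form H_{(h,φ)}(ρ) = h(Tr φ(ρ)), evaluated through the eigenvalues of a density matrix, with the usual (h, φ) choices that recover the von Neumann, Rényi, and Tsallis entropies (shown for illustration; the notation is assumed, not taken from the paper):

```python
# Evaluate H_{(h,phi)}(rho) = h( Tr phi(rho) ) via the eigenvalues of rho.
# The (h, phi) pairs below are the standard classical choices recovering
# the von Neumann, Renyi and Tsallis entropies as special cases.
import numpy as np

def h_phi_entropy(rho, h, phi):
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]          # drop numerically zero eigenvalues
    return h(np.sum(phi(evals)))

rho = np.diag([0.5, 0.3, 0.2])            # toy density matrix

von_neumann = h_phi_entropy(rho, h=lambda x: x, phi=lambda p: -p * np.log(p))
alpha = 2.0
renyi = h_phi_entropy(rho, h=lambda x: np.log(x) / (1 - alpha), phi=lambda p: p ** alpha)
q = 2.0
tsallis = h_phi_entropy(rho, h=lambda x: (1 - x) / (q - 1), phi=lambda p: p ** q)
```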

    Postquantum Brègman relative entropies and nonlinear resource theories

    We introduce the family of postquantum Brègman relative entropies, based on nonlinear embeddings into reflexive Banach spaces (with examples given by reflexive noncommutative Orlicz spaces over semi-finite W*-algebras, nonassociative L_p spaces over semi-finite JBW-algebras, and noncommutative L_p spaces over arbitrary W*-algebras). This allows us to define a class of geometric categories for nonlinear postquantum inference theory (providing an extension of Chencov's approach to foundations of statistical inference), with constrained maximisations of Brègman relative entropies as morphisms and nonlinear images of closed convex sets as objects. Further generalisation to a framework for nonlinear convex operational theories is developed using a larger class of morphisms, determined by Brègman nonexpansive operations (which provide a well-behaved family of Mielnik's nonlinear transmitters). As an application, we derive a range of nonlinear postquantum resource theories determined in terms of this class of operations. Comment: v2: several corrections and improvements, including an extension to the postquantum (generally) and JBW-algebraic (specifically) cases, a section on nonlinear resource theories, and a more informative paper title.
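
    For reference, the classical (finite-dimensional, differentiable) special case that these constructions generalize is the Brègman divergence of a convex functional f,

\[
D_{f}(x,y) \;=\; f(x)-f(y)-\langle \nabla f(y),\,x-y\rangle ,
\]

    which, for f(ρ) = tr(ρ log ρ) on density operators ρ, σ, reduces to the Umegaki relative entropy tr ρ(log ρ − log σ); the paper replaces this gradient pairing by nonlinear embeddings into reflexive Banach spaces, which this sketch does not capture.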