2,506 research outputs found

    Revisiting Chernoff Information with Likelihood Ratio Exponential Families

    Full text link
    The Chernoff information between two probability measures is a statistical divergence measuring their deviation defined as their maximally skewed Bhattacharyya distance. Although the Chernoff information was originally introduced for bounding the Bayes error in statistical hypothesis testing, the divergence found many other applications due to its empirical robustness property found in applications ranging from information fusion to quantum information. From the viewpoint of information theory, the Chernoff information can also be interpreted as a minmax symmetrization of the Kullback--Leibler divergence. In this paper, we first revisit the Chernoff information between two densities of a measurable Lebesgue space by considering the exponential families induced by their geometric mixtures: The so-called likelihood ratio exponential families. Second, we show how to (i) solve exactly the Chernoff information between any two univariate Gaussian distributions or get a closed-form formula using symbolic computing, (ii) report a closed-form formula of the Chernoff information of centered Gaussians with scaled covariance matrices and (iii) use a fast numerical scheme to approximate the Chernoff information between any two multivariate Gaussian distributions.Comment: 41 page

    Generalized Bhattacharyya and Chernoff upper bounds on Bayes error using quasi-arithmetic means

    Full text link
    Bayesian classification labels observations based on given prior information, namely class-a priori and class-conditional probabilities. Bayes' risk is the minimum expected classification cost that is achieved by the Bayes' test, the optimal decision rule. When no cost incurs for correct classification and unit cost is charged for misclassification, Bayes' test reduces to the maximum a posteriori decision rule, and Bayes risk simplifies to Bayes' error, the probability of error. Since calculating this probability of error is often intractable, several techniques have been devised to bound it with closed-form formula, introducing thereby measures of similarity and divergence between distributions like the Bhattacharyya coefficient and its associated Bhattacharyya distance. The Bhattacharyya upper bound can further be tightened using the Chernoff information that relies on the notion of best error exponent. In this paper, we first express Bayes' risk using the total variation distance on scaled distributions. We then elucidate and extend the Bhattacharyya and the Chernoff upper bound mechanisms using generalized weighted means. We provide as a byproduct novel notions of statistical divergences and affinity coefficients. We illustrate our technique by deriving new upper bounds for the univariate Cauchy and the multivariate tt-distributions, and show experimentally that those bounds are not too distant to the computationally intractable Bayes' error.Comment: 22 pages, include R code. To appear in Pattern Recognition Letter

    The Burbea-Rao and Bhattacharyya centroids

    Full text link
    We study the centroid with respect to the class of information-theoretic Burbea-Rao divergences that generalize the celebrated Jensen-Shannon divergence by measuring the non-negative Jensen difference induced by a strictly convex and differentiable function. Although those Burbea-Rao divergences are symmetric by construction, they are not metric since they fail to satisfy the triangle inequality. We first explain how a particular symmetrization of Bregman divergences called Jensen-Bregman distances yields exactly those Burbea-Rao divergences. We then proceed by defining skew Burbea-Rao divergences, and show that skew Burbea-Rao divergences amount in limit cases to compute Bregman divergences. We then prove that Burbea-Rao centroids are unique, and can be arbitrarily finely approximated by a generic iterative concave-convex optimization algorithm with guaranteed convergence property. In the second part of the paper, we consider the Bhattacharyya distance that is commonly used to measure overlapping degree of probability distributions. We show that Bhattacharyya distances on members of the same statistical exponential family amount to calculate a Burbea-Rao divergence in disguise. Thus we get an efficient algorithm for computing the Bhattacharyya centroid of a set of parametric distributions belonging to the same exponential families, improving over former specialized methods found in the literature that were limited to univariate or "diagonal" multivariate Gaussians. To illustrate the performance of our Bhattacharyya/Burbea-Rao centroid algorithm, we present experimental performance results for kk-means and hierarchical clustering methods of Gaussian mixture models.Comment: 13 page

    On Hodges and Lehmann's "6/Ï€6/\pi result"

    Full text link
    While the asymptotic relative efficiency (ARE) of Wilcoxon rank-based tests for location and regression with respect to their parametric Student competitors can be arbitrarily large, Hodges and Lehmann (1961) have shown that the ARE of the same Wilcoxon tests with respect to their van der Waerden or normal-score counterparts is bounded from above by 6/π≈1.9106/\pi\approx 1.910. In this paper, we revisit that result, and investigate similar bounds for statistics based on Student scores. We also consider the serial version of this ARE. More precisely, we study the ARE, under various densities, of the Spearman-Wald-Wolfowitz and Kendall rank-based autocorrelations with respect to the van der Waerden or normal-score ones used to test (ARMA) serial dependence alternatives

    Thermodynamic assessment of probability distribution divergencies and Bayesian model comparison

    Full text link
    Within path sampling framework, we show that probability distribution divergences, such as the Chernoff information, can be estimated via thermodynamic integration. The Boltzmann-Gibbs distribution pertaining to different Hamiltonians is implemented to derive tempered transitions along the path, linking the distributions of interest at the endpoints. Under this perspective, a geometric approach is feasible, which prompts intuition and facilitates tuning the error sources. Additionally, there are direct applications in Bayesian model evaluation. Existing marginal likelihood and Bayes factor estimators are reviewed here along with their stepping-stone sampling analogues. New estimators are presented and the use of compound paths is introduced
    • …
    corecore