245 research outputs found

    The Burbea-Rao and Bhattacharyya centroids

    We study the centroid with respect to the class of information-theoretic Burbea-Rao divergences that generalize the celebrated Jensen-Shannon divergence by measuring the non-negative Jensen difference induced by a strictly convex and differentiable function. Although those Burbea-Rao divergences are symmetric by construction, they are not metrics since they fail to satisfy the triangle inequality. We first explain how a particular symmetrization of Bregman divergences called Jensen-Bregman distances yields exactly those Burbea-Rao divergences. We then proceed by defining skew Burbea-Rao divergences, and show that skew Burbea-Rao divergences amount in limit cases to computing Bregman divergences. We then prove that Burbea-Rao centroids are unique, and can be arbitrarily finely approximated by a generic iterative concave-convex optimization algorithm with guaranteed convergence property. In the second part of the paper, we consider the Bhattacharyya distance that is commonly used to measure the degree of overlap of probability distributions. We show that Bhattacharyya distances on members of the same statistical exponential family amount to calculating a Burbea-Rao divergence in disguise. Thus we get an efficient algorithm for computing the Bhattacharyya centroid of a set of parametric distributions belonging to the same exponential family, improving over former specialized methods found in the literature that were limited to univariate or "diagonal" multivariate Gaussians. To illustrate the performance of our Bhattacharyya/Burbea-Rao centroid algorithm, we present experimental performance results for k-means and hierarchical clustering methods of Gaussian mixture models. Comment: 13 pages
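
    The two identities stated in this abstract can be checked numerically. The sketch below is not the paper's code: the function names are hypothetical, the generator for the Jensen-Shannon case is assumed to be the negative Shannon entropy, and the Gaussian log-normalizer is written in the natural parameters (mu/sigma^2, -1/(2 sigma^2)). It computes the Burbea-Rao divergence as the Jensen difference of a convex function F and compares it with the Jensen-Shannon divergence and with the standard closed-form Bhattacharyya distance between univariate Gaussians.

```python
import math

def burbea_rao(F, p, q):
    """Jensen difference of a convex F: (F(p) + F(q)) / 2 - F((p + q) / 2)."""
    mid = tuple((a + b) / 2 for a, b in zip(p, q))
    return 0.5 * (F(p) + F(q)) - F(mid)

def neg_entropy(x):
    """Negative Shannon entropy, the generator assumed for the Jensen-Shannon case."""
    return sum(v * math.log(v) for v in x if v > 0)

def kl(p, q):
    """Kullback-Leibler divergence between discrete distributions."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def jensen_shannon(p, q):
    """Jensen-Shannon divergence via its usual KL-to-the-midpoint definition."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def gaussian_natural(mu, var):
    """Natural parameters of a univariate Gaussian with mean mu and variance var."""
    return (mu / var, -1.0 / (2.0 * var))

def gaussian_log_normalizer(theta):
    """Log-normalizer F(theta) of the univariate Gaussian exponential family."""
    t1, t2 = theta
    return -t1 * t1 / (4.0 * t2) + 0.5 * math.log(-math.pi / t2)

def bhattacharyya_gaussian(mu1, var1, mu2, var2):
    """Standard closed-form Bhattacharyya distance between univariate Gaussians."""
    return (0.25 * (mu1 - mu2) ** 2 / (var1 + var2)
            + 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2))))
```

    For two Gaussians, `burbea_rao(gaussian_log_normalizer, gaussian_natural(0.0, 1.0), gaussian_natural(2.0, 4.0))` should agree with `bhattacharyya_gaussian(0.0, 1.0, 2.0, 4.0)` up to floating-point error.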

    Centroid-Based Clustering with ab-Divergences

    Centroid-based clustering is a widely used technique within unsupervised learning algorithms in many research fields. The success of any centroid-based clustering relies on the choice of the similarity measure under use. In recent years, most studies focused on including several divergence measures in the traditional hard k-means algorithm. In this article, we consider the problem of centroid-based clustering using the family of ab-divergences, which is governed by two parameters, a and b. We propose a new iterative algorithm, ab-k-means, giving closed-form solutions for the computation of the sided centroids. The algorithm can be fine-tuned by means of this pair of values, yielding a wide range of the most frequently used divergences. Moreover, it is guaranteed to converge to local minima for a wide range of values of the pair (a, b). Our theoretical contribution has been validated by several experiments performed with synthetic and real data and exploring the (a, b) plane. The numerical results obtained confirm the quality of the algorithm and its suitability to be used in several practical applications. MINECO TEC2017-82807-

    Cramer-Rao Lower Bound and Information Geometry

    This article focuses on an important piece of work of the world-renowned Indian statistician, Calyampudi Radhakrishna Rao. In 1945, C. R. Rao (25 years old then) published a pathbreaking paper, which had a profound impact on subsequent statistical research. Comment: To appear in Connected at Infinity II: On the work of Indian mathematicians (R. Bhatia and C.S. Rajan, Eds.), special volume of Texts and Readings In Mathematics (TRIM), Hindustan Book Agency, 201

    Proximity Operators of Discrete Information Divergences

    Information divergences allow one to assess how close two distributions are to each other. Among the large panel of available measures, special attention has been paid to convex φ-divergences, such as the Kullback-Leibler, Jeffreys-Kullback, Hellinger, Chi-Square, Renyi, and I_α divergences. While φ-divergences have been extensively studied in convex analysis, their use in optimization problems often remains challenging. In this regard, one of the main shortcomings of existing methods is that the minimization of φ-divergences is usually performed with respect to one of their arguments, possibly within alternating optimization techniques. In this paper, we overcome this limitation by deriving new closed-form expressions for the proximity operator of such two-variable functions. This makes it possible to employ standard proximal methods for efficiently solving a wide range of convex optimization problems involving φ-divergences. In addition, we show that these proximity operators are useful to compute the epigraphical projection of several functions of practical interest. The proposed proximal tools are numerically validated in the context of optimal query execution within database management systems, where the problem of selectivity estimation plays a central role. Experiments are carried out on small to large scale scenarios.

    Centroid-Based Clustering with αβ-Divergences

    Article number 196. Centroid-based clustering is a widely used technique within unsupervised learning algorithms in many research fields. The success of any centroid-based clustering relies on the choice of the similarity measure under use. In recent years, most studies focused on including several divergence measures in the traditional hard k-means algorithm. In this article, we consider the problem of centroid-based clustering using the family of αβ-divergences, which is governed by two parameters, α and β. We propose a new iterative algorithm, αβ-k-means, giving closed-form solutions for the computation of the sided centroids. The algorithm can be fine-tuned by means of this pair of values, yielding a wide range of the most frequently used divergences. Moreover, it is guaranteed to converge to local minima for a wide range of values of the pair (α, β). Our theoretical contribution has been validated by several experiments performed with synthetic and real data and exploring the (α, β) plane. The numerical results obtained confirm the quality of the algorithm and its suitability to be used in several practical applications. Ministerio de Economía y Competitividad de España (MINECO) TEC2017-82807-
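
    To illustrate how a single pair (α, β) selects different familiar divergences, here is a minimal sketch of an αβ-divergence between positive vectors. It assumes the commonly used definition of the AB-divergence for α, β and α + β all nonzero, which the abstract itself does not spell out; the function name `ab_divergence` is hypothetical, and the sided-centroid updates of the article are not reproduced here.

```python
def ab_divergence(p, q, alpha, beta):
    """AB-divergence between positive vectors p and q (assumed standard form).

    D(p || q) = -1/(alpha*beta) * sum( p^alpha * q^beta
                                       - alpha/(alpha+beta) * p^(alpha+beta)
                                       - beta/(alpha+beta)  * q^(alpha+beta) )
    Only the generic case alpha != 0, beta != 0, alpha + beta != 0 is handled;
    the limit cases need separate formulas.
    """
    if alpha == 0 or beta == 0 or alpha + beta == 0:
        raise ValueError("limit cases (alpha, beta or alpha + beta equal to 0) are not handled")
    s = alpha + beta
    total = sum(
        a ** alpha * b ** beta - (alpha / s) * a ** s - (beta / s) * b ** s
        for a, b in zip(p, q)
    )
    return -total / (alpha * beta)
```

    Under this definition, (α, β) = (1, 1) gives half the squared Euclidean distance, and taking α = 1 with β close to 0 approaches the generalized Kullback-Leibler divergence.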

    Revisiting Chernoff Information with Likelihood Ratio Exponential Families

    The Chernoff information between two probability measures is a statistical divergence measuring their deviation defined as their maximally skewed Bhattacharyya distance. Although the Chernoff information was originally introduced for bounding the Bayes error in statistical hypothesis testing, the divergence found many other applications due to its empirical robustness property found in applications ranging from information fusion to quantum information. From the viewpoint of information theory, the Chernoff information can also be interpreted as a minmax symmetrization of the Kullback-Leibler divergence. In this paper, we first revisit the Chernoff information between two densities of a measurable Lebesgue space by considering the exponential families induced by their geometric mixtures: the so-called likelihood ratio exponential families. Second, we show how to (i) solve exactly the Chernoff information between any two univariate Gaussian distributions or get a closed-form formula using symbolic computing, (ii) report a closed-form formula of the Chernoff information of centered Gaussians with scaled covariance matrices and (iii) use a fast numerical scheme to approximate the Chernoff information between any two multivariate Gaussian distributions. Comment: 41 pages
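
    The definition in the first sentence of this abstract, the maximally skewed Bhattacharyya distance, can be sketched for univariate Gaussians as follows. This is a generic illustration and not the paper's exact or fast scheme: it assumes the skewed Bhattacharyya distance of two members of an exponential family with log-normalizer F equals αF(θ_p) + (1-α)F(θ_q) - F(αθ_p + (1-α)θ_q), which is concave in α, and maximizes it over [0, 1] by ternary search. All function names are hypothetical.

```python
import math

def gaussian_natural(mu, var):
    """Natural parameters of a univariate Gaussian with mean mu and variance var."""
    return (mu / var, -1.0 / (2.0 * var))

def gaussian_log_normalizer(theta):
    """Log-normalizer F(theta) of the univariate Gaussian exponential family."""
    t1, t2 = theta
    return -t1 * t1 / (4.0 * t2) + 0.5 * math.log(-math.pi / t2)

def skewed_bhattacharyya(F, theta_p, theta_q, alpha):
    """Skewed Jensen difference of F, i.e. -log of the integral of p^alpha * q^(1-alpha)."""
    mix = tuple(alpha * a + (1.0 - alpha) * b for a, b in zip(theta_p, theta_q))
    return alpha * F(theta_p) + (1.0 - alpha) * F(theta_q) - F(mix)

def chernoff_information(F, theta_p, theta_q, iters=200):
    """Maximize the (concave) skewed distance over alpha in [0, 1] by ternary search.

    Returns the pair (Chernoff information, maximizing alpha).
    """
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if skewed_bhattacharyya(F, theta_p, theta_q, m1) < skewed_bhattacharyya(F, theta_p, theta_q, m2):
            lo = m1  # the maximizer lies to the right of m1
        else:
            hi = m2  # the maximizer lies to the left of m2
    alpha = 0.5 * (lo + hi)
    return skewed_bhattacharyya(F, theta_p, theta_q, alpha), alpha
```

    For two Gaussians with equal variance, the optimal skew is 1/2 and the value reduces to (μ1 - μ2)² / (8σ²), which gives a simple sanity check for the search.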