The Burbea-Rao and Bhattacharyya centroids
We study the centroid with respect to the class of information-theoretic
Burbea-Rao divergences that generalize the celebrated Jensen-Shannon divergence
by measuring the non-negative Jensen difference induced by a strictly convex
and differentiable function. Although those Burbea-Rao divergences are
symmetric by construction, they are not metric since they fail to satisfy the
triangle inequality. We first explain how a particular symmetrization of
Bregman divergences called Jensen-Bregman distances yields exactly those
Burbea-Rao divergences. We then proceed by defining skew Burbea-Rao
divergences, and show that skew Burbea-Rao divergences amount, in limit cases, to computing Bregman divergences. We then prove that Burbea-Rao centroids are unique and can be approximated arbitrarily finely by a generic iterative concave-convex optimization algorithm with a guaranteed convergence property. In
the second part of the paper, we consider the Bhattacharyya distance, which is commonly used to measure the degree of overlap between probability distributions. We
show that Bhattacharyya distances on members of the same statistical
exponential family amount to calculating a Burbea-Rao divergence in disguise.
Thus we get an efficient algorithm for computing the Bhattacharyya centroid of
a set of parametric distributions belonging to the same exponential family,
improving over former specialized methods found in the literature that were
limited to univariate or "diagonal" multivariate Gaussians. To illustrate the
performance of our Bhattacharyya/Burbea-Rao centroid algorithm, we present
experimental performance results for k-means and hierarchical clustering methods on Gaussian mixture models.
Comment: 13 pages
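For reference, the main quantities discussed above can be written out explicitly; the formulas below follow the standard definitions of (skew) Burbea-Rao/Jensen divergences and the Bhattacharyya distance, and the exact notation in the paper may differ:

```latex
% Burbea-Rao (Jensen) divergence induced by a strictly convex, differentiable F:
J_F(p, q) = \frac{F(p) + F(q)}{2} - F\!\left(\frac{p + q}{2}\right) \geq 0.

% Skew Burbea-Rao divergence with skew parameter \alpha \in (0, 1); after rescaling
% by 1/(\alpha(1 - \alpha)), the limits \alpha \to 0 and \alpha \to 1 yield Bregman divergences:
J_F^{\alpha}(p, q) = \alpha F(p) + (1 - \alpha) F(q) - F\big(\alpha p + (1 - \alpha) q\big).

% Bhattacharyya distance; for two members p_{\theta_1}, p_{\theta_2} of the same
% exponential family with log-normalizer F, it equals the Burbea-Rao divergence
% evaluated on the natural parameters:
B(p, q) = -\log \int \sqrt{p(x)\, q(x)}\, \mathrm{d}x, \qquad
B(p_{\theta_1}, p_{\theta_2}) = J_F(\theta_1, \theta_2).
```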
Centroid-Based Clustering with αβ-Divergences
Centroid-based clustering is a widely used technique within unsupervised learning
algorithms in many research fields. The success of any centroid-based clustering relies on the
choice of the similarity measure in use. In recent years, most studies have focused on including several
divergence measures in the traditional hard k-means algorithm. In this article, we consider the
problem of centroid-based clustering using the family of αβ-divergences, which is governed by two parameters, α and β. We propose a new iterative algorithm, αβ-k-means, giving closed-form solutions
for the computation of the sided centroids. The algorithm can be fine-tuned by means of this pair of
values, yielding a wide range of the most frequently used divergences. Moreover, it is guaranteed to
converge to local minima for a wide range of values of the pair (α, β). Our theoretical contribution
has been validated by several experiments performed with synthetic and real data and exploring the
(α, β) plane. The numerical results obtained confirm the quality of the algorithm and its suitability to be used in several practical applications.
MINECO TEC2017-82807-
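As a rough illustration of the kind of procedure described above, here is a minimal Lloyd-style centroid-based clustering loop with a pluggable divergence. This is a sketch of my own: the function names (`kl_divergence`, `divergence_kmeans`) are hypothetical, the generalized KL divergence merely stands in for the general αβ-divergence, and the centroid update uses an arithmetic mean rather than the closed-form sided centroids derived in the article.

```python
import numpy as np

def kl_divergence(x, c, eps=1e-12):
    """Generalized Kullback-Leibler divergence D(x || c), one of the divergences
    recovered by the alpha-beta family for particular parameter choices."""
    x = x + eps
    c = c + eps
    return np.sum(x * np.log(x / c) - x + c, axis=-1)

def divergence_kmeans(X, k, divergence=kl_divergence, n_iter=50, seed=0):
    """Hard k-means-style clustering where assignments use divergence(x, centroid).
    Centroids are updated as cluster means, which is exact only for some divergences;
    the paper instead derives closed-form sided centroids for each (alpha, beta)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assignment step: each point goes to the centroid it is closest to
        # under the chosen divergence.
        dists = np.stack([divergence(X, c) for c in centroids], axis=1)
        labels = dists.argmin(axis=1)
        # Update step: recompute each centroid from its assigned points.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Example usage on random nonnegative data (e.g., histograms).
X = np.abs(np.random.default_rng(1).normal(size=(200, 5)))
labels, centroids = divergence_kmeans(X, k=3)
```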
Cramér-Rao Lower Bound and Information Geometry
This article focuses on an important piece of work of the world renowned
Indian statistician, Calyampudi Radhakrishna Rao. In 1945, C. R. Rao (then 25 years old) published a pathbreaking paper, which had a profound impact on subsequent statistical research.
Comment: To appear in Connected at Infinity II: On the work of Indian
mathematicians (R. Bhatia and C.S. Rajan, Eds.), special volume of Texts and
Readings In Mathematics (TRIM), Hindustan Book Agency, 201
Proximity Operators of Discrete Information Divergences
Information divergences allow one to assess how close two distributions are
from each other. Among the large panel of available measures, special attention has been paid to convex φ-divergences, such as the Kullback-Leibler, Jeffreys-Kullback, Hellinger, Chi-square, Rényi, and
I divergences. While φ-divergences have been extensively
studied in convex analysis, their use in optimization problems often remains
challenging. In this regard, one of the main shortcomings of existing methods
is that the minimization of φ-divergences is usually performed with
respect to one of their arguments, possibly within alternating optimization
techniques. In this paper, we overcome this limitation by deriving new
closed-form expressions for the proximity operator of such two-variable
functions. This makes it possible to employ standard proximal methods for
efficiently solving a wide range of convex optimization problems involving
φ-divergences. In addition, we show that these proximity operators are
useful to compute the epigraphical projection of several functions of practical
interest. The proposed proximal tools are numerically validated in the context
of optimal query execution within database management systems, where the
problem of selectivity estimation plays a central role. Experiments are carried
out on small- to large-scale scenarios.
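To make the notion concrete, here is a small numerical sketch of the proximity operator of a two-variable divergence, prox_{γD}(u, v) = argmin_{(p,q)} D(p, q) + (1/2γ)(||p − u||² + ||q − v||²), evaluated for the generalized Kullback-Leibler divergence by generic constrained optimization. This is only an illustrative check under my own naming (`kl`, `prox_kl`); the paper derives closed-form expressions rather than relying on a numerical solver.

```python
import numpy as np
from scipy.optimize import minimize

def kl(p, q, eps=1e-12):
    """Generalized Kullback-Leibler divergence D(p, q), jointly convex in (p, q)."""
    p, q = p + eps, q + eps
    return np.sum(p * np.log(p / q) - p + q)

def prox_kl(u, v, gamma=1.0):
    """Numerically approximate prox_{gamma*KL}(u, v): the joint proximity operator
    of the two-variable divergence, as used inside standard proximal algorithms."""
    n = len(u)

    def objective(z):
        p, q = z[:n], z[n:]
        return kl(p, q) + (np.sum((p - u) ** 2) + np.sum((q - v) ** 2)) / (2 * gamma)

    z0 = np.concatenate([np.maximum(u, 1e-3), np.maximum(v, 1e-3)])
    bounds = [(1e-9, None)] * (2 * n)  # keep both arguments in the positive orthant
    res = minimize(objective, z0, bounds=bounds, method="L-BFGS-B")
    return res.x[:n], res.x[n:]

# Example usage: joint proximal step on a pair of nonnegative vectors.
u = np.array([0.2, 1.5, 0.7])
v = np.array([0.4, 1.0, 0.9])
p, q = prox_kl(u, v, gamma=0.5)
```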
Revisiting Chernoff Information with Likelihood Ratio Exponential Families
The Chernoff information between two probability measures is a statistical divergence measuring their deviation, defined as their maximally skewed Bhattacharyya distance. Although the Chernoff information was originally
introduced for bounding the Bayes error in statistical hypothesis testing, the
divergence found many other applications due to its empirical robustness
property found in applications ranging from information fusion to quantum
information. From the viewpoint of information theory, the Chernoff information
can also be interpreted as a minmax symmetrization of the Kullback-Leibler
divergence. In this paper, we first revisit the Chernoff information between
two densities of a measurable Lebesgue space by considering the exponential
families induced by their geometric mixtures: The so-called likelihood ratio
exponential families. Second, we show how to (i) solve exactly for the Chernoff information between any two univariate Gaussian distributions or obtain a closed-form formula using symbolic computing, (ii) report a closed-form formula
of the Chernoff information of centered Gaussians with scaled covariance
matrices and (iii) use a fast numerical scheme to approximate the Chernoff
information between any two multivariate Gaussian distributions.
Comment: 41 pages
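As a companion to the abstract, here is a minimal numerical sketch (my own illustration, not the paper's exact or symbolic solution) that approximates the Chernoff information C(p, q) = max_{α∈(0,1)} −log ∫ p(x)^α q(x)^{1−α} dx between two univariate Gaussians by combining numerical integration with a one-dimensional search over the skew parameter α; the function names are assumptions introduced here.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar
from scipy.stats import norm

def skewed_bhattacharyya(p, q, alpha):
    """Skewed Bhattacharyya distance B_alpha(p, q) = -log int p^alpha q^(1-alpha) dx,
    evaluated by numerical integration for two frozen scipy.stats distributions."""
    integrand = lambda x: p.pdf(x) ** alpha * q.pdf(x) ** (1.0 - alpha)
    coeff, _ = quad(integrand, -np.inf, np.inf)
    return -np.log(coeff)

def chernoff_information(p, q):
    """Chernoff information: maximally skewed Bhattacharyya distance over alpha in (0, 1)."""
    res = minimize_scalar(
        lambda a: -skewed_bhattacharyya(p, q, a),  # maximize by minimizing the negative
        bounds=(1e-6, 1 - 1e-6),
        method="bounded",
    )
    return -res.fun, res.x  # (Chernoff information, optimal skew alpha*)

# Example usage with two univariate Gaussians.
p = norm(loc=0.0, scale=1.0)
q = norm(loc=2.0, scale=3.0)
value, alpha_star = chernoff_information(p, q)
print(f"Chernoff information = {value:.4f} at alpha* = {alpha_star:.3f}")
```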