The Bregman chord divergence
Distances are fundamental primitives whose choice significantly impacts the
performance of algorithms in machine learning and signal processing. However,
selecting the most appropriate distance for a given task is a challenging endeavor.
Instead of testing one by one the entries of an ever-expanding dictionary of
ad hoc distances, one rather prefers to consider parametric classes of
distances that are exhaustively characterized by axioms derived from first
principles. Bregman divergences are such a class. However, fine-tuning a Bregman
divergence is delicate since it requires smoothly adjusting a functional
generator. In this work, we propose an extension of Bregman divergences called
the Bregman chord divergences. This new class of distances does not require
gradient calculations, uses two scalar parameters that can be easily tailored
in applications, and asymptotically generalizes Bregman divergences.
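To make the contrast concrete, here is a minimal numerical sketch in Python of the chord idea, under an illustrative parameterization of our own: the standard Bregman divergence needs the gradient of the generator F, whereas the chord variant only evaluates F along the segment joining the two points, with two scalar parameters alpha and beta placing the chord (the function names and the exact placement of the chord are assumptions, not the paper's verbatim definition).

    import numpy as np

    def bregman(F, grad_F, p, q):
        # Standard Bregman divergence: B_F(p:q) = F(p) - F(q) - <p - q, grad F(q)>.
        return F(p) - F(q) - np.dot(p - q, grad_F(q))

    def bregman_chord(F, p, q, alpha=0.01, beta=0.02):
        # Gradient-free chord variant (illustrative): restrict F to the segment
        # g(t) = F(q + t (p - q)), replace the tangent of g at t = 0 by the chord
        # of g through t = alpha and t = beta, and measure the vertical gap at
        # t = 1. As alpha, beta -> 0 the chord tends to the tangent and the gap
        # tends to the standard Bregman divergence B_F(p:q).
        g = lambda t: F(q + t * (p - q))
        slope = (g(beta) - g(alpha)) / (beta - alpha)  # chord slope, no gradient
        return g(1.0) - g(alpha) - (1.0 - alpha) * slope

    # Example with the squared Euclidean generator F(x) = ||x||^2, whose Bregman
    # divergence is the squared Euclidean distance ||p - q||^2.
    F = lambda x: float(np.dot(x, x))
    grad_F = lambda x: 2.0 * x
    p, q = np.array([1.0, 2.0]), np.array([0.0, 0.5])
    print(bregman(F, grad_F, p, q))  # exact: 3.25
    print(bregman_chord(F, p, q))    # close to 3.25; exact in the limit alpha, beta -> 0

The chord gap is non-negative for a convex generator, since a convex curve lies above any of its chords extended beyond the two contact points.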
On a generalization of the Jensen-Shannon divergence and the JS-symmetrization of distances relying on abstract means
The Jensen-Shannon divergence is a renowned bounded symmetrization of the
unbounded Kullback-Leibler divergence which measures the total Kullback-Leibler
divergence to the average mixture distribution. However, the Jensen-Shannon
divergence between Gaussian distributions is not available in closed form. To
bypass this problem, we present a generalization of the Jensen-Shannon (JS)
divergence using abstract means which yields closed-form expressions when the
mean is chosen according to the parametric family of distributions. More
generally, we define the JS-symmetrizations of any distance using generalized
statistical mixtures derived from abstract means. In particular, we first show
that the geometric mean is well-suited for exponential families, and report two
closed-form formulas for (i) the geometric Jensen-Shannon divergence between
probability densities of the same exponential family, and (ii) the geometric
JS-symmetrization of the reverse Kullback-Leibler divergence. As a second
illustrative example, we show that the harmonic mean is well-suited for
scale Cauchy distributions, and report a closed-form formula for the harmonic
Jensen-Shannon divergence between scale Cauchy distributions. We also define
generalized Jensen-Shannon divergences between matrices (e.g., quantum
Jensen-Shannon divergences) and consider clustering with respect to these novel
Jensen-Shannon divergences.
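For reference, the classical Jensen-Shannon divergence and the abstract-mean generalization described above can be written as follows; the notation $(pq)^M$ for the normalized $M$-mixture is our shorthand, and the weighting convention is a plausible reading of the abstract rather than the paper's verbatim definition:

\[ \mathrm{JS}(p:q) = \frac{1}{2}\,\mathrm{KL}\Big(p : \frac{p+q}{2}\Big) + \frac{1}{2}\,\mathrm{KL}\Big(q : \frac{p+q}{2}\Big), \]

and, for an abstract mean $M$,

\[ \mathrm{JS}^{M}(p:q) = \frac{1}{2}\,\mathrm{KL}\big(p : (pq)^{M}\big) + \frac{1}{2}\,\mathrm{KL}\big(q : (pq)^{M}\big), \qquad (pq)^{M}(x) = \frac{M\big(p(x), q(x)\big)}{\int M\big(p(t), q(t)\big)\,\mathrm{d}t}, \]

which recovers the ordinary Jensen-Shannon divergence when $M$ is the arithmetic mean, and yields the geometric and harmonic variants above when $M$ is the geometric or harmonic mean.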
On a generalization of the Jensen-Shannon divergence
The Jensen-Shannon divergence is a renowned bounded symmetrization of the
Kullback-Leibler divergence which does not require probability densities to
have matching supports. In this paper, we introduce a vector-skew
generalization of the scalar $\alpha$-Jensen-Bregman divergences and derive
thereof the vector-skew $\alpha$-Jensen-Shannon divergences. We study the
properties of these novel divergences and show how to build parametric families
of symmetric Jensen-Shannon-type divergences. Finally, we report an iterative
algorithm to numerically compute the Jensen-Shannon-type centroids for a set of
probability densities belonging to a mixture family: This includes the case of
the Jensen-Shannon centroid of a set of categorical distributions or normalized
histograms.
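As a quick numerical illustration of the skewing idea on discrete distributions, the following Python sketch uses one plausible weighting convention (an assumption for illustration; see the paper for the exact definition): each skewed mixture (1 - a) p + a q is compared against the mixture skewed by the weighted average of the skewing parameters.

    import numpy as np

    def kl(p, q):
        # Discrete Kullback-Leibler divergence KL(p:q); assumes q > 0 wherever p > 0.
        mask = p > 0
        return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

    def vector_skew_js(p, q, alphas, weights):
        # Vector-skew JS-type divergence (illustrative convention): weighted sum
        # of the KL divergences from the skewed mixtures (1 - a) p + a q to the
        # mixture skewed by the weight-averaged parameter abar.
        alphas, weights = np.asarray(alphas), np.asarray(weights)
        abar = float(np.dot(weights, alphas))
        mbar = (1 - abar) * p + abar * q
        return sum(w * kl((1 - a) * p + a * q, mbar)
                   for a, w in zip(alphas, weights))

    p = np.array([0.7, 0.2, 0.1])
    q = np.array([0.1, 0.3, 0.6])
    # alphas = (0, 1) with equal weights recovers the ordinary Jensen-Shannon
    # divergence 0.5 KL(p:m) + 0.5 KL(q:m) with m = (p + q) / 2.
    print(vector_skew_js(p, q, alphas=[0.0, 1.0], weights=[0.5, 0.5]))

Such divergences remain bounded whenever the averaged skewing parameter lies strictly inside (0, 1), since each skewed mixture is then dominated by a constant multiple of the averaged mixture.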
Divergence Measures
Data science, information theory, probability theory, statistical learning and other related disciplines greatly benefit from non-negative measures of dissimilarity between pairs of probability measures. These are known as divergence measures, and exploring their mathematical foundations and diverse applications is of significant interest. The present Special Issue, entitled “Divergence Measures: Mathematical Foundations and Applications in Information-Theoretic and Statistical Problems”, includes eight original contributions, and it is focused on the study of the mathematical properties and applications of classical and generalized divergence measures from an information-theoretic perspective. It mainly deals with two key generalizations of the relative entropy: namely, the Rényi divergence and the important class of f-divergences. It is our hope that the readers will find interest in this Special Issue, which will stimulate further research in the study of the mathematical foundations and applications of divergence measures.
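For reference, the two generalizations of the relative entropy mentioned above admit the following standard definitions. For a convex function $f$ with $f(1) = 0$, the $f$-divergence between probability measures $P \ll Q$ is

\[ D_f(P \| Q) = \int f\!\Big(\frac{\mathrm{d}P}{\mathrm{d}Q}\Big)\,\mathrm{d}Q, \]

which recovers the relative entropy (Kullback-Leibler divergence) for $f(t) = t \log t$, while the Rényi divergence of order $\alpha \in (0,1) \cup (1,\infty)$ is

\[ D_\alpha(P \| Q) = \frac{1}{\alpha - 1} \log \int \Big(\frac{\mathrm{d}P}{\mathrm{d}Q}\Big)^{\alpha}\,\mathrm{d}Q, \]

which tends to the relative entropy as $\alpha \to 1$.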
Nonparametric auxiliary information: information geometry, empirical process and applications
The first aim of this thesis was to develop a method for optimally injecting auxiliary information, i.e. information external to the observed statistical experiment. We propose to search for the discrete probability measure that is supported on the sample, satisfies the auxiliary information, and is closest to the empirical measure in the sense of information geometry. We prove that this problem has two solutions, and we show that they admit an explicit, easily computable common approximation, which we define as the informed empirical measure. We then study the properties of this informed empirical measure by establishing non-asymptotic concentration results and asymptotic results of the P-Glivenko-Cantelli and P-Donsker type. The major consequence of exact information is a uniform decrease of the asymptotic variance of the empirical process. The second aim of this thesis was to generalize the informed empirical measure to so-called weak auxiliary information (e.g., an expert's preference). After generalizing the informed empirical measure to such weak auxiliary information, we extend the P-Glivenko-Cantelli and P-Donsker asymptotic results to it. The third aim of this thesis was to study the impact of using false auxiliary information and to design an adaptive procedure that selects the auxiliary information relevant to the estimation of a parameter of interest.
Finally, we conclude this thesis by proposing applications of the informed empirical measure to various problems in statistics and stochastic simulation (use of auxiliary variables, improvement of the Monte Carlo method, informed maximum likelihood, least squares with auxiliary information, learning, etc.). These applications illustrate the positive and relatively immediate impact of auxiliary information.
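As a rough illustration of the kind of construction described (not the thesis's exact estimator), the Python sketch below injects a simple auxiliary moment constraint, here an assumed known population mean, by reweighting the sample: among the probability measures supported on the sample whose weighted mean matches the auxiliary value, it picks the one closest to the uniform empirical weights in Kullback-Leibler divergence, which leads to exponentially tilted weights.

    import numpy as np

    def informed_weights(x, target_mean, iters=50):
        # KL-projection of the uniform empirical weights onto the probability
        # measures supported on the sample whose mean equals the auxiliary value.
        # The minimizer is an exponential tilting w_i proportional to
        # exp(lam * x_i); Newton iterations pick lam so that the weighted mean
        # matches target_mean (the derivative of the weighted mean in lam is
        # the weighted variance).
        lam = 0.0
        for _ in range(iters):
            w = np.exp(lam * x)
            w /= w.sum()
            mean = float(np.dot(w, x))
            var = float(np.dot(w, (x - mean) ** 2))
            lam += (target_mean - mean) / var
        w = np.exp(lam * x)
        return w / w.sum()

    rng = np.random.default_rng(0)
    x = rng.normal(loc=0.3, scale=1.0, size=200)
    w = informed_weights(x, target_mean=0.3)  # auxiliary information: known mean
    print(float(np.dot(w, x)))                # matches 0.3 up to solver tolerance
    # Any functional of the informed measure, e.g. its variance, is re-estimated
    # with these weights.
    print(float(np.dot(w, x ** 2) - np.dot(w, x) ** 2))

The informed measure keeps the sample as its support but shifts the mass so that the constraint holds exactly, which is one concrete way to make the "injection" of auxiliary information precise.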