115 research outputs found
Guaranteed bounds on the Kullback-Leibler divergence of univariate mixtures using piecewise log-sum-exp inequalities
Information-theoretic measures such as the entropy, cross-entropy and the
Kullback-Leibler divergence between two mixture models is a core primitive in
many signal processing tasks. Since the Kullback-Leibler divergence of mixtures
provably does not admit a closed-form formula, it is in practice either
estimated using costly Monte-Carlo stochastic integration, approximated, or
bounded using various techniques. We present a fast and generic method that
builds algorithmically closed-form lower and upper bounds on the entropy, the
cross-entropy and the Kullback-Leibler divergence of mixtures. We illustrate
the versatile method by reporting on our experiments for approximating the
Kullback-Leibler divergence between univariate exponential mixtures, Gaussian
mixtures, Rayleigh mixtures, and Gamma mixtures.Comment: 20 pages, 3 figure
On a generalization of the Jensen-Shannon divergence
The Jensen-Shannon divergence is a renown bounded symmetrization of the
Kullback-Leibler divergence which does not require probability densities to
have matching supports. In this paper, we introduce a vector-skew
generalization of the scalar -Jensen-Bregman divergences and derive
thereof the vector-skew -Jensen-Shannon divergences. We study the
properties of these novel divergences and show how to build parametric families
of symmetric Jensen-Shannon-type divergences. Finally, we report an iterative
algorithm to numerically compute the Jensen-Shannon-type centroids for a set of
probability densities belonging to a mixture family: This includes the case of
the Jensen-Shannon centroid of a set of categorical distributions or normalized
histograms.Comment: 19 pages, 3 figure
-MLE: A fast algorithm for learning statistical mixture models
We describe -MLE, a fast and efficient local search algorithm for learning
finite statistical mixtures of exponential families such as Gaussian mixture
models. Mixture models are traditionally learned using the
expectation-maximization (EM) soft clustering technique that monotonically
increases the incomplete (expected complete) likelihood. Given prescribed
mixture weights, the hard clustering -MLE algorithm iteratively assigns data
to the most likely weighted component and update the component models using
Maximum Likelihood Estimators (MLEs). Using the duality between exponential
families and Bregman divergences, we prove that the local convergence of the
complete likelihood of -MLE follows directly from the convergence of a dual
additively weighted Bregman hard clustering. The inner loop of -MLE can be
implemented using any -means heuristic like the celebrated Lloyd's batched
or Hartigan's greedy swap updates. We then show how to update the mixture
weights by minimizing a cross-entropy criterion that implies to update weights
by taking the relative proportion of cluster points, and reiterate the mixture
parameter update and mixture weight update processes until convergence. Hard EM
is interpreted as a special case of -MLE when both the component update and
the weight update are performed successively in the inner loop. To initialize
-MLE, we propose -MLE++, a careful initialization of -MLE guaranteeing
probabilistically a global bound on the best possible complete likelihood.Comment: 31 pages, Extend preliminary paper presented at IEEE ICASSP 201
- …