Bethe Projections for Non-Local Inference
Many inference problems in structured prediction are naturally solved by
augmenting a tractable dependency structure with complex, non-local auxiliary
objectives. This includes the mean field family of variational inference
algorithms, soft- or hard-constrained inference using Lagrangian relaxation or
linear programming, collective graphical models, and forms of semi-supervised
learning such as posterior regularization. We present a method to
discriminatively learn broad families of inference objectives, capturing
powerful non-local statistics of the latent variables, while maintaining
tractable and provably fast inference using non-Euclidean projected gradient
descent with a distance-generating function given by the Bethe entropy. We
demonstrate the performance and flexibility of our method by (1) extracting
structured citations from research papers by learning soft global constraints,
(2) achieving state-of-the-art results on a widely-used handwriting recognition
task using a novel learned non-convex inference procedure, and (3) providing a
fast and highly scalable algorithm for the challenging problem of inference in
a collective graphical model applied to bird migration.
Comment: minor bug fix to appendix; appeared in UAI 2015.
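For orientation, here is a minimal sketch of the optimization primitive named
above: non-Euclidean projected gradient descent with an entropic
distance-generating function, in the simplest setting of the probability
simplex, where the update is the classical exponentiated gradient. The
Bethe-entropy case in the paper works over the local marginal polytope of a
tractable structure instead; the function name and the quadratic test
objective below are illustrative, not the paper's code.

import numpy as np

def mirror_descent_simplex(grad, x0, step=0.1, iters=200):
    """Entropic mirror descent (exponentiated gradient) on the simplex.

    A sketch of non-Euclidean projected gradient descent whose
    distance-generating function is the negative Shannon entropy; the
    Bregman projection under the resulting KL divergence reduces to a
    simple renormalization.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x * np.exp(-step * grad(x))  # multiplicative update
        x /= x.sum()                     # KL (Bregman) projection
    return x

# Illustrative usage: minimize 0.5 * x^T A x over the simplex.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
x_min = mirror_descent_simplex(lambda x: A @ x, x0=np.ones(2) / 2)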
Quasi-concave density estimation
Maximum likelihood estimation of a log-concave probability density is
formulated as a convex optimization problem and shown to have an equivalent
dual formulation as a constrained maximum Shannon entropy problem. Closely
related maximum Rényi entropy estimators that impose weaker concavity
restrictions on the fitted density are also considered, notably a minimum
Hellinger discrepancy estimator that constrains the reciprocal of the
square-root of the density to be concave. A limiting form of these estimators
constrains solutions to the class of quasi-concave densities.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by
the Institute of Mathematical Statistics (http://www.imstat.org); DOI:
http://dx.doi.org/10.1214/10-AOS814.
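For reference, the convex-program form of the log-concave MLE described above,
in generic notation rather than the paper's own: writing the fitted density as
f = e^{-g} with g convex, one solves

\[
\hat g \;=\; \operatorname*{arg\,min}_{g\ \mathrm{convex}}
\ \frac{1}{n}\sum_{i=1}^{n} g(X_i) \;+\; \int e^{-g(x)}\,dx,
\qquad \hat f = e^{-\hat g}.
\]

Adding a constant to g trades the two terms against each other exactly, so any
minimizer automatically satisfies \int e^{-\hat g(x)}\,dx = 1. Roughly
speaking, the Rényi-entropy variants above replace the exponential with a
power-type function, which is how the weaker concavity restrictions arise.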
Revisiting maximum-a-posteriori estimation in log-concave models
Maximum-a-posteriori (MAP) estimation is the main Bayesian estimation
methodology in imaging sciences, where high dimensionality is often addressed
by using Bayesian models that are log-concave and whose posterior mode can be
computed efficiently by convex optimisation. Despite its success and wide
adoption, MAP estimation is not theoretically well understood yet. The
prevalent view in the community is that MAP estimation is not proper Bayesian
estimation in a decision-theoretic sense because it does not minimise a
meaningful expected loss function (unlike the minimum mean squared error (MMSE)
estimator that minimises the mean squared loss). This paper addresses this
theoretical gap by presenting a decision-theoretic derivation of MAP estimation
in Bayesian models that are log-concave. A main novelty is that our analysis is
based on differential geometry, and proceeds as follows. First, we use the
underlying convex geometry of the Bayesian model to induce a Riemannian
geometry on the parameter space. We then use differential geometry to identify
the so-called natural or canonical loss function to perform Bayesian point
estimation in that Riemannian manifold. For log-concave models, this canonical
loss is the Bregman divergence associated with the negative log posterior
density. We then show that the MAP estimator is the only Bayesian estimator
that minimises the expected canonical loss, and that the posterior mean or MMSE
estimator minimises the dual canonical loss. We also study the question of MAP
and MMSE estimation performance at large scale and establish a universal bound
on the expected canonical error as a function of dimension, offering new
insights into the good performance observed in convex problems. These results
provide a new understanding of MAP and MMSE estimation in log-concave settings,
and of the multiple roles that convex geometry plays in imaging problems.
Comment: Accepted for publication in SIAM Imaging Sciences.
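For reference, the canonical loss named above is a Bregman divergence; in
generic notation (the symbols u, x, and p(x|y) are illustrative, not the
paper's):

\[
D_{\phi}(u, x) \;=\; \phi(u) - \phi(x) - \langle \nabla \phi(x),\, u - x
\rangle, \qquad \phi(x) \;=\; -\log p(x \mid y).
\]

Per the abstract, the MAP estimator is the Bayes estimator for this loss,
while the posterior mean (MMSE) estimator minimises the dual canonical loss.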
A Lower Bound on the Entropy Rate for a Large Class of Stationary Processes and its Relation to the Hyperplane Conjecture
We present a new lower bound on the differential entropy rate of stationary
processes whose sequences of probability density functions fulfill certain
regularity conditions. This bound is obtained by showing that the gap between
the differential entropy rate of such a process and the differential entropy
rate of a Gaussian process with the same autocovariance function is bounded.
This result builds on a recent bound on the Kullback-Leibler divergence in
terms of the Wasserstein distance due to Polyanskiy and Wu. Moreover, it is
related to the famous hyperplane conjecture (also known as the slicing
problem) in convex geometry, originally stated by J. Bourgain. Based on an
entropic formulation of the hyperplane conjecture given by Bobkov and Madiman,
we discuss
the relation of our result to the hyperplane conjecture.
Comment: Presented at the 2016 IEEE Information Theory Workshop (ITW),
Cambridge, UK.
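As background for the comparison made above (a standard information-theoretic
fact, not a result of the paper): the differential entropy rate of a
discrete-time stationary Gaussian process is available in closed form from its
power spectral density S(\lambda),

\[
\bar h_{G} \;=\; \tfrac{1}{2}\log(2\pi e)
\;+\; \frac{1}{4\pi}\int_{-\pi}^{\pi} \log S(\lambda)\, d\lambda,
\]

so a bounded gap between the entropy rate of the process and \bar h_{G} turns
this closed form into an explicit lower bound.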
Concentration of the information in data with log-concave distributions
A concentration property of the functional $-\log f(X)$ is demonstrated when a
random vector $X$ has a log-concave density $f$ on $\mathbb{R}^n$. This
concentration property implies in particular an extension of the
Shannon-McMillan-Breiman strong ergodic theorem to the class of discrete-time
stochastic processes with log-concave marginals.
Comment: Published in the Annals of Probability (http://www.imstat.org/aop/)
by the Institute of Mathematical Statistics (http://www.imstat.org); DOI:
http://dx.doi.org/10.1214/10-AOP592.
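To fix notation for the statement above (generic symbols, not the paper's):
the information content of X with density f, and its mean, the differential
entropy, are

\[
\tilde h(X) \;=\; -\log f(X), \qquad \mathbb{E}\,\tilde h(X) \;=\; h(X).
\]

Concentration here means that, for log-concave $f$ on $\mathbb{R}^n$, the
random variable $-\log f(X)$ typically deviates from $h(X)$ only on the order
of $\sqrt{n}$; this order is a reading of the concentration property, with the
sharp form being the paper's result.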