Bethe Projections for Non-Local Inference
Many inference problems in structured prediction are naturally solved by
augmenting a tractable dependency structure with complex, non-local auxiliary
objectives. This includes the mean field family of variational inference
algorithms, soft- or hard-constrained inference using Lagrangian relaxation or
linear programming, collective graphical models, and forms of semi-supervised
learning such as posterior regularization. We present a method to
discriminatively learn broad families of inference objectives, capturing
powerful non-local statistics of the latent variables, while maintaining
tractable and provably fast inference using non-Euclidean projected gradient
descent with a distance-generating function given by the Bethe entropy. We
demonstrate the performance and flexibility of our method by (1) extracting
structured citations from research papers by learning soft global constraints,
(2) achieving state-of-the-art results on a widely-used handwriting recognition
task using a novel learned non-convex inference procedure, and (3) providing a
fast and highly scalable algorithm for the challenging problem of inference in
a collective graphical model applied to bird migration.
Comment: minor bug fix to appendix; appeared in UAI 2015.
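For orientation, here is a minimal sketch of the optimization primitive named
above: non-Euclidean projected gradient descent with an entropic
distance-generating function, in the simplest setting of the probability
simplex, where the update is the classical exponentiated gradient. The
Bethe-entropy case in the paper works over the local marginal polytope of a
tractable structure instead; the function name and the quadratic test
objective below are illustrative, not the paper's code.

import numpy as np

def mirror_descent_simplex(grad, x0, step=0.1, iters=200):
    """Entropic mirror descent (exponentiated gradient) on the simplex.

    A sketch of non-Euclidean projected gradient descent whose
    distance-generating function is the negative Shannon entropy; the
    Bregman projection under the resulting KL divergence reduces to a
    simple renormalization.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x * np.exp(-step * grad(x))  # multiplicative update
        x /= x.sum()                     # KL (Bregman) projection
    return x

# Illustrative usage: minimize 0.5 * x^T A x over the simplex.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
x_min = mirror_descent_simplex(lambda x: A @ x, x0=np.ones(2) / 2)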
Quasi-concave density estimation
Maximum likelihood estimation of a log-concave probability density is
formulated as a convex optimization problem and shown to have an equivalent
dual formulation as a constrained maximum Shannon entropy problem. Closely
related maximum Rényi entropy estimators that impose weaker concavity
restrictions on the fitted density are also considered, notably a minimum
Hellinger discrepancy estimator that constrains the reciprocal of the
square-root of the density to be concave. A limiting form of these estimators
constrains solutions to the class of quasi-concave densities.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by
the Institute of Mathematical Statistics (http://www.imstat.org); DOI:
http://dx.doi.org/10.1214/10-AOS814.
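For reference, the convex-program form of the log-concave MLE described above,
in generic notation rather than the paper's own: writing the fitted density as
f = e^{-g} with g convex, one solves

\[
\hat g \;=\; \operatorname*{arg\,min}_{g\ \mathrm{convex}}
\ \frac{1}{n}\sum_{i=1}^{n} g(X_i) \;+\; \int e^{-g(x)}\,dx,
\qquad \hat f = e^{-\hat g}.
\]

Adding a constant to g trades the two terms against each other exactly, so any
minimizer automatically satisfies \int e^{-\hat g(x)}\,dx = 1. Roughly
speaking, the Rényi-entropy variants above replace the exponential with a
power-type function, which is how the weaker concavity restrictions arise.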
Revisiting maximum-a-posteriori estimation in log-concave models
Maximum-a-posteriori (MAP) estimation is the main Bayesian estimation
methodology in imaging sciences, where high dimensionality is often addressed
by using Bayesian models that are log-concave and whose posterior mode can be
computed efficiently by convex optimisation. Despite its success and wide
adoption, MAP estimation is not theoretically well understood yet. The
prevalent view in the community is that MAP estimation is not proper Bayesian
estimation in a decision-theoretic sense because it does not minimise a
meaningful expected loss function (unlike the minimum mean squared error (MMSE)
estimator that minimises the mean squared loss). This paper addresses this
theoretical gap by presenting a decision-theoretic derivation of MAP estimation
in Bayesian models that are log-concave. A main novelty is that our analysis is
based on differential geometry, and proceeds as follows. First, we use the
underlying convex geometry of the Bayesian model to induce a Riemannian
geometry on the parameter space. We then use differential geometry to identify
the so-called natural or canonical loss function to perform Bayesian point
estimation in that Riemannian manifold. For log-concave models, this canonical
loss is the Bregman divergence associated with the negative log posterior
density. We then show that the MAP estimator is the only Bayesian estimator
that minimises the expected canonical loss, and that the posterior mean or MMSE
estimator minimises the dual canonical loss. We also study the question of MAP
and MMSE estimation performance at large scale and establish a universal bound
on the expected canonical error as a function of dimension, offering new
insights into the good performance observed in convex problems. These results
provide a new understanding of MAP and MMSE estimation in log-concave settings,
and of the multiple roles that convex geometry plays in imaging problems.
Comment: Accepted for publication in SIAM Imaging Sciences.
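For reference, the canonical loss named above is a Bregman divergence; in
generic notation (the symbols u, x, and p(x|y) are illustrative, not the
paper's):

\[
D_{\phi}(u, x) \;=\; \phi(u) - \phi(x) - \langle \nabla \phi(x),\, u - x
\rangle, \qquad \phi(x) \;=\; -\log p(x \mid y).
\]

Per the abstract, the MAP estimator is the Bayes estimator for this loss,
while the posterior mean (MMSE) estimator minimises the dual canonical loss.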
A Lower Bound on the Entropy Rate for a Large Class of Stationary Processes and its Relation to the Hyperplane Conjecture
We present a new lower bound on the differential entropy rate of stationary
processes whose sequences of probability density functions fulfill certain
regularity conditions. This bound is obtained by showing that the gap between
the differential entropy rate of such a process and the differential entropy
rate of a Gaussian process with the same autocovariance function is bounded.
This result builds on a recent bound on the Kullback-Leibler divergence in
terms of the Wasserstein distance due to Polyanskiy and Wu. Moreover, it is
related to the famous hyperplane conjecture (also known as the slicing
problem) in convex geometry, originally stated by J. Bourgain. Based on an
entropic formulation of the hyperplane conjecture given by Bobkov and Madiman,
we discuss
the relation of our result to the hyperplane conjecture.
Comment: Presented at the 2016 IEEE Information Theory Workshop (ITW),
Cambridge, UK.
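As background for the comparison made above (a standard information-theoretic
fact, not a result of the paper): the differential entropy rate of a
discrete-time stationary Gaussian process is available in closed form from its
power spectral density S(\lambda),

\[
\bar h_{G} \;=\; \tfrac{1}{2}\log(2\pi e)
\;+\; \frac{1}{4\pi}\int_{-\pi}^{\pi} \log S(\lambda)\, d\lambda,
\]

so a bounded gap between the entropy rate of the process and \bar h_{G} turns
this closed form into an explicit lower bound.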
Concentration of the information in data with log-concave distributions
A concentration property of the functional $-\log f(X)$ is demonstrated when a
random vector $X$ has a log-concave density $f$ on $\mathbb{R}^n$. This
concentration property implies in particular an extension of the
Shannon-McMillan-Breiman strong ergodic theorem to the class of discrete-time
stochastic processes with log-concave marginals.
Comment: Published in the Annals of Probability (http://www.imstat.org/aop/)
by the Institute of Mathematical Statistics (http://www.imstat.org); DOI:
http://dx.doi.org/10.1214/10-AOP592.
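To fix notation for the statement above (generic symbols, not the paper's):
the information content of X with density f, and its mean, the differential
entropy, are

\[
\tilde h(X) \;=\; -\log f(X), \qquad \mathbb{E}\,\tilde h(X) \;=\; h(X).
\]

Concentration here means that, for log-concave $f$ on $\mathbb{R}^n$, the
random variable $-\log f(X)$ typically deviates from $h(X)$ only on the order
of $\sqrt{n}$; this order is a reading of the concentration property, with the
sharp form being the paper's result.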