Entropic Priors for Discrete Probabilistic Networks and for Mixtures of Gaussians Models
The ongoing, unprecedented exponential explosion of available computing power
has radically transformed the methods of statistical inference. What used to be
the position of a small minority of statisticians, advocating the use of priors
and strict adherence to Bayes' theorem, is now becoming the norm across
disciplines. The evolutionary direction is now clear. The trend is towards more
realistic, flexible and complex likelihoods characterized by an ever increasing
number of parameters. This gives the old question, "What should the prior be?",
a new central importance in the modern Bayesian theory of inference.
Entropic priors provide one answer to the problem of prior selection. The
general definition of an entropic prior has existed since 1988, but it was not
until 1998 that it was found that they provide a new notion of complete
ignorance. This paper re-introduces the family of entropic priors as minimizers
of mutual information between the data and the parameters, as in
[rodriguez98b], but with a small change and a correction. The general formalism
is then applied to two large classes of models: discrete probabilistic networks
and univariate finite mixtures of Gaussians. It is also shown how to perform
inference by efficiently sampling the corresponding posterior distributions.
Comment: 24 pages, 3 figures. Presented at MaxEnt2001, APL Johns Hopkins
University, August 4-9 2001. See also http://omega.albany.edu:8008
D3PO - Denoising, Deconvolving, and Decomposing Photon Observations
The analysis of astronomical images is a non-trivial task. The D3PO algorithm
addresses the inference problem of denoising, deconvolving, and decomposing
photon observations. Its primary goal is the simultaneous but individual
reconstruction of the diffuse and point-like photon flux given a single photon
count image, where the fluxes are superimposed. In order to discriminate
between these morphologically different signal components, a probabilistic
algorithm is derived in the language of information field theory based on a
hierarchical Bayesian parameter model. The signal inference exploits prior
information on the spatial correlation structure of the diffuse component and
the brightness distribution of the spatially uncorrelated point-like sources. A
maximum a posteriori solution and a solution minimizing the Gibbs free energy
of the inference problem using variational Bayesian methods are discussed.
Since the derivation of the solution is not dependent on the underlying
position space, the implementation of the D3PO algorithm uses the NIFTY package
to ensure applicability to various spatial grids and at any resolution. The
fidelity of the algorithm is validated by the analysis of simulated data,
including a realistic high energy photon count image showing a 32 x 32 arcmin^2
observation with a spatial resolution of 0.1 arcmin. In all tests the D3PO
algorithm successfully denoised, deconvolved, and decomposed the data into a
diffuse and a point-like signal estimate for the respective photon flux
components.
Comment: 22 pages, 8 figures, 2 tables, accepted by Astronomy & Astrophysics;
refereed version, 1 figure added, results unchanged, software available at
http://www.mpa-garching.mpg.de/ift/d3po
Variational Bayesian multinomial probit regression with Gaussian process priors
It is well known in the statistics literature that augmenting binary and polychotomous response models with Gaussian latent variables enables exact Bayesian analysis via Gibbs sampling from the parameter posterior. By adopting such a data augmentation strategy, dispensing with priors over regression coefficients in favour of Gaussian Process (GP) priors over functions, and employing variational approximations to the full posterior, we obtain efficient computational methods for Gaussian Process classification in the multi-class setting. The model augmentation with additional latent variables ensures full a posteriori class coupling whilst retaining the simple a priori independent GP covariance structure from which sparse approximations, such as multi-class Informative Vector Machines (IVM), emerge in a very natural and straightforward manner. This is the first time that a fully Variational Bayesian treatment for multi-class GP classification has been developed without having to resort to additional explicit approximations to the non-Gaussian likelihood term. Empirical comparisons with exact analysis via MCMC and Laplace approximations illustrate the utility of the variational approximation as a computationally economic alternative to full MCMC, and it is shown to be more accurate than the Laplace approximation.
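The data augmentation idea mentioned at the start of this abstract can be illustrated with a minimal sketch. It is not the paper's variational GP method: it is the simpler, well-known latent-variable Gibbs sampler for a binary probit model with a single regression coefficient. The function names, the N(0, prior_var) prior on the coefficient, and the simulated data are all assumptions made for this illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def sample_truncated_normal(mean, positive, rng):
    """Draw z ~ N(mean, 1) restricted to z > 0 (positive) or z <= 0."""
    p0 = STD_NORMAL.cdf(-mean)  # P(z <= 0) for z ~ N(mean, 1)
    u = rng.uniform(p0, 1.0) if positive else rng.uniform(0.0, p0)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf strictly inside (0, 1)
    return mean + STD_NORMAL.inv_cdf(u)

def gibbs_probit(x, y, n_iter=1000, burn_in=200, prior_var=100.0, seed=0):
    """Gibbs sampler for y_i = 1[z_i > 0], z_i ~ N(x_i * beta, 1).

    Assumed prior (illustrative): beta ~ N(0, prior_var).
    Returns the post-burn-in draws of beta.
    """
    rng = random.Random(seed)
    beta = 0.0
    # The conditional variance of beta given z does not depend on z.
    post_var = 1.0 / (sum(xi * xi for xi in x) + 1.0 / prior_var)
    draws = []
    for it in range(n_iter):
        # Step 1: latent Gaussians given beta and the observed labels.
        z = [sample_truncated_normal(xi * beta, yi == 1, rng)
             for xi, yi in zip(x, y)]
        # Step 2: beta given z is Gaussian (conjugate linear regression).
        post_mean = post_var * sum(xi * zi for xi, zi in zip(x, z))
        beta = rng.gauss(post_mean, post_var ** 0.5)
        if it >= burn_in:
            draws.append(beta)
    return draws

def simulate(n, beta_true, seed):
    """Simulate hypothetical probit data with a known coefficient."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1 if xi * beta_true + rng.gauss(0.0, 1.0) > 0 else 0 for xi in x]
    return x, y

if __name__ == "__main__":
    x, y = simulate(200, 1.0, seed=1)
    draws = gibbs_probit(x, y)
    print("posterior mean of beta:", sum(draws) / len(draws))
```

Both conditional distributions are available in closed form once the latent variables are introduced, which is what makes the sampler exact in the sense the abstract describes. The multi-class and GP extensions discussed in the abstract are outside the scope of this sketch.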
Revisiting maximum-a-posteriori estimation in log-concave models
Maximum-a-posteriori (MAP) estimation is the main Bayesian estimation
methodology in imaging sciences, where high dimensionality is often addressed
by using Bayesian models that are log-concave and whose posterior mode can be
computed efficiently by convex optimisation. Despite its success and wide
adoption, MAP estimation is not theoretically well understood yet. The
prevalent view in the community is that MAP estimation is not proper Bayesian
estimation in a decision-theoretic sense because it does not minimise a
meaningful expected loss function (unlike the minimum mean squared error (MMSE)
estimator that minimises the mean squared loss). This paper addresses this
theoretical gap by presenting a decision-theoretic derivation of MAP estimation
in Bayesian models that are log-concave. A main novelty is that our analysis is
based on differential geometry, and proceeds as follows. First, we use the
underlying convex geometry of the Bayesian model to induce a Riemannian
geometry on the parameter space. We then use differential geometry to identify
the so-called natural or canonical loss function to perform Bayesian point
estimation in that Riemannian manifold. For log-concave models, this canonical
loss is the Bregman divergence associated with the negative log posterior
density. We then show that the MAP estimator is the only Bayesian estimator
that minimises the expected canonical loss, and that the posterior mean or MMSE
estimator minimises the dual canonical loss. We also study the question of MAP
and MMSE estimation performance at large scales and establish a universal bound
on the expected canonical error as a function of dimension, offering new
insights into the good performance observed in convex problems. These results
provide a new understanding of MAP and MMSE estimation in log-concave settings,
and of the multiple roles that convex geometry plays in imaging problems.
Comment: Accepted for publication in SIAM Imaging Science
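As a small illustration of the Bregman divergence named in this abstract, the sketch below evaluates D_phi(u, v) = phi(u) - phi(v) - <grad phi(v), u - v> for a hypothetical Gaussian posterior, where phi is the negative log posterior density up to a constant. The argument order, the diagonal precision, and all names are assumptions for this example and are not taken from the paper. In this special case the divergence reduces to half the squared Mahalanobis distance, which is symmetric, consistent with the posterior mode and posterior mean coinciding for a Gaussian.

```python
def bregman(phi, grad_phi, u, v):
    """Bregman divergence D_phi(u, v) = phi(u) - phi(v) - <grad phi(v), u - v>."""
    inner = sum(g * (ui - vi) for g, ui, vi in zip(grad_phi(v), u, v))
    return phi(u) - phi(v) - inner

# Hypothetical Gaussian posterior with diagonal precision (assumed values):
# phi(x) = 0.5 * sum_i q_i * (x_i - m_i)^2, up to an additive constant.
PRECISION = [2.0, 0.5]
MEAN = [1.0, -1.0]

def phi(x):
    """Negative log posterior density of the assumed Gaussian (constant dropped)."""
    return 0.5 * sum(q * (xi - m) ** 2 for q, xi, m in zip(PRECISION, x, MEAN))

def grad_phi(x):
    """Gradient of phi."""
    return [q * (xi - m) for q, xi, m in zip(PRECISION, x, MEAN)]

def half_mahalanobis(u, v):
    """0.5 * (u - v)^T Q (u - v) for the diagonal precision Q above."""
    return 0.5 * sum(q * (ui - vi) ** 2 for q, ui, vi in zip(PRECISION, u, v))

if __name__ == "__main__":
    u, v = [0.3, 2.0], [1.5, -0.5]
    print(bregman(phi, grad_phi, u, v), half_mahalanobis(u, v))
```

For a non-quadratic log-concave phi the divergence is generally asymmetric in its two arguments, which is why a loss and its dual can single out different estimators.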