Densities of almost-surely terminating probabilistic programs are differentiable almost everywhere
We study the differential properties of higher-order statistical
probabilistic programs with recursion and conditioning. Our starting point is
an open problem posed by Hongseok Yang: what class of statistical probabilistic
programs have densities that are differentiable almost everywhere? To formalise
the problem, we consider Statistical PCF (SPCF), an extension of call-by-value
PCF with real numbers, and constructs for sampling and conditioning. We give
SPCF a sampling-style operational semantics à la Borgström et al., and study
the associated weight (commonly referred to as the density) function and value
function on the set of possible execution traces. Our main result is that
almost-surely terminating SPCF programs, generated from a set of primitive
functions (e.g. the set of analytic functions) satisfying mild closure
properties, have weight and value functions that are almost-everywhere
differentiable. We use a stochastic form of symbolic execution to reason about
almost-everywhere differentiability. A by-product of this work is that
almost-surely terminating deterministic (S)PCF programs with real parameters
denote functions that are almost-everywhere differentiable. Our result is of
practical interest, as almost-everywhere differentiability of the density
function is required to hold for the correctness of major gradient-based
inference algorithms.
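To make the weight and value functions on execution traces concrete, here is a minimal Python sketch. It is not SPCF and not the authors' semantics: the toy program, the function names `run` and `partial_weight`, and the convention that unused trace entries are ignored are all assumptions made for illustration. The program repeatedly draws u from the trace, scores by exp(-u), and stops as soon as u < 0.5, so it terminates almost surely.

```python
import math

def run(trace):
    """Run a toy sample/score program on a fixed trace of draws from [0, 1].

    Returns (value, weight), or None if the trace is too short for the
    program to terminate. Extra trace entries are ignored (an assumption).
    """
    value = 0.0
    weight = 1.0
    for u in trace:
        value += u                 # the returned value accumulates the draws
        weight *= math.exp(-u)     # conditioning step: score(exp(-u))
        if u < 0.5:                # branch on a sampled real number
            return value, weight
    return None                    # ran out of samples before terminating

def partial_weight(trace, index, h=1e-6):
    """Central finite difference of the weight in one trace coordinate."""
    up = list(trace)
    down = list(trace)
    up[index] += h
    down[index] -= h
    return (run(up)[1] - run(down)[1]) / (2.0 * h)

value, weight = run([0.7, 0.2])
slope = partial_weight([0.7, 0.2], 0)
```

Away from the branch boundary u = 0.5 the weight is a smooth function of the trace, so the finite difference is well behaved; the only points where differentiability can fail are those where some draw equals 0.5 exactly, which form a set of measure zero.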
Template-Based Static Posterior Inference for Bayesian Probabilistic Programming
In Bayesian probabilistic programming, a central problem is to estimate the
normalised posterior distribution (NPD) of a probabilistic program with
conditioning. Prominent approximate approaches to address this problem include
Markov chain Monte Carlo and variational inference, but neither can generate
guaranteed outcomes within limited time. Moreover, most existing formal
approaches that perform exact inference for NPD are restricted to programs with
closed-form solutions or bounded loops/recursion. A recent work (Beutner et
al., PLDI 2022) derived guaranteed bounds for NPD over programs with unbounded
recursion. However, as this approach requires recursion unrolling, it suffers
from the path explosion problem. Furthermore, previous approaches do not
consider score-recursive probabilistic programs that allow score statements
inside loops, a non-trivial setting that requires careful treatment to ensure the
integrability of the normalising constant in NPD.
In this work, we propose a novel automated approach to derive bounds for NPD
via polynomial templates. Our approach can handle probabilistic programs with
unbounded while loops and continuous distributions with infinite supports. The
novelties in our approach are threefold. First, we use polynomial templates to
circumvent the path explosion problem caused by recursion unrolling. Second, we
derive a novel multiplicative variant of the Optional Stopping Theorem that
addresses the integrability issue in score-recursive programs. Third, to
increase the accuracy of the bounds derived via polynomial templates, we
propose a novel truncation technique that restricts a program to a bounded
range of program values. Experiments over a wide range of benchmarks
demonstrate that our approach is time-efficient and can derive bounds for NPD
that are comparable with (or tighter than) the recursion-unrolling approach
(Beutner et al., PLDI 2022).
Barrier Frank-Wolfe for Marginal Inference
We introduce a globally-convergent algorithm for optimizing the
tree-reweighted (TRW) variational objective over the marginal polytope. The
algorithm is based on the conditional gradient method (Frank-Wolfe) and moves
pseudomarginals within the marginal polytope through repeated maximum a
posteriori (MAP) calls. This modular structure enables us to leverage black-box
MAP solvers (both exact and approximate) for variational inference, and obtains
more accurate results than tree-reweighted algorithms that optimize over the
local consistency relaxation. Theoretically, we bound the sub-optimality for
the proposed algorithm despite the TRW objective having unbounded gradients at
the boundary of the marginal polytope. Empirically, we demonstrate the
increased quality of results found by tightening the relaxation over the
marginal polytope as well as the spanning tree polytope on synthetic and
real-world instances.
Comment: 25 pages, 12 figures. To appear in Neural Information Processing
Systems (NIPS) 2015. Corrected reference and cleaned up bibliograph
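The conditional gradient (Frank-Wolfe) step that the abstract builds on can be sketched generically. The example below is not the paper's algorithm: it minimises a simple quadratic over the probability simplex rather than the TRW objective over the marginal polytope, and the names `frank_wolfe_simplex`, `grad` and `target` are hypothetical. The linear minimisation oracle, here a plain argmin over coordinates, plays the role that a MAP solver call would play in the marginal-inference setting.

```python
def frank_wolfe_simplex(grad, x0, steps=2000):
    """Minimise a smooth convex function over the probability simplex.

    grad: function returning the gradient at x as a list of floats.
    x0:   feasible starting point (non-negative entries summing to 1).
    """
    x = list(x0)
    for k in range(steps):
        g = grad(x)
        # Linear minimisation oracle: over the simplex, the minimiser of
        # <g, s> is the vertex at the coordinate with the smallest gradient.
        best = min(range(len(x)), key=lambda j: g[j])
        gamma = 2.0 / (k + 2.0)    # standard diminishing step size
        # Convex combination of the iterate and the vertex stays feasible.
        x = [(1.0 - gamma) * xj for xj in x]
        x[best] += gamma
    return x

# Toy objective (an assumption): squared distance to a point in the simplex.
target = [0.2, 0.5, 0.3]

def grad(x):
    return [2.0 * (xj - tj) for xj, tj in zip(x, target)]

x = frank_wolfe_simplex(grad, [1.0, 0.0, 0.0])
```

Each iterate is a convex combination of feasible points, so no projection is needed; this is the property that lets the method work with a black-box oracle.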
Automatic Backward Filtering Forward Guiding for Markov processes and graphical models
We incorporate discrete and continuous time Markov processes as building
blocks into probabilistic graphical models with latent and observed variables.
We introduce the automatic Backward Filtering Forward Guiding (BFFG) paradigm
(Mider et al., 2020) for programmable inference on latent states and model
parameters. Our starting point is a generative model, a forward description of
the probabilistic process dynamics. We backpropagate the information provided
by observations through the model to transform the generative (forward) model
into a pre-conditional model guided by the data. This model approximates the actual
conditional model, with a known likelihood ratio between the two. The backward
filter and the forward change of measure are suitable to be incorporated into a
probabilistic programming context because they can be formulated as a set of
transformation rules.
The guided generative model can be incorporated in different approaches to
efficiently sample latent states and parameters conditional on observations. We
show applicability in a variety of settings, including Markov chains with
discrete state space, interacting particle systems, state space models,
branching diffusions and Gamma processes.