14,192 research outputs found
Reduction of Markov Chains using a Value-of-Information-Based Approach
In this paper, we propose an approach to obtain reduced-order models of
Markov chains. Our approach is composed of two information-theoretic processes.
The first is a means of comparing pairs of stationary chains on different state
spaces, which is done via the negative Kullback-Leibler divergence defined on a
model joint space. Model reduction is achieved by solving a
value-of-information criterion with respect to this divergence. Optimizing the
criterion leads to a probabilistic partitioning of the states in the high-order
Markov chain. A single free parameter that emerges through the optimization
process dictates both the partition uncertainty and the number of state groups.
We provide a data-driven means of choosing the `optimal' value of this free
parameter, which sidesteps needing to a priori know the number of state groups
in an arbitrary chain.Comment: Submitted to Entrop
Non-Reversible Parallel Tempering: a Scalable Highly Parallel MCMC Scheme
Parallel tempering (PT) methods are a popular class of Markov chain Monte
Carlo schemes used to sample complex high-dimensional probability
distributions. They rely on a collection of interacting auxiliary chains
targeting tempered versions of the target distribution to improve the
exploration of the state-space. We provide here a new perspective on these
highly parallel algorithms and their tuning by identifying and formalizing a
sharp divide in the behaviour and performance of reversible versus
non-reversible PT schemes. We show theoretically and empirically that a class
of non-reversible PT methods dominates its reversible counterparts and identify
distinct scaling limits for the non-reversible and reversible schemes, the
former being a piecewise-deterministic Markov process and the latter a
diffusion. These results are exploited to identify the optimal annealing
schedule for non-reversible PT and to develop an iterative scheme approximating
this schedule. We provide a wide range of numerical examples supporting our
theoretical and methodological contributions. The proposed methodology is
applicable to sample from a distribution with a density with respect
to a reference distribution and compute the normalizing constant. A
typical use case is when is a prior distribution, a likelihood
function and the corresponding posterior.Comment: 74 pages, 30 figures. The method is implemented in an open source
probabilistic programming available at
https://github.com/UBC-Stat-ML/blangSD
- …