17 research outputs found
Perfect Sampling of q-Spin Systems on ℤ² via Weak Spatial Mixing
We present a perfect marginal sampler of the unique Gibbs measure of a spin
system on ℤ². The algorithm is an adaptation of a previous `lazy
depth-first' approach by the authors, but relaxes the requirement of strong
spatial mixing to weak spatial mixing. It exploits a classical result in
statistical physics relating weak spatial mixing on ℤ² to strong spatial
mixing on squares. When the spin system exhibits weak spatial mixing, the
run-time of our sampler is linear in the size of the sample. Notable
applications are the ferromagnetic Potts model at supercritical temperatures,
and the ferromagnetic Ising model with a consistent non-zero external field at
any non-zero temperature.
Algorithms and complexity for approximately counting hypergraph colourings and related problems
The past decade has witnessed advances in the design of efficient algorithms for approximating the number of solutions to constraint satisfaction problems (CSPs), especially in the local lemma regime. However, the phase transition for computational tractability is not known. This thesis is dedicated to the prototypical problem of this kind: hypergraph colouring. Parameterised by the number of colours q, the arity of each hyperedge k, and the maximum vertex degree Δ, this problem falls into the regime of the Lovász local lemma when Δ ≲ qᵏ. Previously, however, fast approximate counting algorithms were known only when Δ ≲ qᵏ/³, and no inapproximability result was known. Our contribution towards closing this gap is twofold:
• When q, k ≥ 4 are even and Δ ≥ 5·qᵏ/², approximating the number of hypergraph colourings is NP-hard.
• When the input hypergraph is linear and Δ ≲ qᵏ/², a fast approximate counting algorithm does exist.
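The gap between the algorithmic and hardness regimes above can be made concrete with a quick numeric check. This is an illustrative sketch only: the ≲ notation in the abstract hides constants and polynomial factors that the snippet ignores.

```python
# Illustrative comparison of the degree thresholds for hypergraph
# q-colouring discussed above (constants/polynomial factors omitted).
def thresholds(q, k):
    return {
        "LLL_regime": q ** k,              # local lemma applies when Delta <~ q^k
        "prior_algorithms": q ** (k / 3),  # fast counting was known when Delta <~ q^(k/3)
        "hardness": 5 * q ** (k / 2),      # NP-hard when Delta >= 5 q^(k/2) (q, k even, >= 4)
    }

t = thresholds(q=4, k=4)
print(t)  # prior algorithms reach Delta ~ 6.3, hardness begins at Delta = 80
```

Even for small parameters the two regimes leave a wide unresolved band, which is the territory the thesis addresses.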
The random geometry of equilibrium phases
This is a (long) survey about applications of percolation theory in
equilibrium statistical mechanics. The chapters are as follows:
1. Introduction
2. Equilibrium phases
3. Some models
4. Coupling and stochastic domination
5. Percolation
6. Random-cluster representations
7. Uniqueness and exponential mixing from non-percolation
8. Phase transition and percolation
9. Random interactions
10. Continuum models
Contributions to MCMC Methods in Constrained Domains with Applications to Neuroimaging
Markov chain Monte Carlo (MCMC) methods form a rich class of computational techniques for obtaining samples from target distributions when direct sampling is not possible or when closed forms are intractable. Over the years, MCMC methods have been used in innumerable situations due to their flexibility and generalizability, even in situations involving nonlinear and/or highly parametrized models. In this dissertation, two major works relating to MCMC methods are presented.
The first involves the development of a method to identify the number and directions of nerve fibers using diffusion-weighted MRI measurements. For this, the biological problem is first formulated as a model selection and estimation problem. Using the framework of reversible jump MCMC, a novel Bayesian scheme is proposed that performs both of the above tasks simultaneously using customizable priors and proposal distributions. The proposed method allows users to set a prior level of spatial separation between the nerve fibers, allowing more crossing paths to be detected when desired, or fewer, to detect only robust nerve tracts. Hence, estimation that is specific to a given region of interest within the brain can be performed. In simulated examples, the method has been shown to resolve up to four fibers even in instances of highly noisy data. Comparative analysis with other state-of-the-art methods on in-vivo data showed the method's ability to detect more crossing nerve fibers.
The second work involves the construction of an MCMC algorithm that efficiently performs (Bayesian) sampling of parameters with support constraints. The method works by embedding a transformation called inversion in a sphere within the Metropolis-Hastings sampler. This creates an image of the constrained support that is amenable to sampling using standard proposals such as Gaussians. The proposed strategy is tested on three domains: the standard simplex, a sector of an n-sphere, and hypercubes. In each domain, a comparison is made with existing sampling techniques.
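The inversion-in-a-sphere map underlying the second contribution can be sketched as follows. This is a minimal illustration, not the dissertation's exact construction: the centre `c` and radius `r` are free parameters here, and the Metropolis-Hastings embedding (with its Jacobian correction) is omitted. The snippet checks numerically that the map is an involution and that it sends points near the centre far away, turning a bounded support into an unbounded image.

```python
import numpy as np

def sphere_inversion(x, c, r):
    """Inversion in the sphere of centre c and radius r:
    x -> c + r^2 (x - c) / ||x - c||^2."""
    d = x - c
    return c + (r ** 2) * d / np.dot(d, d)

c = np.zeros(3)
x = np.array([0.2, 0.1, -0.3])     # a point inside the unit sphere
y = sphere_inversion(x, c, r=1.0)  # its image lies outside the sphere
# Inversion is an involution: applying it twice returns the original point.
assert np.allclose(sphere_inversion(y, c, 1.0), x)
```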
Probabilistic Inference in Piecewise Graphical Models
In many applications of probabilistic inference the models
contain piecewise densities that are differentiable except at
partition boundaries. For instance, (1) some models may
intrinsically have finite support, being constrained to some
regions; (2) arbitrary density functions may be approximated by
mixtures of piecewise functions such as piecewise polynomials or
piecewise exponentials; (3) distributions derived from other
distributions (via random variable transformations) may be highly
piecewise; (4) in applications of Bayesian inference such as
Bayesian discrete classification and preference learning, the
likelihood functions may be piecewise; (5) context-specific
conditional probability density functions (tree-CPDs) are
intrinsically piecewise; (6) influence diagrams (generalizations
of Bayesian networks in which along with probabilistic inference,
decision making problems are modeled) are in many applications
piecewise; (7) in probabilistic programming, conditional
statements lead to piecewise models. As we will show, exact
inference on piecewise models is often not scalable (when
applicable at all), and the performance of existing approximate
inference techniques on such models is usually quite poor.
This thesis fills this gap by presenting scalable and accurate
algorithms for inference in piecewise probabilistic graphical
models. Our first contribution is a variation of the Gibbs
sampling algorithm that achieves an exponential sampling speedup
on a large class of models (including Bayesian models with
piecewise likelihood functions). As a second contribution, we
show that for a large range of models, the time-consuming Gibbs
sampling computations that are traditionally carried out per
sample, can be computed symbolically, once and prior to the
sampling process. Among many potential applications, the
resulting symbolic Gibbs sampler can be used for fully automated
reasoning in the presence of deterministic constraints among
random variables. As a third contribution, we are motivated by
the behavior of Hamiltonian dynamics in optics, in particular the
reflection and refraction of light at refractive surfaces, to
present a new Hamiltonian Monte Carlo method that demonstrates
significantly improved performance on piecewise models.
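The optics-inspired idea can be illustrated with the momentum-reflection step applied when a Hamiltonian trajectory crosses a discontinuity boundary. This is a minimal sketch of the reflection case only; the thesis' method also handles refraction, which changes the momentum magnitude across the boundary.

```python
import numpy as np

def reflect_momentum(p, n):
    """Reflect momentum p off a boundary with unit normal n,
    like light bouncing off a mirror: p' = p - 2 (p . n) n."""
    n = n / np.linalg.norm(n)
    return p - 2.0 * np.dot(p, n) * n

p = np.array([1.0, -2.0])
n = np.array([0.0, 1.0])        # boundary normal pointing "up"
p_new = reflect_momentum(p, n)
# Reflection flips the normal component, keeps the tangential component,
# and preserves kinetic energy (|p| is unchanged).
assert np.allclose(p_new, [1.0, 2.0])
```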
Hopefully, the present work represents a step towards scalable
and accurate inference in an important class of probabilistic
models that has largely been overlooked in the literature.
Exploring Probability Measures with Markov Processes
In many domains where mathematical modelling is applied, a deterministic description of the system at hand is insufficient, and so it is useful to model systems as being in some way stochastic. This is often achieved by modelling the state of the system as being drawn from a probability measure, which is usually given algebraically, i.e. as a formula. While this representation can be useful for deriving certain characteristics of the system, it is by now well appreciated that many questions about stochastic systems are best answered by looking at samples from the associated probability measure. In this thesis, we seek to develop and analyse efficient techniques for generating samples from a given probability measure, with a focus on algorithms which simulate a Markov process with the desired invariant measure.
The first work presented in this thesis considers the use of Piecewise-Deterministic Markov Processes (PDMPs) for generating samples. In contrast to usual approaches, PDMPs are i) defined as continuous-time processes, and ii) are typically non-reversible with respect to their invariant measure. These distinctions pose computational and theoretical challenges for the design, analysis, and implementation of PDMP-based samplers. The key contribution of this work is to develop a transparent characterisation of how one can construct a PDMP (within the class of trajectorially-reversible processes) which admits the desired invariant measure, and to offer actionable recommendations on how these processes should be designed in practice.
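As a concrete toy instance of a PDMP sampler (a standard textbook example, not the construction developed in the thesis), the one-dimensional Zig-Zag process targeting a standard Gaussian moves with velocity ±1 and flips its velocity at the events of an inhomogeneous Poisson process with rate max(0, v·x). For this target the event times can be simulated exactly by inverting the integrated rate, which illustrates both the continuous-time and the non-reversible character of these samplers.

```python
import math, random

def zigzag_gaussian(n_events, seed=0):
    """1D Zig-Zag process targeting N(0,1): constant velocity v = +/-1,
    velocity flips at rate max(0, v*x). Returns time-averages of x and
    x^2 along the trajectory, which estimate E[X] and E[X^2]."""
    rng = random.Random(seed)
    x, v = 0.0, 1.0
    total_t = sx = sx2 = 0.0
    for _ in range(n_events):
        a = v * x
        e = -math.log(1.0 - rng.random())    # Exp(1) variate
        # Invert the integrated rate: if a >= 0, solve a*T + T^2/2 = e;
        # if a < 0, the rate is zero until t = -a, then (T + a)^2 / 2 = e.
        t = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)
        # accumulate exact segment integrals of x and x^2 along the path
        sx += x * t + v * t * t / 2.0
        sx2 += x * x * t + x * v * t * t + t ** 3 / 3.0
        total_t += t
        x += v * t
        v = -v                               # flip velocity at the event
    return sx / total_t, sx2 / total_t

mean, second_moment = zigzag_gaussian(200_000)
# Ergodic averages approach E[X] = 0 and E[X^2] = 1 as n_events grows.
```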
The second work presented in this thesis considers the task of sampling from a probability measure on a discrete space. While work in recent years has made it possible to apply sampling algorithms to probability measures with differentiable densities on continuous spaces in a reasonably generic way, samplers on discrete spaces are still largely derived on a case-by-case basis. The contention of this work is that this is not necessary, and that one can in fact define quite generally-applicable algorithms which can sample efficiently from discrete probability measures. The contributions are then to propose a small collection of algorithms for this task, and to verify their efficiency empirically. Building on the previous chapter’s work, our samplers are again defined in continuous time and are non-reversible, both of which offer noticeable benefits in efficiency.
The third work presented in this thesis concerns a theoretical study of a particular class of Markov Chain-based sampling algorithms which make use of parallel computing resources. The Markov Chains which are produced by this algorithm are mathematically equivalent to a standard Metropolis-Hastings chain, but their real-time convergence properties are affected nontrivially by the application of parallelism. The contribution of this work is to analyse the convergence behaviour of these chains, and to use the ‘optimal scaling’ framework (as developed by Roberts, Rosenthal, and others) to make recommendations concerning the tuning of such algorithms in practice.
The introductory chapters provide a general overview on the task of generating samples from a probability measure, with particular focus on methods involving Markov processes. There is also an interlude on the relative benefits of i) continuous-time and ii) non-reversible Markov processes for sampling, which is intended to provide additional context for the reading of the first two works.
PhD Studentship paid for by Cantab Capital Institute for the Mathematics of Information
Monte Carlo Methods in Practice and Efficiency Enhancements via Parallel Computation
Monte Carlo methods are crucial when dealing with advanced problems in Bayesian inference. Indeed, common approaches such as Markov chain Monte Carlo (MCMC) and sequential Monte Carlo (SMC) can be endlessly adapted to tackle the most complex problems. What matters, then, is constructing efficient algorithms, and significant attention in the literature is devoted to developing algorithms that mix well, have low computational complexity and can scale up to large datasets. One of the most common and straightforward approaches is to speed up Monte Carlo algorithms by running them in parallel computing environments. The compute time of Monte Carlo algorithms is random and can vary depending on the current state of the Markov chain. Other infrastructure-related factors, such as competing jobs on the same processor or memory bandwidth, which are prevalent in shared computing architectures such as cloud computing, can also affect this compute time. However, many algorithms running in parallel require the processors to communicate every so often, and for that we must ensure that they are simultaneously ready and that any idle wait time is minimised. This can be done by employing a framework known as Anytime Monte Carlo, which imposes a real-time deadline on parallel computations.
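The real-time-deadline idea can be sketched minimally as follows. This is an illustration only: each parallel worker would run a chain until a shared wall-clock deadline so that all workers are ready to communicate at the same real time, whereas the actual Anytime framework additionally corrects the bias that interrupting a chain mid-iteration introduces into its state distribution.

```python
import math, random, time

def mh_until_deadline(logpdf, x0, step, deadline_s, seed=0):
    """Run a random-walk Metropolis chain until a wall-clock deadline,
    returning whatever samples were completed by then."""
    rng = random.Random(seed)
    x, lp = x0, logpdf(x0)
    samples = []
    end = time.monotonic() + deadline_s
    while time.monotonic() < end:
        y = x + rng.gauss(0.0, step)          # Gaussian random-walk proposal
        lpy = logpdf(y)
        if math.log(1.0 - rng.random()) < lpy - lp:  # MH accept/reject
            x, lp = y, lpy
        samples.append(x)
    return samples

# Target: standard normal (log-density up to a constant).
samples = mh_until_deadline(lambda x: -0.5 * x * x, 0.0, 1.0, deadline_s=0.1)
```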
The contributions in this thesis include novel applications of the Anytime framework to construct efficient Anytime MCMC and SMC algorithms which make use of parallel computing in order to perform inference for advanced problems. Examples of such problems investigated include models in which the likelihood cannot be evaluated analytically, and changepoint models, which are often used to model the heterogeneity of sequential data, but are tricky to perform inference on due to the unknown number and locations of the changepoints. This thesis also focuses on the difficult task of performing parameter inference in single-molecule microscopy, a category of models in which the arrival rate of observations is not uniformly distributed and measurement models have complex forms. These issues are exacerbated when molecules have trajectories described by stochastic differential equations.
The original contributions of this thesis are organised in Chapters 4-6. Chapter 4 shows the development of a novel Anytime parallel tempering algorithm and demonstrates the performance enhancements the Anytime framework brings to parallel tempering, an algorithm which runs multiple interacting MCMC chains in order to explore the state space more efficiently. In Chapter 5, a general Anytime SMC sampler is developed for performing changepoint inference using reversible jump MCMC (RJ-MCMC), an algorithm that takes into account the unknown number of changepoints by including transdimensional MCMC updates. The workings of the algorithm are illustrated on a particularly complex changepoint model, and once again the improvements in performance brought by employing the Anytime framework are demonstrated. Chapter 6 moves away from the Anytime framework, and presents a novel and general SMC approach to performing parameter inference for molecules with stochastic trajectories.
Novel sampling techniques for reservoir history matching optimisation and uncertainty quantification in flow prediction
Modern reservoir management has an increasing focus on accurately predicting the likely range of field recoveries. A variety of assisted history matching techniques have been developed across the research community concerned with this topic. These techniques are based on obtaining multiple models that closely reproduce the historical flow behaviour of a reservoir. The resulting set of history-matched models is then used to quantify uncertainty in predicting the future performance of the reservoir and to provide economic evaluations for different field development strategies. The key step in this workflow is to employ algorithms that sample the parameter space in an efficient but appropriate manner. The choice of algorithm affects how fast a model is obtained and how well the model fits the production data. The sampling techniques that have been developed to date include, among others, gradient-based methods, evolutionary algorithms, and the ensemble Kalman filter (EnKF).
This thesis has investigated and further developed the following sampling and inference techniques: Particle Swarm Optimisation (PSO), Hamiltonian Monte Carlo, and Population Markov Chain Monte Carlo. The inspected techniques have the capability of navigating the parameter space and producing history matched models that can be used to quantify the uncertainty in the forecasts in a faster and more reliable way. The analysis of these techniques, compared with Neighbourhood Algorithm (NA), has shown how the different techniques affect the predicted recovery from petroleum systems and the benefits of the developed methods over the NA.
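A minimal particle swarm optimisation loop of the kind adapted in this work can be sketched as follows. This is a generic demonstration on the sphere function, not the reservoir history-matching objective; the inertia and acceleration constants are typical textbook values, not those tuned in the thesis.

```python
import random

def pso(f, dim, n_particles=30, iters=200, seed=0,
        w=0.7, c1=1.5, c2=1.5, span=5.0):
    """Basic PSO: each particle tracks its personal best, the swarm
    tracks a global best, and velocities blend inertia with random
    pulls toward both bests."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-span, span) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [list(x) for x in xs]
    pval = [f(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pval[i])
    gbest, gval = list(pbest[g]), pval[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                xs[i][d] += vs[i][d]
            val = f(xs[i])
            if val < pval[i]:                 # update personal best
                pbest[i], pval[i] = list(xs[i]), val
                if val < gval:                # update global best
                    gbest, gval = list(xs[i]), val
    return gbest, gval

best, val = pso(lambda x: sum(t * t for t in x), dim=3)
```

In a history-matching setting, `f` would be the (very costly) misfit between simulated and observed production data, which is why reducing the number of forward simulation runs matters so much.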
The history matching problem is multi-objective in nature, with the production data possibly consisting of multiple types, coming from different wells, and collected at different times. Multiple objectives can be constructed from these data and explicitly optimised in a multi-objective scheme. The thesis has extended PSO to handle multi-objective history matching problems in which a number of possibly conflicting objectives must be satisfied simultaneously. The benefits and efficiency of the innovative multi-objective particle swarm optimisation scheme (MOPSO) are demonstrated on synthetic reservoirs. It is demonstrated that the MOPSO procedure can provide a substantial improvement in finding a diverse set of good-fitting models with fewer of the very costly forward simulation runs than the standard single-objective case, depending on how the objectives are constructed.
The thesis has also shown how to tackle a large number of unknown parameters through the coupling of high-performance global optimisation algorithms, such as PSO, with model reduction techniques such as kernel principal component analysis (PCA), for parameterising spatially correlated random fields. The results of the PSO-PCA coupling applied to a recent SPE benchmark history matching problem demonstrate that the approach is indeed applicable to practical problems. A comparison of PSO with the EnKF data assimilation method has been carried out and concluded that both methods obtain comparable results on the example case. This reinforces the need for using a range of assisted history matching algorithms for more confidence in predictions.
Learning Inference Models for Computer Vision
Computer vision can be understood as the ability to perform 'inference' on image data. Breakthroughs in computer vision technology are often marked by advances in inference techniques, as even the model design is often dictated by the complexity of inference in them. This thesis proposes learning based inference schemes and demonstrates applications in computer vision. We propose techniques for inference in both generative and discriminative computer vision models.
Despite their intuitive appeal, the use of generative models in vision is hampered by the difficulty of posterior inference, which is often too complex or too slow to be practical. We propose techniques for improving inference in two widely used techniques: Markov Chain Monte Carlo (MCMC) sampling and message-passing inference. Our inference strategy is to learn separate discriminative models that assist Bayesian inference in a generative model. Experiments on a range of generative vision models show that the proposed techniques accelerate the inference process and/or converge to better solutions.
A main complication in the design of discriminative models is the inclusion of prior knowledge in a principled way. For better inference in discriminative models, we propose techniques that modify the original model itself, as inference amounts to simple evaluation of the model. We concentrate on convolutional neural network (CNN) models and propose a generalization of standard spatial convolutions, the basic building blocks of CNN architectures, to bilateral convolutions. First, we generalize the existing use of bilateral filters and then propose new neural network architectures with learnable bilateral filters, which we call `Bilateral Neural Networks'. We show how the bilateral filtering modules can be used to modify existing CNN architectures for better image segmentation, and propose a neural network approach for temporal information propagation in videos. Experiments demonstrate the potential of the proposed bilateral networks on a wide range of vision tasks and datasets.
In summary, we propose learning-based techniques for better inference in several computer vision models ranging from inverse graphics to freely parameterized neural networks. In generative vision models, our inference techniques alleviate some of the crucial hurdles in Bayesian posterior inference, paving new ways for the use of model-based machine learning in vision. In discriminative CNN models, the proposed filter generalizations aid the design of new neural network architectures that can handle sparse, high-dimensional data and provide a way to incorporate prior knowledge into CNNs.
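The classical bilateral filter that these networks generalise can be sketched in its basic form. This is a brute-force 1D version for illustration only; the bilateral convolutions of the thesis use learnable weights and efficient high-dimensional lattice implementations rather than this quadratic loop.

```python
import numpy as np

def bilateral_filter_1d(signal, sigma_s=2.0, sigma_r=0.2):
    """Brute-force 1D bilateral filter: each output value is an average
    of neighbours weighted both by spatial distance and by difference
    in signal value, so noise is smoothed while edges are preserved."""
    n = len(signal)
    idx = np.arange(n)
    out = np.empty(n)
    for i in range(n):
        w = (np.exp(-((idx - i) ** 2) / (2 * sigma_s ** 2))       # spatial weight
             * np.exp(-((signal - signal[i]) ** 2) / (2 * sigma_r ** 2)))  # range weight
        out[i] = np.sum(w * signal) / np.sum(w)
    return out

# A noisy step signal: the filter smooths the noise but keeps the edge sharp,
# because the range weight suppresses averaging across the jump.
rng = np.random.default_rng(0)
step = np.concatenate([np.zeros(50), np.ones(50)]) + 0.05 * rng.normal(size=100)
smoothed = bilateral_filter_1d(step)
```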