Formally justified and modular Bayesian inference for probabilistic programs
Probabilistic modelling offers a simple and coherent framework to describe the
real world in the face of uncertainty. Furthermore, by applying Bayes' rule
it is possible to use probabilistic models to make inferences about the state of
the world from partial observations. While traditionally probabilistic models
were constructed on paper, more recently the approach of probabilistic
programming enables users to write the models in executable languages resembling
computer programs and to freely mix them with deterministic code.
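The Bayesian updating described above can be made concrete with a minimal sketch, not drawn from the dissertation: a discrete prior over a hypothesis (here, whether a coin is fair or biased) updated by the likelihood of an observation. All names and numbers are illustrative.

```python
# A minimal sketch of Bayes' rule on a discrete hypothesis space.
# Hypotheses and probabilities are illustrative, not from the dissertation.
def bayes_update(prior, likelihood):
    """prior: dict state -> P(state); likelihood: dict state -> P(obs | state)."""
    unnorm = {s: prior[s] * likelihood[s] for s in prior}
    z = sum(unnorm.values())  # marginal probability of the observation
    return {s: p / z for s, p in unnorm.items()}

prior = {"fair": 0.5, "biased": 0.5}        # P(state)
likelihood = {"fair": 0.5, "biased": 0.9}   # P(heads | state)
posterior = bayes_update(prior, likelihood)
# Observing heads shifts belief toward the biased coin.
```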
It has long been recognised that the semantics of programming languages is
complicated and the intuitive understanding that programmers have is often
inaccurate, resulting in difficult-to-understand bugs and unexpected program
behaviours. Programming languages are therefore studied in a rigorous way using
formal languages with mathematically defined semantics. Traditionally formal
semantics of probabilistic programs are defined using exact inference results,
but in practice exact Bayesian inference is not tractable and approximate
methods are used instead, posing a question of how the results of these
algorithms relate to the exact results. Correctness of such approximate methods
is usually argued somewhat less rigorously, without reference to a formal
semantics.
In this dissertation we formally develop denotational semantics for
probabilistic programs that correspond to popular sampling algorithms often used
in practice. The semantics is defined for an expressive typed lambda calculus
with higher-order functions and inductive types, extended with probabilistic
effects for sampling and conditioning, allowing continuous distributions and
unbounded likelihoods. It makes crucial use of the recently developed formalism
of quasi-Borel spaces to bring all these elements together. We provide semantics
corresponding to several variants of Markov chain Monte Carlo and Sequential
Monte Carlo methods and formally prove a notion of correctness for these
algorithms in the context of probabilistic programming.
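As a rough, self-contained illustration of the Sequential Monte Carlo family mentioned above (not the dissertation's quasi-Borel construction), the following sketch runs a particle filter on a toy Gaussian model; the model, parameters, and resampling scheme are assumptions of this example.

```python
import math
import random

# Hedged sketch of Sequential Monte Carlo (particle filtering) on a toy
# Gaussian model; purely illustrative.
def smc(observations, n_particles=1000, seed=0):
    rng = random.Random(seed)
    # Draw initial particles from a standard normal prior.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    for y in observations:
        # Weight each particle by the likelihood N(y | x, 1), up to a constant.
        weights = [math.exp(-0.5 * (y - x) ** 2) for x in particles]
        # Multinomial resampling proportional to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        # Propagate with a small random-walk transition.
        particles = [x + rng.gauss(0.0, 0.1) for x in particles]
    return particles

posterior = smc([1.0, 1.2, 0.9])
mean = sum(posterior) / len(posterior)  # posterior mean estimate
```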
We also show that the semantic construction can be directly mapped to an
implementation using established functional programming abstractions called
monad transformers. We develop a compact Haskell library for probabilistic
programming closely corresponding to the semantic construction, giving users a
high level of assurance in the correctness of the implementation. We also
demonstrate on a collection of benchmarks that the library offers performance
competitive with existing systems of similar scope.
An important property of our construction, both the semantics and the
implementation, is the high degree of modularity it offers. All the inference
algorithms are constructed by combining small building blocks in a setup where
the type system ensures correctness of compositions. We show that with basic
building blocks corresponding to vanilla Metropolis-Hastings and Sequential
Monte Carlo we can implement more advanced algorithms known in the literature,
such as Resample-Move Sequential Monte Carlo, Particle Marginal
Metropolis-Hastings, and Sequential Monte Carlo squared. These implementations
are very concise, reducing the effort required to produce them and the scope for
bugs. On top of that, our modular construction enables in some cases
deterministic testing of randomised inference algorithms, further increasing
reliability of the implementation.
Engineering and Physical Sciences Research Council, Cambridge Trust, Cambridge-Tuebingen programme
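To give a flavour of the "vanilla Metropolis-Hastings" building block named above, here is a generic random-walk sketch in Python (the dissertation's library is in Haskell; this target density, proposal, and seed are assumptions of the example).

```python
import math
import random

# Hedged sketch of vanilla random-walk Metropolis-Hastings over one real
# parameter; illustrative only, not the dissertation's Haskell library.
def metropolis_hastings(log_density, init, steps, scale=1.0, seed=0):
    rng = random.Random(seed)
    x, lx = init, log_density(init)
    samples = []
    for _ in range(steps):
        prop = x + rng.gauss(0.0, scale)      # symmetric proposal
        lp = log_density(prop)
        if math.log(rng.random()) < lp - lx:  # accept with prob min(1, ratio)
            x, lx = prop, lp
        samples.append(x)
    return samples

# Target: standard normal log-density, up to an additive constant.
samples = metropolis_hastings(lambda x: -0.5 * x * x, init=0.0, steps=20000)
mean = sum(samples) / len(samples)
```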
Comparative analysis of React, Next and Gatsby programming frameworks for creating SPA applications
This article presents a performance analysis of some of the most popular development frameworks based on the React library. The aim of the study was to show which of the technologies used to create the visual parts of web applications is the most efficient. The research was conducted using three applications with the same content but based on the above-mentioned frontend technologies. To evaluate performance, web browser development tools and the React library were used, which showed that vanilla React is the most efficient for rendering pages with a lot of data.
Deep Probabilistic Surrogate Networks for Universal Simulator Approximation
We present a framework for automatically structuring and training fast,
approximate, deep neural surrogates of existing stochastic simulators. Unlike
traditional approaches to surrogate modeling, our surrogates retain the
interpretable structure of the reference simulators. The particular way we
achieve this allows us to replace the reference simulator with the surrogate
when undertaking amortized inference in the probabilistic programming sense.
The fidelity and speed of our surrogates allow for not only faster "forward"
stochastic simulation but also for accurate and substantially faster inference.
We support these claims via experiments that involve a commercial
composite-materials curing simulator. Employing our surrogate modeling
technique makes inference an order of magnitude faster, opening up the
possibility of doing simulator-based, non-invasive, just-in-time parts quality
testing; in this case inferring safety-critical latent internal temperature
profiles of composite materials undergoing curing from surface temperature
profile measurements.
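The core idea of surrogate modelling can be sketched generically (this is not the paper's deep neural surrogate): fit a cheap model to input/output pairs collected from a reference simulator, then query the surrogate in place of the simulator. The simulator and fitting method here are stand-in assumptions.

```python
import random

# Hedged, generic illustration of surrogate modelling: replace an "expensive"
# simulator with a cheap fitted model. Stand-in simulator, not the paper's.
def simulator(x):
    return 3.0 * x + 2.0  # placeholder for an expensive simulation

# Collect a small training set from the reference simulator.
rng = random.Random(0)
xs = [rng.uniform(0.0, 1.0) for _ in range(100)]
ys = [simulator(x) for x in xs]

# Fit a linear surrogate by least squares.
mx = sum(xs) / len(xs)
my = sum(ys) / len(ys)
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

def surrogate(x):
    return slope * x + intercept  # cheap drop-in for simulator(x)
```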
Video Killed the HD-Map: Predicting Driving Behavior Directly From Drone Images
The development of algorithms that learn behavioral driving models using
human demonstrations has led to increasingly realistic simulations. In general,
such models learn to jointly predict trajectories for all controlled agents by
exploiting road context information such as drivable lanes obtained from
manually annotated high-definition (HD) maps. Recent studies show that these
models can greatly benefit from increasing the amount of human data available
for training. However, the manual annotation of HD maps which is necessary for
every new location puts a bottleneck on efficiently scaling up human traffic
datasets. We propose a drone birdview image-based map (DBM) representation that
requires minimal annotation and provides rich road context information. We
evaluate multi-agent trajectory prediction using the DBM by incorporating it
into a differentiable driving simulator as an image-texture-based
differentiable rendering module. Our results demonstrate competitive
multi-agent trajectory prediction performance when using our DBM representation
as compared to models trained with rasterized HD maps.
Amortized Rejection Sampling in Universal Probabilistic Programming
Existing approaches to amortized inference in probabilistic programs with
unbounded loops can produce estimators with infinite variance. An instance of
this is importance sampling inference in programs that explicitly include
rejection sampling as part of the user-programmed generative procedure. In this
paper we develop a new and efficient amortized importance sampling estimator.
We prove finite variance of our estimator and empirically demonstrate our
method's correctness and efficiency compared to existing alternatives on
generative programs containing rejection sampling loops, and discuss how to
implement our method in a generic probabilistic programming framework.
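For context on the kind of program the abstract refers to (not the paper's estimator), here is a minimal generative procedure with an explicit rejection-sampling loop: the loop may run an unbounded number of times, yet it defines a simple marginal, here a truncated uniform. The distribution and threshold are assumptions of this sketch.

```python
import random

# Hedged sketch of a generative program with an explicit rejection-sampling
# loop, as discussed in the abstract; the model itself is illustrative.
def rejection_program(rng):
    """Sample x ~ Uniform(0, 1) until x < 0.3; return the accepted draw."""
    while True:  # unbounded loop: the number of iterations is random
        x = rng.random()
        if x < 0.3:
            return x

rng = random.Random(1)
draws = [rejection_program(rng) for _ in range(50000)]
# Accepted draws are Uniform(0, 0.3), so the sample mean should be near 0.15.
mean = sum(draws) / len(draws)
```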
Fabular: regression formulas as probabilistic programming
Regression formulas are a domain-specific language adopted by several R packages for describing an important and useful class of statistical models: hierarchical linear regressions. Formulas are succinct, expressive, and clearly popular, so are they a useful addition to probabilistic programming languages? And what do they mean? We propose a core calculus of hierarchical linear regression, in which regression coefficients are themselves defined by nested regressions (unlike in R). We explain how our calculus captures the essence of the formula DSL found in R. We describe the design and implementation of Fabular, a version of the Tabular schema-driven probabilistic programming language, enriched with formulas based on our regression calculus. To the best of our knowledge, this is the first formal description of the core ideas of R's formula notation, the first development of a calculus of regression formulas, and the first demonstration of the benefits of composing regression formulas and latent variables in a probabilistic programming language.
Adam Ścibior received travel support from the DARPA PPAML programme. Marcin Szymczak was supported by Microsoft Research through its PhD Scholarship Programme.
This is the author accepted manuscript. The final version is available from the Association for Computing Machinery via http://dx.doi.org/10.1145/2837614.283765
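To illustrate what an R-style formula denotes (this toy parser is an assumption of the example, not Fabular's calculus), the formula `y ~ 1 + x` can be read as naming a response vector and a design matrix with an intercept column and an `x` column:

```python
# Hedged sketch: interpreting an R-style formula such as "y ~ 1 + x" as a
# response vector plus a design matrix. Illustrative parsing, not Fabular's.
def design_matrix(formula, data):
    lhs, rhs = [s.strip() for s in formula.split("~")]
    terms = [t.strip() for t in rhs.split("+")]
    rows = []
    for i in range(len(data[lhs])):
        # "1" denotes the intercept column; other terms index data columns.
        rows.append([1.0 if t == "1" else data[t][i] for t in terms])
    return data[lhs], rows

data = {"y": [1.0, 3.0, 5.0], "x": [0.0, 1.0, 2.0]}
y, X = design_matrix("y ~ 1 + x", data)
# X is [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]: intercept column plus x column.
```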