Search CORE

316 research outputs found

Advances in Probabilistic Deep Learning

Author: Habib Raza
Publication venue: UCL (University College London)
Publication date: 28/12/2022

This thesis is concerned with methodological advances in probabilistic inference and their application to core challenges in machine perception and AI. Inferring a posterior distribution over the parameters of a model given some data is a central challenge that occurs in many fields ranging from finance and artificial intelligence to physics. Exact calculation is impossible in all but the simplest cases, and a rich field of approximate inference has been developed to tackle this challenge. This thesis develops both an advance in approximate inference and an application of these methods to the problem of speech synthesis. In the first section of this thesis we develop a novel framework for constructing Markov Chain Monte Carlo (MCMC) kernels that can efficiently sample from high-dimensional distributions, such as the posteriors that frequently occur in machine perception. We provide a specific instance of this framework and demonstrate that it can match or exceed the performance of Hamiltonian Monte Carlo without requiring gradients of the target distribution. In the second section of the thesis we focus on the application of approximate inference techniques to the task of synthesising human speech from text. By using advances in neural variational inference we are able to construct a state-of-the-art speech synthesis system in which it is possible to control aspects of prosody, such as emotional expression, from significantly less supervised data than previously existing state-of-the-art methods.

UCL Discovery

Multilevel Delayed Acceptance MCMC with Applications to Hydrogeological Inverse Problems

Author: Lykkegaard M
Publication venue: 'Illuminating Engineering Society of Japan'
Publication date: 30/08/2022

Quantifying the uncertainty of model predictions is a critical task for engineering decision support systems. This is a particularly challenging effort in the context of statistical inverse problems, where the model parameters are unknown or poorly constrained, and where the data is often scarce. Many such problems emerge in the fields of hydrology and hydro-environmental engineering in general, and in hydrogeology in particular. While methods for rigorously quantifying the uncertainty of such problems exist, they are often prohibitively computationally expensive, particularly when the forward model is high-dimensional and expensive to evaluate. In this thesis, I present a Metropolis-Hastings algorithm, namely the Multilevel Delayed Acceptance (MLDA) algorithm, which exploits a hierarchy of forward models of increasing computational cost to significantly reduce the total cost of quantifying the uncertainty of high-dimensional, expensive forward models. The algorithm is shown to be in detailed balance with the posterior distribution of parameters, and the computational gains of the algorithm are demonstrated on multiple examples. Additionally, I present an approach for exploiting a deep neural network as an ultra-fast model approximation in an MLDA model hierarchy. This method is demonstrated in the context of both 2D and 3D groundwater flow modelling. Finally, I present a novel approach to adaptive optimal design of groundwater surveying, in which MLDA is employed to construct the posterior Monte Carlo estimates. This method utilises the posterior uncertainty of the primary problem in conjunction with the expected solution to an adjoint problem to sequentially determine the optimal location of the next datapoint. Funders: Engineering and Physical Sciences Research Council (EPSRC); Alan Turing Institute.
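As a rough illustration of the delayed-acceptance idea that MLDA builds on, the sketch below implements a plain two-level delayed-acceptance Metropolis-Hastings sampler with a symmetric random-walk proposal. It is not the thesis's MLDA algorithm (which, per the abstract, uses a full hierarchy of models); the function name, the 1D state, and the parameters `step` and `seed` are assumptions made for this example.

```python
import math
import random

def delayed_acceptance_mh(log_fine, log_coarse, x0, n_steps, step=0.5, seed=0):
    """Two-level delayed-acceptance MH on a 1D state (illustrative sketch).

    log_fine:   log-density of the expensive target (up to a constant)
    log_coarse: log-density of a cheap approximation (up to a constant)
    Returns the chain and the number of fine-model evaluations.
    """
    rng = random.Random(seed)
    x = x0
    lf_x, lc_x = log_fine(x), log_coarse(x)
    fine_evals = 1
    samples = []
    for _ in range(n_steps):
        # Symmetric Gaussian random-walk proposal.
        y = x + rng.gauss(0.0, step)
        lc_y = log_coarse(y)
        # Stage 1: screen the proposal with the cheap model only.
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < lc_y - lc_x:
            lf_y = log_fine(y)
            fine_evals += 1
            # Stage 2: correct with the fine/coarse ratio so the chain
            # still targets the fine density.
            if math.log(1.0 - rng.random()) < (lf_y - lf_x) - (lc_y - lc_x):
                x, lf_x, lc_x = y, lf_y, lc_y
        samples.append(x)
    return samples, fine_evals
```

The expensive model is evaluated only for proposals that survive the cheap first stage, which is where the computational saving comes from when the coarse model is a reasonable approximation.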

Open Research Exeter

Simulation-based Inference : From Approximate Bayesian Computation and Particle Methods to Neural Density Estimation

Author: Wiqvist Samuel
Publication venue: Lund University (Media-Tryck)
Publication date: 16/08/2021

This doctoral thesis in computational statistics utilizes both Monte Carlo methods (approximate Bayesian computation and sequential Monte Carlo) and machine-learning methods (deep learning and normalizing flows) to develop novel algorithms for inference in implicit Bayesian models. Implicit models are those for which calculating the likelihood function is very challenging (and often impossible), but model simulation is feasible. The inference methods developed in the thesis are simulation-based inference methods since they leverage the possibility to simulate data from the implicit models. Several approaches are considered in the thesis: Papers II and IV focus on classical methods (sequential Monte Carlo-based methods), while Papers I and III focus on more recent machine learning methods (deep learning and normalizing flows, respectively). Paper I constructs novel deep learning methods for learning summary statistics for approximate Bayesian computation (ABC). To achieve this, Paper I introduces the partially exchangeable network (PEN), a deep learning architecture specifically designed for Markovian data (i.e., partially exchangeable data). Paper II considers Bayesian inference in stochastic differential equation mixed-effects models (SDEMEMs). Bayesian inference for SDEMEMs is challenging due to their intractable likelihood function. Paper II addresses this problem by designing a novel Gibbs-blocking strategy in combination with correlated pseudo-marginal methods. The paper also discusses how custom particle filters can be adapted to the inference procedure. Paper III introduces the novel inference method sequential neural posterior and likelihood approximation (SNPLA). SNPLA is a simulation-based inference algorithm that utilizes normalizing flows for learning both the posterior distribution and the likelihood function of an implicit model via a sequential scheme. By learning both the likelihood and the posterior, and by leveraging the reverse Kullback-Leibler (KL) divergence, SNPLA avoids ad-hoc correction steps and Markov chain Monte Carlo (MCMC) sampling. Paper IV introduces the accelerated-delayed acceptance (ADA) algorithm. ADA can be viewed as an extension of the delayed-acceptance (DA) MCMC algorithm that leverages connections between the two likelihood ratios of DA to further accelerate MCMC sampling from the posterior distribution of interest, although our approach introduces an approximation. The main case study of Paper IV is a double-well potential stochastic differential equation (DWPSDE) model for protein-folding data (reaction coordinate data).
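To make the notion of simulation-based inference concrete, here is a minimal sketch of the basic ABC rejection sampler, the simplest member of the ABC family mentioned in the abstract. It is not one of the thesis's algorithms: the Gaussian toy simulator, the uniform prior, the hand-picked summary statistic (the sample mean, where the thesis learns summaries with a network), and all function names are assumptions made for illustration.

```python
import random
import statistics

def abc_rejection(observed, simulate, prior_sample, summary, eps, n_draws, seed=0):
    """Basic ABC rejection: keep prior draws whose simulated summary
    lies within eps of the observed summary. No likelihood is evaluated."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)                 # draw a parameter from the prior
        s_sim = summary(simulate(theta, rng))     # simulate data and summarise it
        if abs(s_sim - s_obs) < eps:              # accept if summaries are close
            accepted.append(theta)
    return accepted

# Hypothetical toy model: data are N(theta, 1) with unknown mean theta.
def simulate_gaussian(theta, rng, n=50):
    return [rng.gauss(theta, 1.0) for _ in range(n)]

# Hypothetical prior: theta ~ Uniform(-5, 5).
def uniform_prior(rng):
    return rng.uniform(-5.0, 5.0)
```

A call such as `abc_rejection(observed, simulate_gaussian, uniform_prior, statistics.fmean, eps=0.2, n_draws=5000)` returns approximate posterior draws; shrinking `eps` tightens the approximation at the cost of fewer accepted samples.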

Lund University Publications

Generative Modeling and Inference in Directed and Undirected Neural Networks

Author: Stinson Patrick
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2020

Generative modeling and inference are two broad categories in unsupervised learning whose goal is to answer the following questions, respectively: 1. Given a dataset, how do we (either implicitly or explicitly) model the underlying probability distribution from which the data came and draw samples from that distribution? 2. How can we learn an underlying abstract representation of the data? In this dissertation we provide three studies that each, in a different way, improve upon specific generative modeling and inference techniques. First, we develop a state-of-the-art estimator of a generic probability distribution's partition function, or normalizing constant, during simulated tempering. We then apply our estimator to the specific case of training undirected probabilistic graphical models and find our method able to track log-likelihoods during training at essentially no extra computational cost. We then shift our focus to variational inference in directed probabilistic graphical models (Bayesian networks) for generative modeling and inference. First, we generalize the aggregate prior distribution to decouple the variational and generative models, providing the model with greater flexibility, and find improvements in the model's log-likelihood of test data as well as a better latent representation. Finally, we study the variational loss function and argue that, under a typical architecture, the data-dependent term of the gradient decays to zero as the latent space dimensionality increases. We use this result to propose a simple modification to random weight initialization and show that in certain models the modification gives rise to substantial improvement in training convergence time. Together, these results improve quantitative performance of popular generative modeling and inference models in addition to furthering our understanding of them.

Columbia University Academic Commons

Learning visual representations with neural networks for video captioning and image generation

Author: Yao Li
Publication date: 01/12/2017

The past decade has been marked as a golden era of neural network research. Not only have neural networks been successfully applied to solve more and more challenging real-world problems, but they have also become the dominant approach in many of the places where they have been tested. These places include, for instance, language understanding, game playing, and computer vision, thanks to neural networks' superiority in computational efficiency and statistical capacity. This thesis applies neural networks to problems in computer vision where high-level and semantically meaningful representations play a fundamental role. It demonstrates, both in theory and in experiment, the ability to learn such representations from data with and without supervision. The main content of the thesis is divided into two parts. The first part studies neural networks in the context of learning visual representations for the task of video captioning. Models are developed to dynamically focus on different frames while generating a natural language description of a short video. Such a model is further improved by recurrent convolutional operations. The end of this part identifies fundamental challenges in video captioning and proposes a new type of evaluation metric that may be used experimentally as an oracle to benchmark performance. The second part studies the family of models that generate images. While the first part is supervised, this part is unsupervised. Its focus is the popular family of Neural Autoregressive Density Estimators (NADEs), a tractable probabilistic model for natural images. This work first makes a connection between NADEs and Generative Stochastic Networks (GSNs). The standard NADE is improved by introducing multiple iterations in its inference without increasing the number of parameters; the result is dubbed the iterative NADE. Beginning with a historical review, this work ends with a summary of recent developments related to the contributions of the two main parts, around the central topic of learning visual representations for images and videos. Promising directions for future research are outlined at the end.

Dépôt Institutionnel Numérique

The estimation and application of unnormalized statistical models

Author: Brakel Philémon
Publication venue: Ghent University. Faculty of Engineering and Architecture
Publication date: 01/01/2014

Ghent University Academic Bibliography

Model predictive path integral control: Theoretical foundations and applications to autonomous driving

Author: Williams Grady Robert
Publication venue: Georgia Institute of Technology
Publication date: 20/05/2020

This thesis presents a new approach for stochastic model predictive (optimal) control: model predictive path integral control, which is based on massively parallel sampling of control trajectories. We first show the theoretical foundations of model predictive path integral control, which are based on a combination of path integral control theory and an information theoretic interpretation of stochastic optimal control. We then apply the method to high speed autonomous driving on a 1/5 scale vehicle and analyze the performance and robustness of the method. Extensive experimental results are used to identify and solve key problems relating to robustness of the approach, which leads to a robust stochastic model predictive control algorithm capable of consistently pushing the limits of performance on the 1/5 scale vehicle. Ph.D. thesis.
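The sampling-based flavour of this kind of controller can be sketched as follows: perturb a nominal control sequence with random noise, score each perturbed rollout, and update the nominal sequence with an exponentially weighted average of the perturbations. This is a simplified illustration under stated assumptions, not the thesis's algorithm: the 1D single-integrator dynamics, the quadratic cost, the omission of any control-cost term, and the names `rollout_cost`, `mppi_update`, `sigma`, and `lam` are all hypothetical choices for this example.

```python
import math
import random

def rollout_cost(x0, controls, goal, dt=0.1):
    """Cost of a control sequence on a toy 1D single integrator
    x_{t+1} = x_t + u_t * dt, penalising squared distance to the goal."""
    x, cost = x0, 0.0
    for u in controls:
        x = x + u * dt
        cost += (x - goal) ** 2
    return cost

def mppi_update(x0, controls, goal, n_samples=200, sigma=1.0, lam=1.0, rng=None):
    """One sampling-based update of the nominal control sequence."""
    rng = rng or random.Random(0)
    horizon = len(controls)
    # Sample Gaussian perturbations of the whole control sequence.
    noises = [[rng.gauss(0.0, sigma) for _ in range(horizon)]
              for _ in range(n_samples)]
    # Score every perturbed sequence by simulating it forward.
    costs = [rollout_cost(x0, [u + e for u, e in zip(controls, eps)], goal)
             for eps in noises]
    # Exponential (softmin) weights; subtracting the best cost
    # keeps the exponentials numerically stable.
    best = min(costs)
    weights = [math.exp(-(c - best) / lam) for c in costs]
    total = sum(weights)
    # Move each control by the weighted average of its perturbations.
    return [u + sum(w * eps[t] for w, eps in zip(weights, noises)) / total
            for t, u in enumerate(controls)]
```

In a receding-horizon loop one would apply the first control, shift the sequence forward, and repeat the update. The rollouts are independent of each other, which is what makes this style of method amenable to parallel sampling.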

Scholarly Materials And Research @ Georgia Tech