316 research outputs found

    Advances in Probabilistic Deep Learning

    This thesis is concerned with methodological advances in probabilistic inference and their application to core challenges in machine perception and AI. Inferring a posterior distribution over the parameters of a model given some data is a central challenge that occurs in many fields, ranging from finance and artificial intelligence to physics. Exact calculation is impossible in all but the simplest cases, and a rich field of approximate inference has been developed to tackle this challenge. This thesis develops both an advance in approximate inference and an application of these methods to the problem of speech synthesis. In the first section of this thesis we develop a novel framework for constructing Markov Chain Monte Carlo (MCMC) kernels that can efficiently sample from high-dimensional distributions such as the posteriors that frequently occur in machine perception. We provide a specific instance of this framework and demonstrate that it can match or exceed the performance of Hamiltonian Monte Carlo without requiring gradients of the target distribution. In the second section of the thesis we focus on the application of approximate inference techniques to the task of synthesising human speech from text. By using advances in neural variational inference we are able to construct a state-of-the-art speech synthesis system in which it is possible to control aspects of prosody, such as emotional expression, from significantly less supervised data than previously existing state-of-the-art methods.

    Multilevel Delayed Acceptance MCMC with Applications to Hydrogeological Inverse Problems

    Quantifying the uncertainty of model predictions is a critical task for engineering decision support systems. This is a particularly challenging effort in the context of statistical inverse problems, where the model parameters are unknown or poorly constrained, and where the data is often scarce. Many such problems emerge in the fields of hydrology and hydro-environmental engineering in general, and in hydrogeology in particular. While methods for rigorously quantifying the uncertainty of such problems exist, they are often prohibitively computationally expensive, particularly when the forward model is high-dimensional and expensive to evaluate. In this thesis, I present a Metropolis-Hastings algorithm, namely the Multilevel Delayed Acceptance (MLDA) algorithm, which exploits a hierarchy of forward models of increasing computational cost to significantly reduce the total cost of quantifying the uncertainty of high-dimensional, expensive forward models. The algorithm is shown to be in detailed balance with the posterior distribution of parameters, and the computational gains of the algorithm are demonstrated on multiple examples. Additionally, I present an approach for exploiting a deep neural network as an ultra-fast model approximation in an MLDA model hierarchy. This method is demonstrated in the context of both 2D and 3D groundwater flow modelling. Finally, I present a novel approach to adaptive optimal design of groundwater surveying, in which MLDA is employed to construct the posterior Monte Carlo estimates. This method utilises the posterior uncertainty of the primary problem in conjunction with the expected solution to an adjoint problem to sequentially determine the optimal location of the next datapoint. Funding: Engineering and Physical Sciences Research Council (EPSRC); Alan Turing Institute.
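
    The abstract above describes screening proposals with a cheap model before paying for an expensive one. A minimal sketch of the classical two-level delayed-acceptance Metropolis-Hastings step is given below. It is not the thesis's MLDA algorithm, which uses a deeper model hierarchy. The functions `log_coarse` and `log_fine`, the Gaussian toy target, and the step size are all illustrative assumptions.

```python
import math
import random


def log_coarse(x):
    # Hypothetical cheap approximation: a Gaussian with a slightly wrong
    # mean and variance (unnormalised log-density).
    return -0.5 * ((x - 0.2) / 1.2) ** 2


def log_fine(x):
    # Hypothetical "expensive" target: standard normal (unnormalised).
    return -0.5 * x * x


def delayed_acceptance(n_steps, step=1.0, seed=0):
    """Two-stage Metropolis-Hastings with a symmetric random-walk proposal."""
    rng = random.Random(seed)
    x = 0.0
    lf_x, lc_x = log_fine(x), log_coarse(x)
    samples, fine_evals = [], 0
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        lc_y = log_coarse(y)
        # Stage 1: screen the proposal using only the cheap model.
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < lc_y - lc_x:
            lf_y = log_fine(y)
            fine_evals += 1
            # Stage 2: correct with the fine model so that the chain
            # targets the fine density rather than the coarse one.
            if math.log(1.0 - rng.random()) < (lf_y - lf_x) - (lc_y - lc_x):
                x, lf_x, lc_x = y, lf_y, lc_y
        samples.append(x)
    return samples, fine_evals
```

    The expensive model is evaluated only for proposals that pass the first stage, which is where the saving comes from. The second-stage ratio divides out the coarse densities, so the coarse model affects efficiency but not the stationary distribution.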

    Simulation-based Inference : From Approximate Bayesian Computation and Particle Methods to Neural Density Estimation

    This doctoral thesis in computational statistics utilizes both Monte Carlo methods (approximate Bayesian computation and sequential Monte Carlo) and machine-learning methods (deep learning and normalizing flows) to develop novel algorithms for inference in implicit Bayesian models. Implicit models are those for which calculating the likelihood function is very challenging (and often impossible), but model simulation is feasible. The inference methods developed in the thesis are simulation-based inference methods, since they leverage the possibility to simulate data from the implicit models. Several approaches are considered in the thesis: Papers II and IV focus on classical methods (sequential Monte Carlo-based methods), while Papers I and III focus on more recent machine learning methods (deep learning and normalizing flows, respectively). Paper I constructs novel deep learning methods for learning summary statistics for approximate Bayesian computation (ABC). To achieve this, Paper I introduces the partially exchangeable network (PEN), a deep learning architecture specifically designed for Markovian data (i.e., partially exchangeable data). Paper II considers Bayesian inference in stochastic differential equation mixed-effects models (SDEMEMs). Bayesian inference for SDEMEMs is challenging due to their intractable likelihood function. Paper II addresses this problem by designing a novel Gibbs-blocking strategy in combination with correlated pseudo-marginal methods. The paper also discusses how custom particle filters can be adapted to the inference procedure. Paper III introduces the novel inference method sequential neural posterior and likelihood approximation (SNPLA). SNPLA is a simulation-based inference algorithm that utilizes normalizing flows for learning both the posterior distribution and the likelihood function of an implicit model via a sequential scheme. By learning both the likelihood and the posterior, and by leveraging the reverse Kullback-Leibler (KL) divergence, SNPLA avoids ad-hoc correction steps and Markov chain Monte Carlo (MCMC) sampling. Paper IV introduces the accelerated-delayed acceptance (ADA) algorithm. ADA can be viewed as an extension of the delayed-acceptance (DA) MCMC algorithm that leverages connections between the two likelihood ratios of DA to further accelerate MCMC sampling from the posterior distribution of interest, although our approach introduces an approximation. The main case study of Paper IV is a double-well potential stochastic differential equation (DWP-SDE) model for protein-folding data (reaction coordinate data).
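
    To make the idea of simulation-based inference concrete, the following is a minimal sketch of plain ABC rejection sampling, the simplest member of the ABC family mentioned above. It does not reproduce PEN, SNPLA, or ADA. The Gaussian simulator, the uniform prior on [-5, 5], the sample-mean summary statistic, and the tolerance `eps` are all assumptions chosen for illustration.

```python
import random


def simulate(theta, n, rng):
    # Hypothetical simulator: n draws from a unit-variance Gaussian with
    # unknown mean theta. Only simulation is used; no likelihood is evaluated.
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary(data):
    # Summary statistic: the sample mean.
    return sum(data) / len(data)


def abc_rejection(observed, n_draws, eps, seed=0):
    """Keep prior draws whose simulated summary lands within eps of the observed one."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw a candidate from the prior
        s_sim = summary(simulate(theta, len(observed), rng))
        if abs(s_sim - s_obs) < eps:
            accepted.append(theta)
    return accepted
```

    The accepted values approximate the posterior of `theta`. A smaller `eps` gives a closer approximation at the price of fewer accepted draws, and the quality of the summary statistic matters a great deal, which is the motivation for learning summaries from data.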

    Learning visual representations with neural networks for video captioning and image generation

    The past decade has been marked as a golden era of neural network research. Not only have neural networks been successfully applied to solve more and more challenging real-world problems, but they have also become the dominant approach in many of the places where they have been tested. These places include, for instance, language understanding, game playing, and computer vision, thanks to neural networks' superiority in computational efficiency and statistical capacity. This thesis applies neural networks to problems in computer vision where high-level and semantically meaningful representations play a fundamental role. It demonstrates, both in theory and in experiment, the ability to learn such representations from data with and without supervision. The main content of the thesis is divided into two parts. The first part studies neural networks in the context of learning visual representations for the task of video captioning. Models are developed to dynamically focus on different frames while generating a natural language description of a short video. Such a model is further improved by recurrent convolutional operations. The end of this part identifies fundamental challenges in video captioning and proposes a new type of evaluation metric that may be used experimentally as an oracle to benchmark performance. The second part studies a family of models that generate images. While the first part is supervised, this part is unsupervised. Its focus is the popular family of Neural Autoregressive Density Estimators (NADEs), a tractable probabilistic model for natural images. This work first makes a connection between NADEs and Generative Stochastic Networks (GSNs). The standard NADE is improved by introducing multiple iterations in its inference without increasing the number of parameters; the resulting model is dubbed iterative NADE. Beginning with a historical view, this work ends with a summary of recent developments related to the work discussed in the two main parts, around the central topic of learning visual representations for images and videos. A bright future is envisioned at the end.
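
    For readers unfamiliar with autoregressive density estimators, the tractability mentioned above comes from the chain-rule factorisation that such models use. The notation below (a $D$-dimensional input $x$ under a fixed ordering of its dimensions) is an assumption for illustration and is not taken from the thesis:

```latex
p(x) = \prod_{d=1}^{D} p\left(x_d \mid x_{<d}\right),
\qquad
\log p(x) = \sum_{d=1}^{D} \log p\left(x_d \mid x_{<d}\right)
```

    Each conditional is produced by a neural network, so the exact log-likelihood is a sum of $D$ terms that can be evaluated directly.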

    The estimation and application of unnormalized statistical models


    Model predictive path integral control: Theoretical foundations and applications to autonomous driving

    This thesis presents a new approach for stochastic model predictive (optimal) control: model predictive path integral control, which is based on massively parallel sampling of control trajectories. We first show the theoretical foundations of model predictive path integral control, which are based on a combination of path integral control theory and an information-theoretic interpretation of stochastic optimal control. We then apply the method to high-speed autonomous driving on a 1/5 scale vehicle and analyze the performance and robustness of the method. Extensive experimental results are used to identify and solve key problems relating to the robustness of the approach, which leads to a robust stochastic model predictive control algorithm capable of consistently pushing the limits of performance on the 1/5 scale vehicle. (Ph.D. thesis)
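
    The sampling-based update at the heart of this family of controllers can be sketched in a few lines. The example below is a simplified illustration under strong assumptions and is not the thesis's algorithm: the system is a hypothetical one-dimensional integrator, the cost is the squared distance to a target with no control-cost term, and the names `rollout_cost`, `mppi_step`, `run_mppi` and the parameters `sigma` and `lam` are invented for this sketch.

```python
import math
import random


def rollout_cost(x0, controls, target, dt):
    # Hypothetical dynamics: x <- x + u * dt, with a quadratic state cost.
    x, cost = x0, 0.0
    for u in controls:
        x += u * dt
        cost += (x - target) ** 2
    return cost


def mppi_step(x0, nominal, target, rng, n_samples=200, sigma=1.0, lam=1.0, dt=0.1):
    """One update: perturb the nominal controls, then average the perturbations
    with weights exp(-cost / lam)."""
    noises = [[rng.gauss(0.0, sigma) for _ in nominal] for _ in range(n_samples)]
    costs = [
        rollout_cost(x0, [u + e for u, e in zip(nominal, eps)], target, dt)
        for eps in noises
    ]
    best = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - best) / lam) for c in costs]
    total = sum(weights)
    return [
        u + sum(w * eps[t] for w, eps in zip(weights, noises)) / total
        for t, u in enumerate(nominal)
    ]


def run_mppi(x0=0.0, target=1.0, horizon=15, n_iters=50, dt=0.1, seed=0):
    """Receding-horizon loop: replan, apply the first control, shift the plan."""
    rng = random.Random(seed)
    x, nominal = x0, [0.0] * horizon
    for _ in range(n_iters):
        nominal = mppi_step(x, nominal, target, rng, dt=dt)
        x += nominal[0] * dt
        nominal = nominal[1:] + [0.0]
    return x
```

    Every sampled trajectory is rolled out independently, which is why this style of controller maps well onto parallel hardware. Here the rollouts run in a plain Python loop for clarity.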