    A Stein variational Newton method

    Stein variational gradient descent (SVGD) was recently proposed as a general-purpose nonparametric variational inference algorithm [Liu & Wang, NIPS 2016]: it minimizes the Kullback-Leibler divergence between the target distribution and its approximation by implementing a form of functional gradient descent on a reproducing kernel Hilbert space. In this paper, we accelerate and generalize the SVGD algorithm by including second-order information, thereby approximating a Newton-like iteration in function space. We also show how second-order information can lead to more effective choices of kernel. We observe significant computational gains over the original SVGD algorithm in multiple test cases.
    Comment: 18 pages, 7 figures
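    The functional-gradient iteration that the abstract refers to can be made concrete with a short sketch. The following is a minimal, illustrative implementation of the standard first-order SVGD update (not the Newton-like variant proposed in the paper), assuming an RBF kernel with a fixed bandwidth h and a target whose score function grad_log_p is available in closed form; all function names are placeholders.

```python
import numpy as np

def rbf_kernel_and_grad(X, h=1.0):
    # Pairwise RBF kernel K[j, i] = exp(-||x_j - x_i||^2 / (2 h^2))
    diff = X[:, None, :] - X[None, :, :]                  # diff[j, i] = x_j - x_i
    K = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * h ** 2))
    # sum over j of grad_{x_j} k(x_j, x_i): the "repulsive" term that keeps particles spread out
    grad_K_sum = np.sum(-diff / h ** 2 * K[:, :, None], axis=0)
    return K, grad_K_sum

def svgd_step(X, grad_log_p, step=0.1, h=1.0):
    # One SVGD update: phi(x_i) = (1/n) sum_j [ k(x_j, x_i) * score(x_j) + grad_{x_j} k(x_j, x_i) ]
    K, grad_K_sum = rbf_kernel_and_grad(X, h)
    phi = (K.T @ grad_log_p(X) + grad_K_sum) / X.shape[0]
    return X + step * phi

# Example usage: standard 2-D Gaussian target, whose score is simply -x
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) * 3.0
for _ in range(500):
    X = svgd_step(X, lambda Z: -Z)
```

    The paper's contribution replaces this plain gradient step with a Newton-like update that preconditions phi with second-order information about the target; the sketch above only illustrates the baseline iteration being accelerated.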

    Probabilistic learning and computation in brains and machines

    Humans and animals are able to solve a wide variety of perceptual, decision making and motor tasks with great flexibility. Moreover, behavioural evidence shows that this flexibility extends to situations where accuracy requires the correct treatment of uncertainty induced by noise and ambiguity in the available sensory information as well as noise internal to the brain. It has been suggested that this adequate handling of uncertainty is based on a learned internal model, e.g. in the case of perception, a generative model of sensory observations. Learning latent variable models and performing inference in them is a key challenge for both biological and artificial learning systems. Here, we introduce a new approach to learning in hierarchical latent variable models called the Distributed Distributional Code Helmholtz Machine (DDC-HM), which emphasises flexibility and accuracy in the inferential process. The approximate posterior over unobserved variables is represented implicitly as a set of expectations, corresponding to mean parameters of an exponential family distribution. To train the generative and recognition models we develop an extended wake-sleep algorithm inspired by the original Helmholtz Machine. As a result, the DDC-HM is able to learn hierarchical latent models without having to propagate gradients across different stochastic layers, making our approach biologically appealing. In the second part of the thesis, we review existing proposals for neural representations of uncertainty with a focus on representational and computational flexibility as well as experimental support. Finally, we consider inference and learning in dynamical environment models using Distributed Distributional Codes to represent both the stochastic latent transition model and the inferred posterior distributions. We show that this model makes it possible to generalise successor representations to biologically more realistic, partially observed settings.
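    The core representational idea, a distributed distributional code (DDC) in which a distribution is summarised by the expectations of a fixed set of encoding functions, can be sketched briefly. This is an illustrative example only, assuming a random cosine-feature basis and a least-squares readout; it is not the DDC-HM architecture or its wake-sleep training procedure, and all function names are hypothetical.

```python
import numpy as np

def make_basis(dim, n_features, rng):
    # Fixed random cosine features psi(z); the DDC of a distribution q is the vector E_q[psi(z)]
    W = rng.normal(size=(n_features, dim))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return lambda Z: np.cos(Z @ W.T + b)

def ddc_encode(samples, psi):
    # Represent q implicitly by the expectations (mean parameters) of the basis functions
    return psi(samples).mean(axis=0)

def ddc_decode(ddc, f, psi, train_z):
    # Read out E_q[f(z)]: fit f(z) ~ alpha^T psi(z) by least squares, then evaluate alpha^T E_q[psi]
    Phi = psi(train_z)
    alpha, *_ = np.linalg.lstsq(Phi, f(train_z), rcond=None)
    return ddc @ alpha

# Example usage: q = N(1, 0.5^2); estimate E_q[z^2] (true value 1.25) from the DDC alone
rng = np.random.default_rng(0)
psi = make_basis(dim=1, n_features=200, rng=rng)
ddc = ddc_encode(rng.normal(loc=1.0, scale=0.5, size=(5000, 1)), psi)
train_z = rng.uniform(-3.0, 5.0, size=(2000, 1))
est = ddc_decode(ddc, lambda Z: Z[:, 0] ** 2, psi, train_z)
```

    The point of the sketch is that downstream expectations can be read out linearly from the stored mean parameters, which is what lets the DDC stand in for an explicit posterior during learning and inference.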