65 research outputs found

    Learning and policy search in stochastic dynamical systems with Bayesian neural networks

    We present an algorithm for policy search in stochastic dynamical systems using model-based reinforcement learning. The system dynamics are described with Bayesian neural networks (BNNs) that include stochastic input variables. These input variables allow us to capture complex statistical patterns in the transition dynamics (e.g. multi-modality and heteroskedasticity), which are usually missed by alternative modeling approaches. After learning the dynamics, our BNNs are fed into an algorithm that performs random roll-outs and uses stochastic optimization for policy learning. We train our BNNs by minimizing α-divergences with α = 0.5, which usually produces better results than other techniques such as variational Bayes. We illustrate the performance of our method by solving a challenging problem where model-based approaches usually fail and by obtaining promising results in real-world scenarios, including the control of a gas turbine and an industrial benchmark.
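For orientation, the training objective mentioned in the abstract can be written out. The abstract does not say which parametrisation of the α-divergence is used; the block below assumes Amari's form, with p the exact posterior over network weights θ and q its approximation (both symbols are assumptions introduced here for illustration).

```latex
D_{\alpha}[p \,\|\, q] = \frac{1}{\alpha(1-\alpha)}
  \left( 1 - \int p(\theta)^{\alpha}\, q(\theta)^{1-\alpha}\, d\theta \right),
\qquad
D_{0.5}[p \,\|\, q] = 4 \left( 1 - \int \sqrt{p(\theta)\, q(\theta)}\; d\theta \right)
  = 2 \int \left( \sqrt{p(\theta)} - \sqrt{q(\theta)} \right)^{2} d\theta .
```

Under this parametrisation the limit α → 0 recovers KL[q ‖ p], the objective of variational Bayes, and α → 1 recovers KL[p ‖ q]; α = 0.5 sits halfway between the two and is symmetric in p and q.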

    Accelerated sampling for the Indian Buffet Process.


    Large scale nonparametric Bayesian inference: data parallelisation in the Indian buffet process

    Nonparametric Bayesian models provide a framework for flexible probabilistic modelling of complex datasets. Unfortunately, the high-dimensional averages required for Bayesian methods can be slow, especially with the unbounded representations used by nonparametric models. We address the challenge of scaling Bayesian inference to the increasingly large datasets found in real-world applications. We focus on parallelisation of inference in the Indian Buffet Process (IBP), which allows data points to have an unbounded number of sparse latent features. Our novel MCMC sampler divides a large dataset between multiple processors and uses message passing to compute the global likelihoods and posteriors. This algorithm, the first parallel inference scheme for IBP-based models, scales to datasets orders of magnitude larger than have previously been possible.
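The data-parallel idea in this abstract can be illustrated with a minimal sketch. This is not the authors' sampler: it assumes a linear-Gaussian observation model X = ZA + noise (the abstract does not name the likelihood), simulates the processors with a plain loop, and only shows how per-shard summaries can be summed into global quantities. The names `local_summary`, `global_summary`, `A`, and `sigma` are hypothetical.

```python
import numpy as np

def local_summary(X_shard, Z_shard, A, sigma):
    """Summary one worker could compute from its own rows only (assumed model)."""
    resid = X_shard - Z_shard @ A
    n, d = X_shard.shape
    # Gaussian log-likelihood of this shard given features Z and loadings A.
    loglik = (-0.5 * np.sum(resid ** 2) / sigma ** 2
              - 0.5 * n * d * np.log(2 * np.pi * sigma ** 2))
    # How many rows in this shard use each latent feature.
    counts = Z_shard.sum(axis=0)
    return loglik, counts

def global_summary(X, Z, A, sigma, n_workers):
    """Split rows across simulated workers and add up their messages."""
    shards = zip(np.array_split(X, n_workers), np.array_split(Z, n_workers))
    messages = [local_summary(Xs, Zs, A, sigma) for Xs, Zs in shards]
    loglik = sum(m[0] for m in messages)
    counts = np.sum([m[1] for m in messages], axis=0)
    return loglik, counts

# Toy data: 12 points, 3 binary latent features, 4 observed dimensions.
rng = np.random.default_rng(0)
N, K, D = 12, 3, 4
Z = rng.integers(0, 2, size=(N, K))
A = rng.normal(size=(K, D))
X = Z @ A + 0.1 * rng.normal(size=(N, D))

parallel = global_summary(X, Z, A, 0.1, n_workers=4)
serial = local_summary(X, Z, A, 0.1)
```

Because the rows are conditionally independent given Z and A in this assumed model, the summed per-shard log-likelihoods and feature counts match the values computed on the full dataset; a real sampler would exchange such summaries between processors rather than loop over shards in one process.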