
    Sample-based Search Methods for Bayes-Adaptive Planning

    A fundamental issue for control is acting in the face of uncertainty about the environment. Amongst other things, this induces a trade-off between exploration and exploitation. A model-based Bayesian agent optimizes its return by maintaining a posterior distribution over possible environments and considering all possible future paths. This optimization is equivalent to solving a Markov Decision Process (MDP) whose hyperstate comprises the agent's beliefs about the environment, as well as its current state in that environment. The corresponding process is called a Bayes-Adaptive MDP (BAMDP). Even for MDPs with only a few states, it is generally intractable to solve the corresponding BAMDP exactly. Various heuristics have been devised, but those that are computationally tractable often perform indifferently, whereas those that perform well are typically so expensive as to be applicable only in small domains with limited structure. Here, we develop new tractable methods for planning in BAMDPs based on recent advances in the solution of large MDPs and general partially observable MDPs. Our algorithms are sample-based, plan online in a way that is focused on the current belief, and, critically, avoid expensive belief updates during simulations. In discrete domains, we use Monte-Carlo tree search to search forward in an aggressive manner. The derived algorithm can scale to large MDPs and provably converges to the Bayes-optimal solution asymptotically. We then consider a more general class of simulation-based methods in which approximation methods can be employed to allow value function estimates to generalize between hyperstates during search. This allows us to tackle continuous domains. We validate our approach empirically in standard domains by comparison with existing approximations. Finally, we explore Bayes-adaptive planning in environments that are modelled by rich, non-parametric probabilistic models. We demonstrate that a fully Bayesian agent can be advantageous in the exploration of complex and even infinite, structured domains.
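
    To make the core idea concrete, here is a minimal, illustrative sketch (not the thesis's implementation) of sample-based Bayes-adaptive planning with an independent Dirichlet belief over transitions: each simulation draws one MDP from the posterior at the root, so no belief update is ever performed inside the simulation. All class and function names are assumptions for illustration.

        import random
        from collections import defaultdict

        class DirichletBelief:
            # Independent Dirichlet posterior over each (s, a)'s next-state distribution.
            def __init__(self, n_states, n_actions, prior=1.0):
                self.n_states, self.n_actions = n_states, n_actions
                self.counts = defaultdict(lambda: [prior] * n_states)

            def update(self, s, a, s2):          # observe a real transition
                self.counts[(s, a)][s2] += 1.0

            def sample_mdp(self):                # root sampling: one full model draw
                T = {}
                for s in range(self.n_states):
                    for a in range(self.n_actions):
                        w = [random.gammavariate(c, 1.0) for c in self.counts[(s, a)]]
                        z = sum(w)
                        T[(s, a)] = [x / z for x in w]
                return T

        def rollout(T, reward, s, n_actions, depth, gamma=0.95):
            # Random rollout in the sampled MDP; a full planner would grow a
            # UCT tree over (history, action) nodes instead.
            ret, discount = 0.0, 1.0
            for _ in range(depth):
                a = random.randrange(n_actions)
                ret += discount * reward(s, a)
                s = random.choices(range(len(T[(s, a)])), weights=T[(s, a)])[0]
                discount *= gamma
            return ret

        def plan(belief, reward, s0, n_sims=2000, depth=15):
            # Estimate root action values; beliefs are touched only via sampling.
            q, n = defaultdict(float), defaultdict(int)
            for _ in range(n_sims):
                T = belief.sample_mdp()          # one posterior sample per simulation
                a = random.randrange(belief.n_actions)
                s1 = random.choices(range(belief.n_states), weights=T[(s0, a)])[0]
                ret = reward(s0, a) + 0.95 * rollout(T, reward, s1, belief.n_actions, depth)
                n[a] += 1
                q[a] += (ret - q[a]) / n[a]
            return max(range(belief.n_actions), key=lambda a: q[a])

    A toy call such as plan(DirichletBelief(3, 2), lambda s, a: float(s == 2), s0=0) returns the root action with the highest Monte-Carlo value estimate under the current belief.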

    Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search

    Bayesian model-based reinforcement learning is a formally elegant approach to learning optimal behaviour under model uncertainty, trading off exploration and exploitation in an ideal way. Unfortunately, finding the resulting Bayes-optimal policies is notoriously taxing, since the search space becomes enormous. In this paper we introduce a tractable, sample-based method for approximate Bayes-optimal planning which exploits Monte-Carlo tree search. Our approach outperformed prior Bayesian model-based RL algorithms by a significant margin on several well-known benchmark problems because it avoids expensive applications of Bayes' rule within the search tree by lazily sampling models from the current beliefs. We illustrate the advantages of our approach by showing that it works in an infinite state space domain that is qualitatively out of reach of almost all previous work in Bayesian exploration.
    Comment: 14 pages, 7 figures, includes supplementary material. Advances in Neural Information Processing Systems (NIPS) 2012.
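
    The lazy-sampling trick the abstract mentions can be read as follows: instead of drawing a full model per simulation, each transition row is drawn from the posterior only when a rollout first needs it. The sketch below is one illustrative reading (not the paper's code), assuming a belief object with a Dirichlet counts mapping like the one sketched above.

        import random

        class LazySampledMDP:
            # Wraps a posterior belief; transition rows are sampled only on first
            # use, so one object stands in for a fully sampled model.
            def __init__(self, belief):
                self.belief = belief
                self.rows = {}            # (s, a) -> sampled next-state distribution

            def step(self, s, a):
                if (s, a) not in self.rows:   # lazily instantiate this part of the model
                    w = [random.gammavariate(c, 1.0) for c in self.belief.counts[(s, a)]]
                    z = sum(w)
                    self.rows[(s, a)] = [x / z for x in w]
                p = self.rows[(s, a)]
                return random.choices(range(len(p)), weights=p)[0]

    In large or infinite state spaces this is what makes the approach viable: a full model draw is impossible, but the handful of rows a single rollout visits is cheap to sample.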

    On computational tools for Bayesian data analysis

    While Robert and Rousseau (2010) addressed the foundational aspects of Bayesian analysis, the current chapter details its practical aspects through a review of the computational methods available for approximating Bayesian procedures. Recent innovations like Markov chain Monte Carlo, sequential Monte Carlo methods and, more recently, Approximate Bayesian Computation techniques have considerably increased the potential for Bayesian applications, and they have also opened new avenues for Bayesian inference, first and foremost Bayesian model choice.
    Comment: This is a chapter for the book "Bayesian Methods and Expert Elicitation" edited by Klaus Böcker, 23 pages, 9 figures.
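
    Of the techniques surveyed, Approximate Bayesian Computation is the simplest to convey in a few lines: draw parameters from the prior, simulate data, and keep the draws whose simulated summary statistic lands within a tolerance of the observed one. The toy model below (unknown Gaussian mean, wide normal prior, sample-mean summary) is my own illustration, not an example from the chapter.

        import random

        def abc_rejection(y_obs, n_samples=200, eps=0.1, n_data=50):
            accepted = []
            obs_mean = sum(y_obs) / len(y_obs)            # summary statistic
            while len(accepted) < n_samples:
                theta = random.gauss(0.0, 10.0)           # draw from the prior
                sim = [random.gauss(theta, 1.0) for _ in range(n_data)]
                if abs(sum(sim) / n_data - obs_mean) < eps:   # compare summaries
                    accepted.append(theta)
            return accepted                               # approximate posterior draws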

    Bayes Model Selection with Path Sampling: Factor Models and Other Examples

    We prove a theorem justifying the regularity conditions which are needed for Path Sampling in Factor Models. We then show that the remaining ingredient, namely, MCMC for calculating the integrand at each point in the path, may be seriously flawed, leading to wrong estimates of Bayes factors. We provide a new method, Path Sampling with Small Change (PS-SC), that works much better than standard Path Sampling in the sense of estimating the Bayes factor better and choosing the correct model more often. When the more complex factor model is true, PS-SC is substantially more accurate. New MCMC diagnostics are provided for these problems in support of our conclusions and recommendations. Some of our ideas for diagnostics and improvement in computation through small changes should apply to other methods of computation of the Bayes factor for model selection.
    Comment: Published at http://dx.doi.org/10.1214/12-STS403 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
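
    For readers unfamiliar with path sampling, a minimal version of the standard estimator (thermodynamic integration along a geometric path between the two unnormalised posteriors, with MCMC draws at each grid point) is sketched below; the one-dimensional random-walk sampler and the density arguments are placeholders, and the paper's factor-model setting and PS-SC refinement go well beyond this.

        import math, random

        def mh_draws(log_q, x0=0.0, n=2000, step=0.5):
            # Random-walk Metropolis targeting the density proportional to exp(log_q).
            x, out = x0, []
            for _ in range(n):
                y = x + random.gauss(0.0, step)
                if math.log(random.random()) < log_q(y) - log_q(x):
                    x = y
                out.append(x)
            return out

        def log_bayes_factor(log_q0, log_q1, n_grid=21):
            # Geometric path q_t = q0^(1-t) * q1^t gives
            # log(Z1/Z0) = integral over t of E_{x ~ q_t}[log q1(x) - log q0(x)].
            ts = [i / (n_grid - 1) for i in range(n_grid)]
            means = []
            for t in ts:
                log_qt = lambda x, t=t: (1 - t) * log_q0(x) + t * log_q1(x)
                xs = mh_draws(log_qt)
                means.append(sum(log_q1(x) - log_q0(x) for x in xs) / len(xs))
            # Trapezoidal rule over the path parameter t.
            return sum((means[i] + means[i + 1]) / 2 * (ts[i + 1] - ts[i])
                       for i in range(n_grid - 1))

    The paper's point is precisely that the MCMC step inside such a loop can be unreliable, so the estimate should be read with its diagnostics, not at face value.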

    Mixtures of g-priors in Generalized Linear Models

    Mixtures of Zellner's g-priors have been studied extensively in linear models and have been shown to have numerous desirable properties for Bayesian variable selection and model averaging. Several extensions of g-priors to Generalized Linear Models (GLMs) have been proposed in the literature; however, the choice of prior distribution of g and resulting properties for inference have received considerably less attention. In this paper, we unify mixtures of g-priors in GLMs by assigning the truncated Compound Confluent Hypergeometric (tCCH) distribution to 1/(1 + g), which encompasses as special cases several mixtures of g-priors in the literature, such as the hyper-g, Beta-prime, truncated Gamma, incomplete inverse-Gamma, benchmark, robust, hyper-g/n, and intrinsic priors. Through an integrated Laplace approximation, the posterior distribution of 1/(1 + g) is in turn a tCCH distribution, and approximate marginal likelihoods are thus available analytically, leading to "Compound Hypergeometric Information Criteria" for model selection. We discuss the local geometric properties of the g-prior in GLMs and show how the desiderata for model selection proposed by Bayarri et al., such as asymptotic model selection consistency, intrinsic consistency, and measurement invariance, may be used to justify the prior and specific choices of the hyperparameters. We illustrate inference using these priors and contrast them to other approaches via simulation and real data examples. The methodology is implemented in the R package BAS and freely available on CRAN.
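
    As a concrete (and deliberately simple) instance of the mixture-of-g-priors idea, here is a sketch for the Gaussian linear model, where the fixed-g Bayes factor against the intercept-only model has a well-known closed form; a hyper-g prior on g, one of the special cases the tCCH family unifies, is then integrated out numerically in u = 1/(1 + g), the paper's parameterization. The GLM case handled by BAS replaces the closed form with a Laplace approximation, so this is not that implementation.

        import math

        def log_bf_fixed_g(g, n, p, r2):
            # Zellner g-prior Bayes factor (model with p predictors and R^2 = r2
            # versus the intercept-only model), for fixed g.
            return 0.5 * (n - 1 - p) * math.log(1 + g) \
                 - 0.5 * (n - 1) * math.log(1 + g * (1 - r2))

        def bf_hyper_g(n, p, r2, a=3.0, n_grid=10000):
            # The hyper-g prior on g becomes, in u = 1/(1 + g), a Beta(a/2 - 1, 1)
            # density; integrate BF(g(u)) against it over (0, 1) by midpoint rule.
            total = 0.0
            for i in range(n_grid):
                u = (i + 0.5) / n_grid
                g = 1.0 / u - 1.0
                dens = (a / 2 - 1) * u ** (a / 2 - 2)
                total += math.exp(log_bf_fixed_g(g, n, p, r2)) * dens / n_grid
            return total

    For example, bf_hyper_g(n=100, p=3, r2=0.4) returns the mixture-of-g-priors Bayes factor of the three-predictor model against the null, with the uncertainty about g averaged out rather than fixed by hand.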