Sample-based Search Methods for Bayes-Adaptive Planning
A fundamental issue for control is acting in the face of uncertainty about the environment. Amongst other things, this induces a trade-off between exploration and exploitation. A model-based Bayesian agent optimizes its return by maintaining a posterior distribution over possible environments, and considering all possible future paths. This optimization is equivalent to solving a Markov Decision Process (MDP) whose hyperstate comprises the agent's beliefs about the environment, as well as its current state in that environment. This corresponding process is called a Bayes-Adaptive MDP (BAMDP). Even for MDPs with only a few states, it is generally intractable to solve the corresponding BAMDP exactly. Various heuristics have been devised, but those that are computationally tractable often perform indifferently, whereas those that perform well are typically so expensive as to be applicable only in small domains with limited structure. Here, we develop new tractable methods for planning in BAMDPs based on recent advances in the solution to large MDPs and general partially observable MDPs. Our algorithms are sample-based, plan online in a way that is focused on the current belief, and, critically, avoid expensive belief updates during simulations. In discrete domains, we use Monte-Carlo tree search to search forward in an aggressive manner. The derived algorithm can scale to large MDPs and provably converges to the Bayes-optimal solution asymptotically. We then consider a more general class of simulation-based methods in which approximation methods can be employed to allow value function estimates to generalize between hyperstates during search. This allows us to tackle continuous domains. We validate our approach empirically in standard domains by comparison with existing approximations. Finally, we explore Bayes-adaptive planning in environments that are modelled by rich, non-parametric probabilistic models. 
We demonstrate that a fully Bayesian agent can be advantageous in the exploration of complex and even infinite, structured domains.
Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search
Bayesian model-based reinforcement learning is a formally elegant approach to
learning optimal behaviour under model uncertainty, trading off exploration and
exploitation in an ideal way. Unfortunately, finding the resulting
Bayes-optimal policies is notoriously taxing, since the search space becomes
enormous. In this paper we introduce a tractable, sample-based method for
approximate Bayes-optimal planning which exploits Monte-Carlo tree search. Our
approach outperformed prior Bayesian model-based RL algorithms by a significant
margin on several well-known benchmark problems, because it avoids expensive
applications of Bayes rule within the search tree by lazily sampling models
from the current beliefs. We illustrate the advantages of our approach by
showing it working in an infinite state space domain which is qualitatively out
of reach of almost all previous work in Bayesian exploration.
Comment: 14 pages, 7 figures, includes supplementary material. Advances in
Neural Information Processing Systems (NIPS) 201
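The idea of sampling a model from the current beliefs once per simulation, rather than applying Bayes rule inside the search tree, can be illustrated with a rough sketch. The following is not the authors' algorithm: it is a simplified root-sampling UCT planner for a Bernoulli bandit with Beta posteriors, with no rollout policy and no lazy sampling. The function `plan`, its parameters, and the exploration constant `c` are all hypothetical names chosen for illustration.

```python
import math
import random
from collections import defaultdict


def plan(alpha, beta, horizon, n_sims, rng, c=1.0):
    """Pick an arm by history-based UCT with root sampling (illustrative only).

    alpha, beta: per-arm Beta posterior parameters (the current belief).
    Each simulation samples one model from the belief at the root and keeps
    it fixed for the whole simulation, so no posterior update happens
    inside the search tree.
    """
    n_arms = len(alpha)
    visits = defaultdict(int)    # visit count per (history, arm)
    values = defaultdict(float)  # mean simulated return per (history, arm)

    def simulate(history, depth, probs):
        if depth == horizon:
            return 0.0
        total = sum(visits[(history, a)] for a in range(n_arms))

        def ucb(a):
            n = visits[(history, a)]
            if n == 0:
                return float("inf")  # try every arm once at each node
            return values[(history, a)] + c * math.sqrt(math.log(total) / n)

        arm = max(range(n_arms), key=ucb)
        # The reward is drawn from the sampled model, not from the belief.
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        ret = reward + simulate(history + ((arm, reward),), depth + 1, probs)
        visits[(history, arm)] += 1
        n = visits[(history, arm)]
        values[(history, arm)] += (ret - values[(history, arm)]) / n
        return ret

    for _ in range(n_sims):
        # Root sampling: one model per simulation, drawn from the belief.
        probs = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        simulate((), 0, probs)

    return max(range(n_arms), key=lambda a: values[((), a)])


if __name__ == "__main__":
    rng = random.Random(0)
    # Belief strongly favours arm 0 (many observed successes).
    print(plan([50.0, 1.0], [1.0, 50.0], horizon=3, n_sims=500, rng=rng))
```

Because the tree is indexed by action-reward histories, the statistics at each node implicitly average over the models consistent with that history, which is what lets the simulations skip explicit belief updates in this sketch.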
On computational tools for Bayesian data analysis
While Robert and Rousseau (2010) addressed the foundational aspects of
Bayesian analysis, the current chapter details its practical aspects through a
review of the computational methods available for approximating Bayesian
procedures. Recent innovations like Markov chain Monte Carlo, sequential Monte
Carlo methods and, more recently, Approximate Bayesian Computation techniques
have considerably increased the potential for Bayesian applications and they
have also opened new avenues for Bayesian inference, first and foremost
Bayesian model choice.
Comment: This is a chapter for the book "Bayesian Methods and Expert
Elicitation" edited by Klaus Bocker, 23 pages, 9 figures
Bayes Model Selection with Path Sampling: Factor Models and Other Examples
We prove a theorem justifying the regularity conditions which are needed for
Path Sampling in Factor Models. We then show that the remaining ingredient,
namely, MCMC for calculating the integrand at each point in the path, may be
seriously flawed, leading to wrong estimates of Bayes factors. We provide a new
method of Path Sampling (with Small Change) that works much better than
standard Path Sampling in the sense of estimating the Bayes factor better and
choosing the correct model more often. When the more complex factor model is
true, PS-SC is substantially more accurate. New MCMC diagnostics are provided
for these problems in support of our conclusions and recommendations. Some of
our ideas for diagnostics and improvement in computation through small changes
should apply to other methods of computation of the Bayes factor for model
selection.
Comment: Published at http://dx.doi.org/10.1214/12-STS403 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
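For context, the "integrand at each point in the path" refers to the standard path sampling (thermodynamic integration) identity, which the abstract does not spell out. The notation below is an assumption made for illustration: q_t is an unnormalized density indexed by t in [0, 1] that connects the two models, z(t) is its normalizing constant, and the Bayes factor is taken to be z(1)/z(0).

```latex
\log \frac{z(1)}{z(0)}
  = \int_0^1 \mathbb{E}_{\theta \sim p_t}\!\left[
      \frac{\partial}{\partial t} \log q_t(\theta)
    \right] dt,
\qquad
z(t) = \int q_t(\theta)\, d\theta,
\quad
p_t(\theta) = \frac{q_t(\theta)}{z(t)}.
```

The inner expectation is what MCMC estimates at each grid point t, so poor mixing at any point on the path feeds directly into the estimated log Bayes factor.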
Mixtures of g-priors in Generalized Linear Models
Mixtures of Zellner's g-priors have been studied extensively in linear models
and have been shown to have numerous desirable properties for Bayesian variable
selection and model averaging. Several extensions of g-priors to Generalized
Linear Models (GLMs) have been proposed in the literature; however, the choice
of prior distribution of g and resulting properties for inference have received
considerably less attention. In this paper, we unify mixtures of g-priors in
GLMs by assigning the truncated Compound Confluent Hypergeometric (tCCH)
distribution to 1/(1 + g), which encompasses as special cases several mixtures
of g-priors in the literature, such as the hyper-g, Beta-prime, truncated
Gamma, incomplete inverse-Gamma, benchmark, robust, hyper-g/n, and intrinsic
priors. Through an integrated Laplace approximation, the posterior distribution
of 1/(1 + g) is in turn a tCCH distribution, and approximate marginal
likelihoods are thus available analytically, leading to "Compound
Hypergeometric Information Criteria" for model selection. We discuss the local
geometric properties of the g-prior in GLMs and show how the desiderata for
model selection proposed by Bayarri et al., such as asymptotic model selection
consistency, intrinsic consistency, and measurement invariance, may be used to
justify the prior and specific choices of the hyperparameters. We illustrate
inference using these priors and contrast them to other approaches via
simulation and real data examples. The methodology is implemented in the R
package BAS and is freely available on CRAN.
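As background on why a prior is placed on 1/(1 + g), it may help to recall the linear-model form of Zellner's g-prior. This is the textbook linear case, not the GLM construction of the paper, and it ignores the intercept; the symbols X, y, sigma^2 and the least-squares estimate are notation assumed here.

```latex
\beta \mid g, \sigma^2 \sim \mathcal{N}\!\left(0,\; g\,\sigma^2 (X^\top X)^{-1}\right),
\qquad
\mathbb{E}[\beta \mid y, g] = \frac{g}{1+g}\,\hat{\beta}_{\mathrm{OLS}},
\qquad
\frac{1}{1+g} = 1 - \frac{g}{1+g}.
```

Under this reading, g/(1 + g) is the shrinkage factor applied to the least-squares estimate, and 1/(1 + g) is its complement on (0, 1), which is the quantity that receives the tCCH prior in the abstract.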