Cover Tree Bayesian Reinforcement Learning

Authors: Blekas Konstantinos, Dimitrakakis Christos, Tziortziotis Nikolaos
Publication date: 08/12/2013

This paper proposes an online tree-based Bayesian approach for reinforcement learning. For inference, we employ a generalised context tree model. This defines a distribution on multivariate Gaussian piecewise-linear models, which can be updated in closed form. The tree structure itself is constructed using the cover tree method, which remains efficient in high-dimensional spaces. We combine the model with Thompson sampling and approximate dynamic programming to obtain effective exploration policies in unknown environments. The flexibility and computational simplicity of the model render it suitable for many reinforcement learning problems in continuous state spaces. We demonstrate this in an experimental comparison with least squares policy iteration.
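To illustrate the Thompson sampling step mentioned in the abstract above, here is a minimal sketch on a toy problem. It is an assumption-laden stand-in, not the paper's method: the cover tree model and approximate dynamic programming are replaced by a Bernoulli bandit with Beta posteriors, and the names `thompson_bernoulli`, `true_probs`, and `steps` are hypothetical. The shared idea is only the loop "sample a model from the posterior, act greedily on the sample, update in closed form".

```python
import random


def thompson_bernoulli(true_probs, steps, seed=0):
    """Thompson sampling on a toy Bernoulli bandit (NOT the paper's tree model)."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Beta(1, 1) prior on each arm's unknown success probability.
    alpha = [1.0] * n_arms
    beta = [1.0] * n_arms
    pulls = [0] * n_arms
    for _ in range(steps):
        # Draw one plausible model from the current posterior.
        sampled = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled model.
        arm = max(range(n_arms), key=lambda a: sampled[a])
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Conjugate posterior update, available in closed form.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls, alpha, beta


if __name__ == "__main__":
    pulls, alpha, beta = thompson_bernoulli([0.1, 0.9], steps=2000, seed=1)
    print("pulls per arm:", pulls)
```

Exploration arises because an arm with a wide posterior occasionally produces a large sample and gets tried; as evidence accumulates, the posterior narrows and the better arm is chosen almost always.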

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Chalmers Research

Chalmers Publication Library

Mondrian Forests for Large-Scale Regression when Uncertainty Matters

Authors: Lakshminarayanan Balaji, Roy Daniel M., Teh Yee Whye
Publication date: 01/01/2016

Many real-world regression problems demand a measure of the uncertainty associated with each prediction. Standard decision forests deliver efficient state-of-the-art predictive performance, but high-quality uncertainty estimates are lacking. Gaussian processes (GPs) deliver uncertainty estimates, but scaling GPs to large-scale data sets comes at the cost of approximating the uncertainty estimates. We extend Mondrian forests, first proposed by Lakshminarayanan et al. (2014) for classification problems, to the large-scale non-parametric regression setting. Using a novel hierarchical Gaussian prior that dovetails with the Mondrian forest framework, we obtain principled uncertainty estimates, while still retaining the computational advantages of decision forests. Through a combination of illustrative examples, real-world large-scale datasets, and Bayesian optimization benchmarks, we demonstrate that Mondrian forests outperform approximate GPs on large-scale regression tasks and deliver better-calibrated uncertainty assessments than decision-forest-based methods.
Comment: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS) 2016, Cadiz, Spain. JMLR: W&CP volume 5

arXiv.org e-Print Archive

Oxford University Research Archive

Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search

Authors: Dayan Peter, Guez Arthur, Silver David
Publication date: 01/01/2012

Bayesian model-based reinforcement learning is a formally elegant approach to learning optimal behaviour under model uncertainty, trading off exploration and exploitation in an ideal way. Unfortunately, finding the resulting Bayes-optimal policies is notoriously taxing, since the search space becomes enormous. In this paper we introduce a tractable, sample-based method for approximate Bayes-optimal planning which exploits Monte-Carlo tree search. Our approach outperformed prior Bayesian model-based RL algorithms by a significant margin on several well-known benchmark problems, because it avoids expensive applications of Bayes rule within the search tree by lazily sampling models from the current beliefs. We illustrate the advantages of our approach by showing it working in an infinite state space domain which is qualitatively out of reach of almost all previous work in Bayesian exploration.
Comment: 14 pages, 7 figures, includes supplementary material. Advances in Neural Information Processing Systems (NIPS) 201

arXiv.org e-Print Archive

CiteSeerX

UCL Discovery

A Bayesian Ensemble Regression Framework on the Angry Birds Game

Authors: Blekas Konstantinos, Papagiannis Georgios, Tziortziotis Nikolaos
Publication date: 25/08/2014

An ensemble inference mechanism is proposed for the Angry Birds domain. It is based on an efficient tree structure for encoding and representing game screenshots, and exploits that structure's enhanced modeling capability. This has the advantage of establishing an informative feature space and recasting the task of game playing as a regression analysis problem. To this end, we assume that each pair of object material type and bird has its own Bayesian linear regression model. In this way, a multi-model regression framework is designed that simultaneously calculates the conditional expectations of several objects and makes a target decision through an ensemble of regression models. Learning is performed according to an online estimation strategy for the model parameters. We provide comparative experimental results on several game levels that empirically illustrate the efficiency of the proposed methodology.
Comment: Angry Birds AI Symposium, ECAI 201
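The online Bayesian linear regression idea in the abstract above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes a zero-mean isotropic Gaussian prior on the weights and a known noise variance, and the class name `OnlineBayesLinReg` and the parameters `prior_var` and `noise_var` are hypothetical. Each observation updates the posterior mean and covariance with a rank-one (Sherman-Morrison) step, so no matrix inversion is needed.

```python
class OnlineBayesLinReg:
    """Bayesian linear regression with one-observation-at-a-time updates.

    Assumptions (not from the source): weights ~ N(0, prior_var * I),
    observation noise ~ N(0, noise_var) with noise_var known.
    """

    def __init__(self, dim, prior_var=10.0, noise_var=0.1):
        self.noise_var = noise_var
        self.mean = [0.0] * dim
        self.cov = [[prior_var if i == j else 0.0 for j in range(dim)]
                    for i in range(dim)]

    def _cov_times(self, x):
        # Matrix-vector product: posterior covariance times x.
        return [sum(row[j] * x[j] for j in range(len(x))) for row in self.cov]

    def predict_mean(self, x):
        return sum(m * v for m, v in zip(self.mean, x))

    def predict_var(self, x):
        # Predictive variance = noise + uncertainty about the weights.
        sx = self._cov_times(x)
        return self.noise_var + sum(a * b for a, b in zip(x, sx))

    def update(self, x, y):
        d = len(x)
        sx = self._cov_times(x)
        denom = self.noise_var + sum(a * b for a, b in zip(x, sx))
        err = y - self.predict_mean(x)
        gain = [v / denom for v in sx]
        # Exact Gaussian posterior after seeing (x, y).
        self.mean = [self.mean[i] + gain[i] * err for i in range(d)]
        self.cov = [[self.cov[i][j] - gain[i] * sx[j] for j in range(d)]
                    for i in range(d)]


if __name__ == "__main__":
    model = OnlineBayesLinReg(dim=2)
    for i in range(50):
        x = i / 10.0
        model.update([1.0, x], 1.0 + 2.0 * x)  # features: bias term and x
    print(model.mean)
```

In a multi-model setting such as the one the abstract describes, one such object could be kept per category and only the matching model updated for each observation; how the paper combines the models into a decision is not specified here.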

arXiv.org e-Print Archive

CiteSeerX

Efficient Bayesian Social Learning on Trees

Authors: Kanoria Yashodhan, Tamuz Omer
Publication date: 01/02/2011

We consider a set of agents who are attempting to iteratively learn the 'state of the world' from their neighbors in a social network. Each agent initially receives a noisy observation of the true state of the world. The agents then repeatedly 'vote' and observe the votes of some of their peers, from which they gain more information. The agents' calculations are Bayesian and aim to myopically maximize the expected utility at each iteration. This model, introduced by Gale and Kariv (2003), is a natural approach to learning on networks. However, it has been criticized, chiefly because the agents' decision rule appears to become computationally intractable as the number of iterations advances. For instance, a dynamic programming approach (part of this work) has running time that is exponentially large in $\min(n, (d-1)^t)$, where $n$ is the number of agents, $d$ is the maximum degree, and $t$ is the iteration number. We provide a new algorithm to perform the agents' computations on locally tree-like graphs. Our algorithm uses the dynamic cavity method to drastically reduce computational effort. The computational effort needed per agent is exponential only in $O(td)$ (note that the number of possible information sets of a neighbor at time $t$ is itself exponential in $td$). Under appropriate assumptions on the rate of convergence, we deduce that each agent is only required to spend polylogarithmic (in $1/\epsilon$) computational effort to approximately learn the true state of the world with error probability $\epsilon$, on regular trees of degree at least five. We provide numerical and other evidence to justify our assumption on convergence rate. We extend our results in various directions, including loopy graphs. Our results indicate efficiency of iterative Bayesian social learning in a wide range of situations, contrary to widely held beliefs.
Comment: 11 pages, 1 figure, submitte

arXiv.org e-Print Archive

Caltech Authors

A very simple safe-Bayesian random forest

Authors: Ghahramani Zoubin, Quadrianto Novi
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/06/2015

Random forests work by averaging several predictions of de-correlated trees. We show a conceptually radical approach to generate a random forest: random sampling of many trees from a prior distribution, and subsequently performing a weighted ensemble of predictive probabilities. Our approach uses priors that allow sampling of decision trees even before looking at the data, and a power likelihood that explores the space spanned by combinations of decision trees. While each tree performs Bayesian inference to compute its predictions, our aggregation procedure uses the power likelihood rather than the likelihood and is therefore strictly speaking not Bayesian. Nonetheless, we refer to it as a Bayesian random forest but with a built-in safety. The safeness comes from its good predictive performance even if the underlying probabilistic model is wrong. We demonstrate empirically that our Safe-Bayesian random forest outperforms MCMC- or SMC-based Bayesian decision trees in terms of speed and accuracy, and achieves performance competitive with entropy- or Gini-optimised random forests, yet is very simple to construct.
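The weighted ensemble with a power likelihood described in the abstract above can be sketched in generic form. This is one plausible reading rather than the paper's exact procedure: each ensemble member is assumed to come with a log-likelihood on the training data, the weight of a member is taken proportional to its likelihood raised to a power `eta`, and the function names and the default `eta=0.5` are hypothetical.

```python
import math


def power_likelihood_weights(log_liks, eta=0.5):
    """Weights proportional to likelihood**eta (an assumed, generic scheme).

    eta = 1 recovers ordinary likelihood weighting; eta < 1 flattens the
    weights so that no single member dominates as quickly.
    """
    scaled = [eta * ll for ll in log_liks]
    top = max(scaled)  # subtract the max for numerical stability
    unnorm = [math.exp(s - top) for s in scaled]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def weighted_prediction(weights, member_probs):
    """Combine per-member class probabilities into one predictive distribution."""
    n_classes = len(member_probs[0])
    return [sum(w * p[c] for w, p in zip(weights, member_probs))
            for c in range(n_classes)]


if __name__ == "__main__":
    log_liks = [-12.0, -10.0, -15.0]            # hypothetical training log-likelihoods
    probs = [[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]]  # hypothetical member predictions
    w = power_likelihood_weights(log_liks, eta=0.5)
    print(w, weighted_prediction(w, probs))
```

Tempering the likelihood is what makes the aggregate "not strictly Bayesian" in the abstract's sense: the members are still scored by how well they explain the data, but with deliberately reduced confidence in that score.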

Crossref

Sussex Research Online