5,568 research outputs found

Bayesian Reinforcement Learning via Deep, Sparse Sampling

Author: Basu Debabrota
Dimitrakakis Christos
Grover Divya
Publication date: 01/01/2020

We address the problem of Bayesian reinforcement learning using efficient model-based online planning. We propose an optimism-free Bayes-adaptive algorithm to induce deeper and sparser exploration with a theoretical bound on its performance relative to the Bayes optimal policy, with a lower computational complexity. The main novelty is the use of a candidate policy generator, to generate long-term options in the planning tree (over beliefs), which allows us to create much sparser and deeper trees. Experimental results on different environments show that in comparison to the state-of-the-art, our algorithm is both computationally more efficient, and obtains significantly higher reward in discrete environments. Comment: Published in AISTATS 202

arXiv.org e-Print Archive

Chalmers Research

Cover Tree Bayesian Reinforcement Learning

Author: Blekas Konstantinos
Dimitrakakis Christos
Tziortziotis Nikolaos
Publication date: 08/12/2013

This paper proposes an online tree-based Bayesian approach for reinforcement learning. For inference, we employ a generalised context tree model. This defines a distribution on multivariate Gaussian piecewise-linear models, which can be updated in closed form. The tree structure itself is constructed using the cover tree method, which remains efficient in high dimensional spaces. We combine the model with Thompson sampling and approximate dynamic programming to obtain effective exploration policies in unknown environments. The flexibility and computational simplicity of the model render it suitable for many reinforcement learning problems in continuous state spaces. We demonstrate this in an experimental comparison with least squares policy iteration.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Chalmers Research

Chalmers Publication Library
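
The Thompson sampling step named in the abstract above can be illustrated on a much simpler problem. The sketch below is not the paper's cover tree model: it assumes a Bernoulli bandit with independent Beta(1, 1) priors, and the function name `thompson_bernoulli` and its parameters are hypothetical. It only shows the core idea of sampling one plausible model from the posterior and acting greedily with respect to that sample.

```python
import random


def thompson_bernoulli(true_probs, steps, seed=0):
    """Thompson sampling on a Bernoulli bandit (illustrative assumption,
    not the continuous-state model described in the abstract)."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(steps):
        # Draw one success probability per arm from its Beta posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Act greedily with respect to the sampled model.
        arm = max(range(n_arms), key=samples.__getitem__)
        # Observe a reward from the (simulated) environment and update counts.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    wins, losses = thompson_bernoulli([0.1, 0.9], steps=2000, seed=1)
    pulls = [w + l for w, l in zip(wins, losses)]
    print("pulls per arm:", pulls)
```

Because arms whose posterior is still wide occasionally produce high samples, they keep being tried until the data rule them out, which is where the exploration comes from.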

Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search

Author: Dayan Peter
Guez Arthur
Silver David
Publication date: 01/01/2012

Bayesian model-based reinforcement learning is a formally elegant approach to learning optimal behaviour under model uncertainty, trading off exploration and exploitation in an ideal way. Unfortunately, finding the resulting Bayes-optimal policies is notoriously taxing, since the search space becomes enormous. In this paper we introduce a tractable, sample-based method for approximate Bayes-optimal planning which exploits Monte-Carlo tree search. Our approach outperformed prior Bayesian model-based RL algorithms by a significant margin on several well-known benchmark problems -- because it avoids expensive applications of Bayes rule within the search tree by lazily sampling models from the current beliefs. We illustrate the advantages of our approach by showing it working in an infinite state space domain which is qualitatively out of reach of almost all previous work in Bayesian exploration. Comment: 14 pages, 7 figures, includes supplementary material. Advances in Neural Information Processing Systems (NIPS) 201

arXiv.org e-Print Archive

CiteSeerX

UCL Discovery
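
The idea of lazily sampling models from the current beliefs, mentioned in the abstract above, can be sketched for a small tabular case. This is an assumption-laden illustration, not the authors' implementation: it assumes a Dirichlet belief over next-state distributions summarised by transition counts, and the class `LazyDirichletModel` and its methods are hypothetical names. A transition row is drawn from the posterior only the first time a simulation visits that state-action pair, and is then reused so the sampled model stays consistent.

```python
import random


class LazyDirichletModel:
    """One posterior sample of tabular transition dynamics, drawn row by
    row on demand (hypothetical sketch; assumes Dirichlet beliefs)."""

    def __init__(self, counts, prior=1.0, rng=None):
        # counts: dict mapping (state, action) -> list of next-state counts
        self.counts = counts
        self.prior = prior
        self.rng = rng or random.Random()
        self.rows = {}  # cache of rows sampled so far

    def row(self, state, action):
        key = (state, action)
        if key not in self.rows:
            # A Dirichlet draw is a vector of Gamma draws, normalised.
            alphas = [c + self.prior for c in self.counts[key]]
            draws = [self.rng.gammavariate(a, 1.0) for a in alphas]
            total = sum(draws)
            self.rows[key] = [d / total for d in draws]
        return self.rows[key]

    def step(self, state, action):
        # Simulate one transition under the sampled model.
        probs = self.row(state, action)
        return self.rng.choices(range(len(probs)), weights=probs)[0]


if __name__ == "__main__":
    counts = {(0, 0): [5, 0], (0, 1): [0, 5]}
    model = LazyDirichletModel(counts, rng=random.Random(0))
    print(model.step(0, 0), "rows sampled so far:", len(model.rows))
```

In a tree search, one such object would be created per simulation, so rows for state-action pairs the simulation never reaches are never sampled.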

Recommended from our members

State-of-the-art on research and applications of machine learning in the building life cycle

Author: Hong T
Luo X
Wang Z
Zhang W
Publication venue: eScholarship, University of California
Publication date: 01/04/2020

Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied to buildings research for the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of the building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found using machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimate, load shape clustering, occupancy prediction, and learning occupant behaviors and energy use patterns. None of the existing studies were adopted broadly by the building industry, due to common challenges including (1) lack of large scale labeled data to train and validate the model, (2) lack of model transferability, which prevents a model trained on one data-rich building from being used in another building with limited data, (3) lack of strong justification of costs and benefits of deploying machine learning, and (4) the performance might not be reliable and robust for the stated goals, as the method might work for some buildings but could not be generalized to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, as well as to inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science.

eScholarship - University of California

Optimal treatment allocations in space and time for on-line control of an emerging infectious disease

Author: Agarwal A.
Anderson R. M.
Bertsekas D. P.
Borth D. M.
Chapelle O.
Chesterton G. K.
Choi A. L.
Cox D. R.
Deardon R.
Estrada E.
Field K.
Gelman A.
Ghavamzadeh M.
Huang C.‐Y.
Kushner H. J.
Law A. M.
Little R. J.
Lusher D.
Mahadevan S.
May B. C.
Murphy S. A.
Nahum‐Shani I.
Newton M. A.
Orellana L.
Osband I.
Palmer J. M.
Poupart P.
Ross S.
Russo D.
Sen A.
Spall J. C.
Subcommittee on Fisheries Wildlife, and Oceans
Sutton R.
Sutton R. S.
West M.
Yin G.
Publication venue: eScholarship, University of California
Publication date: 01/01/2018

A key component in controlling the spread of an epidemic is deciding where, when and to whom to apply an intervention. We develop a framework for using data to inform these decisions in real time. We formalize a treatment allocation strategy as a sequence of functions, one per treatment period, that map up-to-date information on the spread of an infectious disease to a subset of locations where treatment should be allocated. An optimal allocation strategy optimizes some cumulative outcome, e.g. the number of uninfected locations, the geographic footprint of the disease or the cost of the epidemic. Estimation of an optimal allocation strategy for an emerging infectious disease is challenging because spatial proximity induces interference between locations, the number of possible allocations is exponential in the number of locations, and because disease dynamics and intervention effectiveness are unknown at outbreak. We derive a Bayesian on-line estimator of the optimal allocation strategy that combines simulation–optimization with Thompson sampling. The estimator proposed performs favourably in simulation experiments. This work is motivated by and illustrated using data on the spread of white nose syndrome, which is a highly fatal infectious disease devastating bat populations in North America.

Crossref

eScholarship - University of California