Complexity of Stochastic Branch and Bound Methods for Belief Tree Search in Bayesian Reinforcement Learning
There has been much recent work on Bayesian methods for reinforcement learning that exhibit near-optimal online performance. The main obstacle facing such methods is that in most problems of interest, the optimal solution involves planning in an infinitely large tree. However, it is possible to obtain stochastic lower and upper bounds on the value of each tree node, which enables the use of stochastic branch and bound algorithms to search the tree efficiently. This paper proposes several such algorithms and examines their complexity in this setting.
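A minimal sketch of the search loop such bounds make possible: a best-first frontier ordered by upper bounds, pruned against the best lower bound found so far. The `Node` class, its bound widths, and the binary branching are illustrative assumptions, not the paper's construction:

```python
import heapq
import random

# Toy stochastic branch and bound over a binary tree. In the paper's setting
# the nodes would be belief-tree nodes of a Bayes-adaptive MDP, with bounds
# estimated from sampled rollouts; values and bound widths here are placeholders.

class Node:
    def __init__(self, depth, value):
        self.depth = depth
        self.value = value                        # hidden true value (toy)
        self.lower = value - 1.0 / (depth + 1)    # stochastic lower bound
        self.upper = value + 1.0 / (depth + 1)    # stochastic upper bound

    def expand(self):
        # Children perturb the parent's value; deeper nodes get tighter bounds.
        return [Node(self.depth + 1, self.value + random.uniform(-0.5, 0.5))
                for _ in range(2)]

def stochastic_branch_and_bound(root, budget=100):
    best_lower = root.lower
    frontier = [(-root.upper, id(root), root)]    # max-heap on upper bounds
    for _ in range(budget):
        if not frontier:
            break
        neg_upper, _, node = heapq.heappop(frontier)
        if -neg_upper <= best_lower:              # prune: cannot beat incumbent
            continue
        for child in node.expand():
            best_lower = max(best_lower, child.lower)
            heapq.heappush(frontier, (-child.upper, id(child), child))
    return best_lower

print(stochastic_branch_and_bound(Node(0, 0.0)))
```

Expanding the node with the largest upper bound first is what lets the gap between the incumbent lower bound and the frontier's upper bounds drive both pruning and termination.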
Bayesian Reinforcement Learning via Deep, Sparse Sampling
We address the problem of Bayesian reinforcement learning using efficient model-based online planning. We propose an optimism-free Bayes-adaptive algorithm that induces deeper and sparser exploration, with a theoretical bound on its performance relative to the Bayes-optimal policy and a lower computational complexity. The main novelty is the use of a candidate policy generator to generate long-term options in the planning tree (over beliefs), which allows us to build much sparser and deeper trees. Experimental results on different environments show that, in comparison to the state of the art, our algorithm is both computationally more efficient and obtains significantly higher reward in discrete environments.
Comment: Published in AISTATS 2020
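To make the idea concrete, here is a hedged sketch of planning with a candidate policy generator: instead of branching over every primitive action, a handful of candidate policies (long-term options) are rolled out under models sampled from the belief, keeping the tree sparse but deep. The bandit-like belief, the generator, and all names below are assumptions for illustration, not the paper's construction:

```python
import random

N_ACTIONS, HORIZON, N_MODELS = 3, 20, 5

def sample_model(belief):
    # Sample a mean reward for each action from a Gaussian belief.
    return [random.gauss(mu, sigma) for mu, sigma in belief]

def candidate_policies():
    # Simplest generator: one "always play action a" option per action,
    # plus a uniformly random policy.
    fixed = [lambda t, a=a: a for a in range(N_ACTIONS)]
    return fixed + [lambda t: random.randrange(N_ACTIONS)]

def rollout_value(policy, model):
    # Deep but cheap: a single trajectory per sampled model.
    return sum(model[policy(t)] for t in range(HORIZON))

def plan(belief):
    # Score each candidate by its average return across sampled models
    # and commit to the best long-term option.
    scores = []
    for policy in candidate_policies():
        value = sum(rollout_value(policy, sample_model(belief))
                    for _ in range(N_MODELS)) / N_MODELS
        scores.append((value, policy))
    return max(scores, key=lambda s: s[0])[1]

belief = [(0.0, 1.0), (0.5, 1.0), (0.2, 1.0)]  # per-action (mean, std) prior
best = plan(belief)
print("chosen action at t=0:", best(0))
```

Because each candidate is an entire policy rather than a single action, the planning tree branches over a few options instead of all action sequences, which is what permits greater depth at the same budget.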
Cover Tree Bayesian Reinforcement Learning
This paper proposes an online tree-based Bayesian approach for reinforcement learning. For inference, we employ a generalised context tree model. This defines a distribution over multivariate Gaussian piecewise-linear models, which can be updated in closed form. The tree structure itself is constructed using the cover tree method, which remains efficient in high-dimensional spaces. We combine the model with Thompson sampling and approximate dynamic programming to obtain effective exploration policies in unknown environments. The flexibility and computational simplicity of the model render it suitable for many reinforcement learning problems in continuous state spaces. We demonstrate this in an experimental comparison with least squares policy iteration.
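The exploration scheme can be illustrated with a minimal Thompson sampling loop: sample one model from the posterior, solve it approximately, and act greedily in the sampled model. The tiny discrete MDP and Dirichlet posterior below are stand-ins for the paper's cover-tree model over continuous states, used only to keep the sketch self-contained:

```python
import numpy as np

S, A, GAMMA = 4, 2, 0.95
counts = np.ones((S, A, S))          # Dirichlet posterior over transitions
rewards = np.random.rand(S, A)       # rewards assumed known, for brevity

def sample_mdp():
    # Draw one plausible transition model from the posterior.
    return np.array([[np.random.dirichlet(counts[s, a]) for a in range(A)]
                     for s in range(S)])

def greedy_policy(P, iters=200):
    # Approximate dynamic programming: value iteration in the sampled model.
    V = np.zeros(S)
    for _ in range(iters):
        Q = rewards + GAMMA * P @ V  # Q[s, a]
        V = Q.max(axis=1)
    return Q.argmax(axis=1)          # act greedily w.r.t. the sampled model

policy = greedy_policy(sample_mdp())
print("Thompson-sampled policy:", policy)
```

In a full agent this sample-solve-act cycle repeats as the posterior counts are updated from observed transitions, which is what yields the exploration behaviour.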
Benchmarking for Bayesian Reinforcement Learning
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise the rewards collected while interacting with their environment, using prior knowledge that is available beforehand. Many BRL algorithms have already been proposed, but even though a few toy examples exist in the literature, there are still no extensive or rigorous benchmarks for comparing them. This paper addresses that problem and provides a new BRL comparison methodology along with a corresponding open-source library. The methodology defines a comparison criterion that measures the performance of algorithms on large sets of Markov Decision Processes (MDPs) drawn from given probability distributions. To enable the comparison of non-anytime algorithms, it also includes a detailed analysis of each algorithm's computation time requirements. The library is released with full source code and documentation: it includes three test problems, each with two different prior distributions, and seven state-of-the-art RL algorithms. Finally, the library is illustrated by comparing all the available algorithms, and the results are discussed.
Comment: 37 pages
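A rough sketch of such a comparison criterion, assuming placeholder agents and an MDP sampler (none of the names below come from the library itself): draw a shared set of MDPs from a prior, score every algorithm on each draw, and record wall-clock time alongside the mean score so that non-anytime algorithms can be compared fairly.

```python
import time
import random

def sample_mdp(rng):
    return {"mean_reward": rng.random()}   # stand-in for a sampled MDP

def run_episode(algorithm, mdp, rng):
    # Stand-in for running the agent on the MDP; returns a noisy score.
    return algorithm["skill"] * mdp["mean_reward"] + rng.gauss(0, 0.05)

def benchmark(algorithms, n_mdps=100, seed=0):
    rng = random.Random(seed)
    test_set = [sample_mdp(rng) for _ in range(n_mdps)]  # shared across agents
    results = {}
    for name, algo in algorithms.items():
        start = time.perf_counter()
        scores = [run_episode(algo, mdp, rng) for mdp in test_set]
        results[name] = (sum(scores) / n_mdps, time.perf_counter() - start)
    return results

algos = {"greedy": {"skill": 0.8}, "bayesian": {"skill": 0.9}}
for name, (score, secs) in benchmark(algos).items():
    print(f"{name}: mean score {score:.3f}, {secs:.4f}s")
```

Drawing the test MDPs once and reusing them for every algorithm removes sampling noise from the comparison; pairing each score with measured computation time is what makes non-anytime algorithms comparable.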
Mondrian Forests for Large-Scale Regression when Uncertainty Matters
Many real-world regression problems demand a measure of the uncertainty
associated with each prediction. Standard decision forests deliver efficient
state-of-the-art predictive performance, but high-quality uncertainty estimates
are lacking. Gaussian processes (GPs) deliver uncertainty estimates, but
scaling GPs to large-scale data sets comes at the cost of approximating the
uncertainty estimates. We extend Mondrian forests, first proposed by
Lakshminarayanan et al. (2014) for classification problems, to the large-scale
non-parametric regression setting. Using a novel hierarchical Gaussian prior
that dovetails with the Mondrian forest framework, we obtain principled
uncertainty estimates, while still retaining the computational advantages of
decision forests. Through a combination of illustrative examples, real-world
large-scale datasets, and Bayesian optimization benchmarks, we demonstrate that
Mondrian forests outperform approximate GPs on large-scale regression tasks and
deliver better-calibrated uncertainty assessments than decision-forest-based
methods.
Comment: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS) 2016, Cadiz, Spain. JMLR: W&CP volume 51
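A minimal sketch of the central idea: attach a conjugate Gaussian posterior to each leaf so that every tree, and hence the forest, reports a predictive mean and variance. Real Mondrian forests use Mondrian-process splits and a hierarchical prior that smooths along the path to the root; the random single-split "trees" below are a deliberate simplification for illustration.

```python
import numpy as np

PRIOR_VAR, NOISE_VAR = 1.0, 0.1

def leaf_posterior(y):
    # Conjugate update for a Gaussian mean with known noise variance.
    n = len(y)
    var = 1.0 / (1.0 / PRIOR_VAR + n / NOISE_VAR)
    mean = var * (np.sum(y) / NOISE_VAR)
    return mean, var

def fit_tree(X, y, rng):
    t = rng.uniform(X.min(), X.max())            # one random split point
    left = X <= t
    return t, leaf_posterior(y[left]), leaf_posterior(y[~left])

def predict(forest, x):
    means, variances = [], []
    for t, post_left, post_right in forest:
        m, v = post_left if x <= t else post_right
        means.append(m)
        variances.append(v + NOISE_VAR)          # predictive, not posterior
    mu = np.mean(means)
    # Variance of an equal-weight mixture of Gaussians across trees.
    var = np.mean(variances) + np.mean([(m - mu) ** 2 for m in means])
    return mu, var

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, 200)
y = np.sin(X) + rng.normal(0, NOISE_VAR ** 0.5, 200)
forest = [fit_tree(X, y, rng) for _ in range(25)]
print(predict(forest, 0.5))
```

Because each leaf carries a full posterior rather than a point estimate, the forest's predictive variance grows in regions with little data, which is the calibration property the abstract emphasises.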