5,230 research outputs found
Benchmarking for Bayesian Reinforcement Learning
In the Bayesian Reinforcement Learning (BRL) setting, agents try to maximise
the collected rewards while interacting with their environment while using some
prior knowledge that is accessed beforehand. Many BRL algorithms have already
been proposed, but even though a few toy examples exist in the literature,
there are still no extensive or rigorous benchmarks to compare them. The paper
addresses this problem, and provides a new BRL comparison methodology along
with the corresponding open source library. In this methodology, a comparison
criterion that measures the performance of algorithms on large sets of Markov
Decision Processes (MDPs) drawn from some probability distributions is defined.
In order to enable the comparison of non-anytime algorithms, our methodology
also includes a detailed analysis of the computation time requirement of each
algorithm. Our library is released with all source code and documentation: it
includes three test problems, each of which has two different prior
distributions, and seven state-of-the-art RL algorithms. Finally, our library
is illustrated by comparing all the available algorithms and the results are
discussed.Comment: 37 page
Recommended from our members
State-of-the-art on research and applications of machine learning in the building life cycle
Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied to buildings research for the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found using machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimate, load shape clustering, occupancy prediction, and learning occupant behaviors and energy use patterns. None of the existing studies were adopted broadly by the building industry, due to common challenges including (1) lack of large scale labeled data to train and validate the model, (2) lack of model transferability, which limits a model trained with one data-rich building to be used in another building with limited data, (3) lack of strong justification of costs and benefits of deploying machine learning, and (4) the performance might not be reliable and robust for the stated goals, as the method might work for some buildings but could not be generalized to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, as well as to inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science
Embodied Artificial Intelligence through Distributed Adaptive Control: An Integrated Framework
In this paper, we argue that the future of Artificial Intelligence research
resides in two keywords: integration and embodiment. We support this claim by
analyzing the recent advances of the field. Regarding integration, we note that
the most impactful recent contributions have been made possible through the
integration of recent Machine Learning methods (based in particular on Deep
Learning and Recurrent Neural Networks) with more traditional ones (e.g.
Monte-Carlo tree search, goal babbling exploration or addressable memory
systems). Regarding embodiment, we note that the traditional benchmark tasks
(e.g. visual classification or board games) are becoming obsolete as
state-of-the-art learning algorithms approach or even surpass human performance
in most of them, having recently encouraged the development of first-person 3D
game platforms embedding realistic physics. Building upon this analysis, we
first propose an embodied cognitive architecture integrating heterogenous
sub-fields of Artificial Intelligence into a unified framework. We demonstrate
the utility of our approach by showing how major contributions of the field can
be expressed within the proposed framework. We then claim that benchmarking
environments need to reproduce ecologically-valid conditions for bootstrapping
the acquisition of increasingly complex cognitive skills through the concept of
a cognitive arms race between embodied agents.Comment: Updated version of the paper accepted to the ICDL-Epirob 2017
conference (Lisbon, Portugal
VIME: Variational Information Maximizing Exploration
Scalable and effective exploration remains a key challenge in reinforcement
learning (RL). While there are methods with optimality guarantees in the
setting of discrete state and action spaces, these methods cannot be applied in
high-dimensional deep RL scenarios. As such, most contemporary RL relies on
simple heuristics such as epsilon-greedy exploration or adding Gaussian noise
to the controls. This paper introduces Variational Information Maximizing
Exploration (VIME), an exploration strategy based on maximization of
information gain about the agent's belief of environment dynamics. We propose a
practical implementation, using variational inference in Bayesian neural
networks which efficiently handles continuous state and action spaces. VIME
modifies the MDP reward function, and can be applied with several different
underlying RL algorithms. We demonstrate that VIME achieves significantly
better performance compared to heuristic exploration methods across a variety
of continuous control tasks and algorithms, including tasks with very sparse
rewards.Comment: Published in Advances in Neural Information Processing Systems 29
(NIPS), pages 1109-111
- …