30,223 research outputs found
A Benchmark Environment Motivated by Industrial Control Problems
In the research area of reinforcement learning (RL), frequently novel and
promising methods are developed and introduced to the RL community. However,
although many researchers are keen to apply their methods on real-world
problems, implementing such methods in real industry environments often is a
frustrating and tedious process. Generally, academic research groups have only
limited access to real industrial data and applications. For this reason, new
methods are usually developed, evaluated and compared by using artificial
software benchmarks. On one hand, these benchmarks are designed to provide
interpretable RL training scenarios and detailed insight into the learning
process of the method on hand. On the other hand, they usually do not share
much similarity with industrial real-world applications. For this reason we
used our industry experience to design a benchmark which bridges the gap
between freely available, documented, and motivated artificial benchmarks and
properties of real industrial problems. The resulting industrial benchmark (IB)
has been made publicly available to the RL community by publishing its Java and
Python code, including an OpenAI Gym wrapper, on Github. In this paper we
motivate and describe in detail the IB's dynamics and identify prototypic
experimental settings that capture common situations in real-world industry
control problems
A Benchmark Environment Motivated by Industrial Control Problems
In the research area of reinforcement learning (RL), frequently novel and
promising methods are developed and introduced to the RL community. However,
although many researchers are keen to apply their methods on real-world
problems, implementing such methods in real industry environments often is a
frustrating and tedious process. Generally, academic research groups have only
limited access to real industrial data and applications. For this reason, new
methods are usually developed, evaluated and compared by using artificial
software benchmarks. On one hand, these benchmarks are designed to provide
interpretable RL training scenarios and detailed insight into the learning
process of the method on hand. On the other hand, they usually do not share
much similarity with industrial real-world applications. For this reason we
used our industry experience to design a benchmark which bridges the gap
between freely available, documented, and motivated artificial benchmarks and
properties of real industrial problems. The resulting industrial benchmark (IB)
has been made publicly available to the RL community by publishing its Java and
Python code, including an OpenAI Gym wrapper, on Github. In this paper we
motivate and describe in detail the IB's dynamics and identify prototypic
experimental settings that capture common situations in real-world industry
control problems
Intrinsic Motivation Systems for Autonomous Mental Development
Exploratory activities seem to be intrinsically rewarding
for children and crucial for their cognitive development.
Can a machine be endowed with such an intrinsic motivation
system? This is the question we study in this paper, presenting a number of computational systems that try to capture this drive towards novel or curious situations. After discussing related research coming from developmental psychology, neuroscience, developmental robotics, and active learning, this paper presents the mechanism of Intelligent Adaptive Curiosity, an intrinsic motivation system which pushes a robot towards situations in which it maximizes its learning progress. This drive makes the robot focus on situations which are neither too predictable nor too unpredictable, thus permitting autonomous mental development.The complexity of the robot’s activities autonomously increases and complex developmental sequences self-organize without being constructed in a supervised manner. Two experiments are presented illustrating the stage-like organization emerging with this mechanism. In one of them, a physical robot is placed on a baby play mat with objects that it can learn to manipulate. Experimental results show that the robot first spends time in situations
which are easy to learn, then shifts its attention progressively to situations of increasing difficulty, avoiding situations in which nothing can be learned. Finally, these various results are discussed in relation to more complex forms of behavioral organization and data coming from developmental psychology.
Key words: Active learning, autonomy, behavior, complexity,
curiosity, development, developmental trajectory, epigenetic
robotics, intrinsic motivation, learning, reinforcement learning,
values
Recommended from our members
Towards Informed Exploration for Deep Reinforcement Learning
In this thesis, we discuss various techniques for improving exploration for deep reinforcement learning. We begin with a brief review of reinforcement learning (RL) and the fundamental v.s. exploitation trade-off. Then we review how deep RL has improved upon classical and summarize six categories of the latest exploration methods for deep RL, in the order increasing usage of prior information. We then explore representative works in three categories discuss their strengths and weaknesses. The first category, represented by Soft Q-learning, uses regularization to encourage exploration. The second category, represented by count-based via hashing, maps states to hash codes for counting and assigns higher exploration to less-encountered states. The third category utilizes hierarchy and is represented by modular architecture for RL agents to play StarCraft II. Finally, we conclude that exploration by prior knowledge is a promising research direction and suggest topics of potentially impact
- …