Batch Reinforcement Learning on the Industrial Benchmark: First
  Experiences

Hein, Daniel; Hentschel, Alexander; Runkler, Thomas A.; Sterzing, Volkmar; Tokic, Michel; Udluft, Steffen

Authors: Daniel Hein, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing, Michel Tokic, Steffen Udluft
Publication date: 27 July 2017
Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Abstract

The Particle Swarm Optimization Policy (PSO-P) was recently introduced and shown to produce remarkable results when interacting with academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further examine its properties and its feasibility for real-world applications, this paper investigates PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims to be realistic by including a variety of aspects found in industrial applications, such as continuous state and action spaces, a high-dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on the IB are compared with those of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). The experiments show that PSO-P is of interest not only for academic benchmarks but also for real-world industrial applications, since it yielded the best-performing policy in our IB setting. Compared with other well-established RL techniques, PSO-P produced outstanding results in performance and robustness, while requiring relatively little effort to find adequate parameters or to make complex design decisions.

Crossref

info:doi/10.1109%2Fijcnn.2017....
