Batch Reinforcement Learning on the Industrial Benchmark: First Experiences
The Particle Swarm Optimization Policy (PSO-P) was recently introduced
and shown to produce remarkable results on academic
reinforcement learning benchmarks in an off-policy, batch-based setting. To
further examine its properties and its feasibility for real-world applications,
this paper evaluates PSO-P on the so-called Industrial Benchmark (IB), a
novel reinforcement learning (RL) benchmark that aims at being realistic by
including a variety of aspects found in industrial applications, like
continuous state and action spaces, a high dimensional, partially observable
state space, delayed effects, and complex stochasticity. The experimental
results of PSO-P on IB are compared to results of closed-form control policies
derived from the model-based Recurrent Control Neural Network (RCNN) and the
model-free Neural Fitted Q-Iteration (NFQ). Experiments show that PSO-P is not
only of interest for academic benchmarks, but also for real-world industrial
applications, since it also yielded the best performing policy in our IB
setting. Compared to other well-established RL techniques, PSO-P produced
outstanding results in performance and robustness, requiring only relatively
little effort in finding adequate parameters or making complex design
decisions.
Propagation Networks for Model-Based Control Under Partial Observation
There has been an increasing interest in learning dynamics simulators for
model-based control. Compared with off-the-shelf physics engines, a learnable
simulator can quickly adapt to unseen objects, scenes, and tasks. However,
existing models like interaction networks only work for fully observable
systems; they also only consider pairwise interactions within a single time
step, both restricting their use in practical systems. We introduce Propagation
Networks (PropNet), a differentiable, learnable dynamics model that handles
partially observable scenarios and enables instantaneous propagation of signals
beyond pairwise interactions. Experiments show that our propagation networks
not only outperform current learnable physics engines in forward simulation,
but also achieve superior performance on various control tasks. Compared with
existing model-free deep reinforcement learning algorithms, model-based control
with propagation networks is more accurate, efficient, and generalizable to
new, partially observable scenes and tasks.
Comment: Accepted to ICRA 2019. Project Page: http://propnet.csail.mit.edu
Video: https://youtu.be/ZAxHXegkz4