62,784 research outputs found

    Batch Reinforcement Learning on the Industrial Benchmark: First Experiences

    Full text link
    The Particle Swarm Optimization Policy (PSO-P) has been recently introduced and proven to produce remarkable results on interacting with academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further investigate the properties and feasibility on real-world applications, this paper investigates PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims at being realistic by including a variety of aspects found in industrial applications, like continuous state and action spaces, a high dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on IB are compared to results of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). Experiments show that PSO-P is not only of interest for academic benchmarks, but also for real-world industrial applications, since it also yielded the best performing policy in our IB setting. Compared to other well established RL techniques, PSO-P produced outstanding results in performance and robustness, requiring only a relatively low amount of effort in finding adequate parameters or making complex design decisions

    Propagation Networks for Model-Based Control Under Partial Observation

    Full text link
    There has been an increasing interest in learning dynamics simulators for model-based control. Compared with off-the-shelf physics engines, a learnable simulator can quickly adapt to unseen objects, scenes, and tasks. However, existing models like interaction networks only work for fully observable systems; they also only consider pairwise interactions within a single time step, both restricting their use in practical systems. We introduce Propagation Networks (PropNet), a differentiable, learnable dynamics model that handles partially observable scenarios and enables instantaneous propagation of signals beyond pairwise interactions. Experiments show that our propagation networks not only outperform current learnable physics engines in forward simulation, but also achieve superior performance on various control tasks. Compared with existing model-free deep reinforcement learning algorithms, model-based control with propagation networks is more accurate, efficient, and generalizable to new, partially observable scenes and tasks.Comment: Accepted to ICRA 2019. Project Page: http://propnet.csail.mit.edu Video: https://youtu.be/ZAxHXegkz4
    • …
    corecore