4 research outputs found
Effective Multi-Agent Deep Reinforcement Learning Control with Relative Entropy Regularization
In this paper, a novel multi-agent reinforcement learning (MARL) approach,
Multi-Agent Continuous Dynamic Policy Gradient (MACDPP), is proposed to tackle
the issues of limited learning capability and sample efficiency in various
scenarios controlled by multiple agents. It alleviates the inconsistency of
multiple agents' policy updates by introducing relative entropy regularization
into the Centralized Training with Decentralized Execution (CTDE) framework
with an Actor-Critic (AC) structure. Evaluated on multi-agent cooperation and
competition tasks and traditional control tasks, including OpenAI benchmarks
and robot arm manipulation, MACDPP demonstrates significant superiority in
learning capability and sample efficiency over both related multi-agent and
widely implemented single-agent baselines, and therefore expands the potential
of MARL in effectively learning challenging control scenarios.
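The core idea of penalizing each agent's policy update with a relative entropy (KL divergence) term toward its previous policy can be illustrated with a minimal sketch. This is not the authors' implementation: the diagonal-Gaussian policies, the surrogate loss shape, and the coefficient `beta` are illustrative assumptions.

```python
import numpy as np

def gaussian_kl(mu_new, sig_new, mu_old, sig_old):
    """KL(new || old) between diagonal Gaussian policies,
    summed over action dimensions (closed form)."""
    return np.sum(
        np.log(sig_old / sig_new)
        + (sig_new**2 + (mu_new - mu_old)**2) / (2.0 * sig_old**2)
        - 0.5
    )

def regularized_actor_loss(advantage, logp, mu_new, sig_new,
                           mu_old, sig_old, beta=0.1):
    """Policy-gradient surrogate plus a relative-entropy penalty that keeps
    each agent's update close to its previous policy, damping inconsistent
    simultaneous updates. `beta` is a hypothetical trade-off coefficient."""
    return -(advantage * logp) + beta * gaussian_kl(
        mu_new, sig_new, mu_old, sig_old)
```

With `beta = 0` the loss reduces to the plain actor-critic surrogate; larger `beta` increasingly anchors each agent to its previous policy.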
Practical Probabilistic Model-based Deep Reinforcement Learning by Integrating Dropout Uncertainty and Trajectory Sampling
This paper addresses the prediction stability, prediction accuracy, and
control capability of current probabilistic model-based reinforcement
learning (MBRL) built on neural networks. A novel approach, dropout-based
probabilistic ensembles with trajectory sampling (DPETS), is proposed, in
which the system uncertainty is stably predicted by combining Monte-Carlo
dropout and trajectory sampling in one framework. Its loss function is
designed to correct the fitting error of neural networks for more accurate
prediction by probabilistic models. The state propagation in its policy is
extended to filter out aleatoric uncertainty for superior control capability.
Evaluated on several MuJoCo benchmark control tasks under additional
disturbances and one practical robot arm manipulation task, DPETS outperforms
related MBRL approaches in both average return and convergence speed while
achieving superior performance over well-known model-free baselines with
significantly better sample efficiency. The open-source code of DPETS is
available at https://github.com/mrjun123/DPETS.
Efficient Uncertainty Propagation in Model-Based Reinforcement Learning Unmanned Surface Vehicle Using Unscented Kalman Filter
This article tackles the computational burden of propagating uncertainties in the model predictive controller-based policy of a probabilistic model-based reinforcement learning (MBRL) system for an unmanned surface vehicle (USV). We propose filtered probabilistic model predictive control using the unscented Kalman filter (FPMPC-UKF), which introduces the unscented Kalman filter (UKF) for more efficient uncertainty propagation in MBRL. A USV control system based on FPMPC-UKF is developed and evaluated on position-keeping and target-reaching tasks in a simulation driven by real USV data. The experimental results demonstrate the significant superiority of the proposed method in balancing control performance and computational burden under different levels of disturbance compared with related USV works, and therefore indicate its potential in more challenging USV scenarios with limited computational resources.
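The efficiency gain comes from the unscented transform at the heart of the UKF: instead of sampling many particles, a Gaussian belief is propagated through the (nonlinear) dynamics using only 2n+1 deterministic sigma points. A minimal sketch of that transform, with standard scaling parameters (the dynamics function `f` here stands in for the learned USV model):

```python
import numpy as np

def unscented_transform(mu, P, f, alpha=1e-3, beta=2.0, kappa=0.0):
    """Propagate a Gaussian belief (mu, P) through a nonlinear function f
    using the standard 2n+1 sigma-point unscented transform."""
    n = mu.size
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * P)   # scaled matrix square root
    # Sigma points: the mean, plus/minus the columns of S.
    sigma = np.vstack([mu, mu + S.T, mu - S.T])      # (2n+1, n)
    # Weights for the mean and covariance recombination.
    wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    # Propagate each sigma point through the dynamics and recombine.
    Y = np.array([f(s) for s in sigma])
    mu_y = wm @ Y
    diff = Y - mu_y
    P_y = (wc[:, None] * diff).T @ diff
    return mu_y, P_y
```

Because only 2n+1 function evaluations are needed per step, this keeps the cost of uncertainty propagation inside the MPC rollout low, which is the property the article exploits on computationally limited USV platforms.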