Learning Unmanned Aerial Vehicle Control for Autonomous Target Following
While deep reinforcement learning (RL) methods have achieved unprecedented
successes in a range of challenging problems, their applicability has been
mainly limited to simulation or game domains due to the high sample complexity
of the trial-and-error learning process. However, real-world robotic
applications often need a data-efficient learning process with safety-critical
constraints. In this paper, we consider the challenging problem of learning
unmanned aerial vehicle (UAV) control for tracking a moving target. To acquire
a strategy that combines perception and control, we represent the policy by a
convolutional neural network. We develop a hierarchical approach that combines
a model-free policy gradient method with a conventional feedback
proportional-integral-derivative (PID) controller to enable stable learning
without catastrophic failure. The neural network is trained by a combination of
supervised learning from raw images and reinforcement learning from games of
self-play. We show that the proposed approach can efficiently learn a
target-following policy in a simulator, and that the learned behavior can be
successfully transferred to the DJI quadrotor platform for real-world UAV
control.
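The hierarchical idea above, a learned policy handling perception and target-level commands while a conventional PID loop keeps the vehicle stable, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the policy stand-in, the 1-D plant model, and all gains are assumptions chosen for clarity.

```python
class PID:
    """Discrete PID controller with a fixed time step."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def target_following_policy(uav_pos, target_pos, gain=0.5):
    """Stand-in for the learned CNN policy: command a velocity toward the target."""
    return gain * (target_pos - uav_pos)

# Simulate a 1-D UAV: the policy proposes a velocity setpoint, the PID
# inner loop tracks it, keeping learning/operation away from unstable regimes.
dt, pos, vel = 0.02, 0.0, 0.0
pid = PID(kp=2.0, ki=0.1, kd=0.05, dt=dt)
for _ in range(1000):
    vel_cmd = target_following_policy(pos, target_pos=5.0)
    accel = pid.step(vel_cmd, vel)   # inner loop tracks the velocity command
    vel += accel * dt
    pos += vel * dt
# pos should settle near the 5.0 m target
```

The division of labor is the point: the outer (learned) layer can be replaced or retrained without touching the stabilizing inner loop.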
Meta-Reinforcement Learning for Adaptive Control of Second Order Systems
Meta-learning is a branch of machine learning which aims to synthesize data
from a distribution of related tasks to efficiently solve new ones. In process
control, many systems have similar and well-understood dynamics, which suggests
it is feasible to create a generalizable controller through meta-learning. In
this work, we formulate a meta reinforcement learning (meta-RL) control
strategy that takes advantage of known, offline information for training, such
as a model structure. The meta-RL agent is trained over a distribution of model
parameters, rather than a single model, enabling the agent to automatically
adapt to changes in the process dynamics while maintaining performance. A key
design element is the ability to leverage model-based information offline
during training, while maintaining a model-free policy structure for
interacting with new environments. Our previous work has demonstrated how this
approach can be applied to the industrially-relevant problem of tuning
proportional-integral controllers to control first order processes. In this
work, we briefly reintroduce our methodology and demonstrate how it can be
extended to proportional-integral-derivative controllers and second order
systems. Comment: AdCONIP 2022. arXiv admin note: substantial text overlap with
arXiv:2203.0966
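The training setup described above can be sketched by drawing second-order model parameters from a distribution at the start of each episode, so the agent never sees a single fixed plant. The parameter ranges and the Euler discretization below are assumptions for illustration, not the paper's choices.

```python
import random

def sample_plant(rng):
    """Draw gain, natural frequency, and damping ratio for one episode."""
    K = rng.uniform(0.5, 2.0)
    wn = rng.uniform(0.5, 2.0)
    zeta = rng.uniform(0.3, 1.2)
    return K, wn, zeta

def step_response(K, wn, zeta, u=1.0, dt=0.01, steps=2000):
    """Euler simulation of the second-order system K*wn^2 / (s^2 + 2*zeta*wn*s + wn^2)."""
    y, ydot = 0.0, 0.0
    for _ in range(steps):
        yddot = K * wn**2 * u - 2 * zeta * wn * ydot - wn**2 * y
        ydot += yddot * dt
        y += ydot * dt
    return y

rng = random.Random(0)
plants = [sample_plant(rng) for _ in range(5)]
finals = [step_response(K, wn, zeta) for K, wn, zeta in plants]
# each response settles near its own DC gain K, so episodes differ in dynamics
```

Because every episode presents a different (K, wn, zeta), a policy that performs well in expectation must infer and adapt to the current dynamics rather than memorize one tuning.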
Reinforcement Learning for UAV Attitude Control
Autopilot systems are typically composed of an "inner loop" providing
stability and control, while an "outer loop" is responsible for mission-level
objectives, e.g. way-point navigation. Autopilot systems for UAVs are
predominantly implemented using Proportional-Integral-Derivative (PID) control
systems, which have demonstrated exceptional performance in stable
environments. However, more sophisticated control is required to operate in
unpredictable and harsh environments. Intelligent flight control systems are an
active area of research addressing the limitations of PID control, most
recently through the use of reinforcement learning (RL), which has had success
in other applications such as robotics. However, previous work has focused
primarily on
using RL at the mission-level controller. In this work, we investigate the
performance and accuracy of the inner control loop providing attitude control
when using intelligent flight control systems trained with the state-of-the-art
RL algorithms Deep Deterministic Policy Gradient (DDPG), Trust Region Policy
Optimization (TRPO) and Proximal Policy Optimization (PPO). To investigate
these unknowns we first developed an open-source high-fidelity simulation
environment to train a flight controller attitude control of a quadrotor
through RL. We then use our environment to compare their performance to that of
a PID controller to identify if using RL is appropriate in high-precision,
time-critical flight control. Comment: 13 pages, 9 figures
Inverter PQ Control With Trajectory Tracking Capability For Microgrids Based On Physics-informed Reinforcement Learning
The increasing penetration of inverter-based resources (IBRs) calls for an advanced active and reactive power (PQ) control strategy in microgrids. To enhance the controllability and flexibility of the IBRs, this paper proposes an adaptive PQ control method with trajectory tracking capability, combining model-based analysis, physics-informed reinforcement learning (RL), and power hardware-in-the-loop (HIL) experiments. First, model-based analysis proves that there exists an adaptive proportional-integral controller with time-varying gains that can ensure any exponential PQ output trajectory of IBRs. These gains consist of a constant factor and an exponentially decaying factor, which are then obtained using a model-free deep reinforcement learning approach known as the twin delayed deep deterministic policy gradient (TD3). With the model-based derivation, the learning space of the RL agent is narrowed from a function space to a real space, which significantly reduces the training complexity. Finally, the proposed method is verified through numerical simulation in MATLAB/Simulink and power HIL experiments at the CURENT center. With the physics-informed learning method, exponential response time constants can be freely assigned to IBRs, and they can follow any predefined trajectory without complicated gain tuning.
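The gain structure described above, each PI gain being a constant factor plus an exponentially decaying factor, is what shrinks the learning problem to a few scalars. A minimal sketch follows; the plant model (a first-order lag) and all numeric values are illustrative assumptions, not the paper's learned gains.

```python
import math

def adaptive_gain(t, k_const, k_decay, alpha):
    """Time-varying gain k(t) = k_const + k_decay * exp(-alpha * t)."""
    return k_const + k_decay * math.exp(-alpha * t)

class AdaptivePI:
    """PI controller whose gains follow the constant-plus-decay structure."""
    def __init__(self, kp_params, ki_params, dt):
        self.kp_params, self.ki_params = kp_params, ki_params
        self.dt, self.t, self.integral = dt, 0.0, 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        kp = adaptive_gain(self.t, *self.kp_params)
        ki = adaptive_gain(self.t, *self.ki_params)
        self.t += self.dt
        return kp * error + ki * self.integral

# Track a unit power setpoint on a first-order inverter output model
# with an assumed 50 ms time constant.
dt, p = 0.001, 0.0
pi = AdaptivePI(kp_params=(1.0, 4.0, 10.0), ki_params=(5.0, 0.0, 10.0), dt=dt)
for _ in range(5000):
    u = pi.step(setpoint=1.0, measurement=p)
    p += (u - p) * dt / 0.05
# p should settle at the 1.0 p.u. setpoint
```

An RL agent in this scheme only needs to output the scalars (k_const, k_decay, alpha) per gain, a real space, rather than an arbitrary gain schedule over time, a function space.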
Deep Reinforcement Learning Attitude Control of Fixed-Wing UAVs Using Proximal Policy Optimization
Contemporary autopilot systems for unmanned aerial vehicles (UAVs) are far
more limited in their flight envelope as compared to experienced human pilots,
thereby restricting the conditions UAVs can operate in and the types of
missions they can accomplish autonomously. This paper proposes a deep
reinforcement learning (DRL) controller to handle the nonlinear attitude
control problem, enabling extended flight envelopes for fixed-wing UAVs. A
proof-of-concept controller using the proximal policy optimization (PPO)
algorithm is developed, and is shown to be capable of stabilizing a fixed-wing
UAV from a large set of initial conditions to reference roll, pitch and
airspeed values. The training process is outlined and key factors for its
progression rate are considered, with the most important factor found to be
limiting the number of variables in the observation vector, and including
values for several previous time steps for these variables. The trained
reinforcement learning (RL) controller is compared to a
proportional-integral-derivative (PID) controller, and is found to converge in
more cases than the PID controller, with comparable performance. Furthermore,
the RL controller is shown to generalize well to unseen disturbances in the
form of wind and turbulence, even in severe disturbance conditions. Comment: 11
pages, 3 figures, 2019 International Conference on Unmanned Aircraft Systems
(ICUAS)
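The observation design singled out above, a small set of variables with values from several previous time steps included, can be sketched with a simple ring buffer. The variable choices (roll error, pitch error, airspeed) and the history length of 4 are illustrative assumptions.

```python
from collections import deque

class StackedObservation:
    """Keeps the last `history` raw observations and flattens them for the policy."""
    def __init__(self, n_vars, history=4):
        # Ring buffer initialized with zeros; old entries fall off automatically.
        self.buffer = deque([[0.0] * n_vars for _ in range(history)],
                            maxlen=history)

    def update(self, raw_obs):
        self.buffer.append(list(raw_obs))
        # Flatten oldest-to-newest into one vector fed to the policy network.
        return [x for obs in self.buffer for x in obs]

obs = StackedObservation(n_vars=3, history=4)
vec = obs.update([0.1, -0.2, 21.0])   # roll err, pitch err, airspeed
# vec has length 3 * 4 = 12; the newest observation occupies the last 3 slots
```

Stacking recent steps gives a feed-forward policy access to rates and trends without enlarging the set of raw state variables, which the abstract identifies as the key factor for training progression.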
Meta-Reinforcement Learning for the Tuning of PI Controllers: An Offline Approach
Meta-learning is a branch of machine learning which trains neural network
models to synthesize a wide variety of data in order to rapidly solve new
problems. In process control, many systems have similar and well-understood
dynamics, which suggests it is feasible to create a generalizable controller
through meta-learning. In this work, we formulate a meta reinforcement learning
(meta-RL) control strategy that can be used to tune proportional-integral
controllers. Our meta-RL agent has a recurrent structure that accumulates
"context" to learn a system's dynamics through a hidden state variable in
closed-loop. This architecture enables the agent to automatically adapt to
changes in the process dynamics. In tests reported here, the meta-RL agent was
trained entirely offline on first order plus time delay systems, and produced
excellent results on novel systems drawn from the same distribution of process
dynamics used for training. A key design element is the ability to leverage
model-based information offline during training in simulated environments while
maintaining a model-free policy structure for interacting with novel processes
where there is uncertainty regarding the true process dynamics. Meta-learning
is a promising approach for constructing sample-efficient intelligent
controllers. Comment: 23 pages; postprint
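The "context" mechanism described above, a recurrent hidden state accumulating closed-loop history, can be sketched with a scalar recurrent cell. Everything here is an illustrative assumption: in the paper the recurrent weights are learned offline, whereas below they are fixed by hand just to show the hidden state settling differently for different process dynamics.

```python
import math

def rnn_step(h, x, w_h=0.8, w_x=0.5):
    """One scalar recurrent update: h' = tanh(w_h * h + w_x * x)."""
    return math.tanh(w_h * h + w_x * x)

def final_context(process_gain, steps=50):
    """Run a closed loop on a first-order plant and return the final hidden state."""
    h, y = 0.0, 0.0
    for _ in range(steps):
        error = 1.0 - y            # fixed setpoint of 1.0
        h = rnn_step(h, error)     # context accumulates the error history
        u = h                      # context-dependent control action
        y += 0.1 * (process_gain * u - y)
    return h

# The same controller applied to two plants with different gains drives the
# hidden state to different values: the context encodes the process dynamics.
h_fast = final_context(process_gain=2.0)
h_slow = final_context(process_gain=0.5)
```

Because the hidden state is updated only from closed-loop signals, the policy stays model-free at deployment time even though model-based information shaped it during offline training.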