
    High fidelity progressive reinforcement learning for agile maneuvering UAVs

    In this work, we present a high-fidelity, model-based progressive reinforcement learning method for designing the control system of an agile maneuvering UAV. Our work relies on a simulation-based training and testing environment for software-in-the-loop (SIL), hardware-in-the-loop (HIL), and integrated flight testing within a photo-realistic virtual reality (VR) environment. Through progressive learning with high-fidelity agent and environment models, the guidance and control policies build agile maneuvering capability on top of fundamental control laws. First, we describe the development of high-fidelity mathematical models using frequency-domain system identification. These models are then used to design reinforcement-learning-based adaptive flight control laws that allow the vehicle to be controlled over a wide range of operating conditions, covering model changes such as payload, battery voltage, and damage to actuators and electronic speed controllers (ESCs). We then design the outer-loop flight guidance and control laws. This paper summarizes our current work and progress.
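    The abstract names two implementable ideas: frequency-domain system identification from flight data, and progressive (curriculum-style) training across changing operating conditions. The sketches below illustrate both under stated assumptions; every function, class, and parameter name is hypothetical and not the authors' code. The first sketch estimates an empirical frequency response with the standard H1 estimator, assuming a logged frequency-sweep input u and measured output y sampled at rate fs:

```python
import numpy as np
from scipy import signal

def frequency_response(u, y, fs, nperseg=1024):
    """H1 estimate H(jw) = S_uy(w) / S_uu(w) from a chirp flight record.

    Returns frequencies [Hz], magnitude [dB], and unwrapped phase [deg],
    i.e. the Bode data a transfer-function model would be fit to.
    """
    f, Puu = signal.welch(u, fs=fs, nperseg=nperseg)   # input auto-spectrum
    _, Puy = signal.csd(u, y, fs=fs, nperseg=nperseg)  # input/output cross-spectrum
    H = Puy / Puu
    return f, 20 * np.log10(np.abs(H)), np.degrees(np.unwrap(np.angle(H)))
```

    The second sketch shows one plausible shape for the progressive-learning loop: each stage perturbs the high-fidelity model (payload, battery voltage, actuator/ESC health) and warm-starts training from the previous stage's policy. `make_env` and `train_policy` stand in for the simulation environment and the RL trainer, whatever those are in practice:

```python
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    payload_kg: float
    battery_voltage: float
    actuator_health: float  # 1.0 = nominal, < 1.0 = damaged actuator/ESC

# Illustrative curriculum: nominal flight first, harder model changes later.
CURRICULUM = [
    Stage("nominal",       payload_kg=0.0, battery_voltage=16.8, actuator_health=1.0),
    Stage("heavy payload", payload_kg=0.5, battery_voltage=16.8, actuator_health=1.0),
    Stage("low voltage",   payload_kg=0.5, battery_voltage=14.0, actuator_health=1.0),
    Stage("damaged ESC",   payload_kg=0.5, battery_voltage=14.0, actuator_health=0.7),
]

def train_progressively(policy, make_env, train_policy, steps_per_stage=100_000):
    """Warm-start each stage from the previous policy so agile-maneuver
    skills are built on control laws learned under easier conditions."""
    for stage in CURRICULUM:
        env = make_env(stage)  # high-fidelity model with this stage's perturbations
        policy = train_policy(policy, env, steps_per_stage)
    return policy
```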

    Development of UCAV fleet autonomy by reinforcement learning in a wargame simulation environment

    In this study, we develop machine-learning-based fleet autonomy for Unmanned Combat Aerial Vehicles (UCAVs) utilizing a synthetic simulation-based wargame environment. Aircraft survivability is modeled as a Markov process. Mission success metrics are developed to capture collision avoidance and the survival probability of the fleet. Flight path planning is performed with proximal policy optimization (PPO)-based reinforcement learning to obtain attack patterns satisfying multi-objective mission success criteria corresponding to these metrics. The performance of the proposed system is evaluated with Monte Carlo analysis in which a wider initial position interval is used than the one defined in the training phase, providing preliminary insight into the generalization ability of the RL agent.
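    The described evaluation, rolling out the trained policy many times with initial positions drawn from an interval wider than the training one, is easy to express concretely. Below is a minimal sketch; the environment API (`make_env`, `reset`, `step`, an `info["mission_success"]` flag) and all names are assumptions for illustration, not the authors' interface:

```python
import random

def monte_carlo_success(policy, make_env, init_interval, n_runs=1000):
    """Estimate the fleet's mission-success rate when initial positions are
    sampled uniformly from `init_interval`, a wider range than in training."""
    successes = 0
    for _ in range(n_runs):
        env = make_env(init_pos=random.uniform(*init_interval))
        obs, done, info = env.reset(), False, {}
        while not done:
            obs, reward, done, info = env.step(policy(obs))
        successes += bool(info.get("mission_success", False))
    return successes / n_runs

# E.g. train on positions in [-1 km, 1 km], then probe generalization with:
# rate = monte_carlo_success(policy, make_env, init_interval=(-2000.0, 2000.0))
```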