
    High fidelity progressive reinforcement learning for agile maneuvering UAVs

    In this work, we present a high-fidelity, model-based progressive reinforcement learning method for designing the control system of an agile maneuvering UAV. Our work relies on a simulation-based training and testing environment for software-in-the-loop (SIL), hardware-in-the-loop (HIL), and integrated flight testing within a photo-realistic virtual reality (VR) environment. Through progressive learning with high-fidelity agent and environment models, the guidance and control policies build agile maneuvering capability on top of fundamental control laws. First, we describe the development of high-fidelity mathematical models using frequency-domain system identification. These models are then used to design reinforcement-learning-based adaptive flight control laws that allow the vehicle to be controlled over a wide range of operating conditions, covering model changes such as payload, battery voltage, and damage to actuators and electronic speed controllers (ESCs). We then design the outer-loop flight guidance and control laws. This paper summarizes our current work and progress.
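    The abstract names two implementable ideas: frequency-domain system identification from flight data, and progressive (curriculum-style) training across changing operating conditions. The sketches below illustrate both under stated assumptions; every function, class, and parameter name is hypothetical and not the authors' code. The first sketch estimates an empirical frequency response with the standard H1 estimator, assuming a logged frequency-sweep input u and measured output y sampled at rate fs:

```python
import numpy as np
from scipy import signal

def frequency_response(u, y, fs, nperseg=1024):
    """H1 estimate H(jw) = S_uy(w) / S_uu(w) from a chirp flight record.

    Returns frequencies [Hz], magnitude [dB], and unwrapped phase [deg],
    i.e. the Bode data a transfer-function model would be fit to.
    """
    f, Puu = signal.welch(u, fs=fs, nperseg=nperseg)   # input auto-spectrum
    _, Puy = signal.csd(u, y, fs=fs, nperseg=nperseg)  # input/output cross-spectrum
    H = Puy / Puu
    return f, 20 * np.log10(np.abs(H)), np.degrees(np.unwrap(np.angle(H)))
```

    The second sketch shows one plausible shape for the progressive-learning loop: each stage perturbs the high-fidelity model (payload, battery voltage, actuator/ESC health) and warm-starts training from the previous stage's policy. `make_env` and `train_policy` stand in for the simulation environment and the RL trainer, whatever those are in practice:

```python
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    payload_kg: float
    battery_voltage: float
    actuator_health: float  # 1.0 = nominal, < 1.0 = damaged actuator/ESC

# Illustrative curriculum: nominal flight first, harder model changes later.
CURRICULUM = [
    Stage("nominal",       payload_kg=0.0, battery_voltage=16.8, actuator_health=1.0),
    Stage("heavy payload", payload_kg=0.5, battery_voltage=16.8, actuator_health=1.0),
    Stage("low voltage",   payload_kg=0.5, battery_voltage=14.0, actuator_health=1.0),
    Stage("damaged ESC",   payload_kg=0.5, battery_voltage=14.0, actuator_health=0.7),
]

def train_progressively(policy, make_env, train_policy, steps_per_stage=100_000):
    """Warm-start each stage from the previous policy so agile-maneuver
    skills are built on control laws learned under easier conditions."""
    for stage in CURRICULUM:
        env = make_env(stage)  # high-fidelity model with this stage's perturbations
        policy = train_policy(policy, env, steps_per_stage)
    return policy
```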

    Development of UCAV fleet autonomy by reinforcement learning in a wargame simulation environment

    In this study, we develop machine-learning-based fleet autonomy for Unmanned Combat Aerial Vehicles (UCAVs) utilizing a synthetic simulation-based wargame environment. Aircraft survivability is modeled as a Markov process. Mission success metrics are developed to capture collision avoidance and the survival probability of the fleet. Flight path planning is performed with proximal policy optimization (PPO)-based reinforcement learning to obtain attack patterns satisfying multi-objective mission success criteria corresponding to these metrics. The performance of the proposed system is evaluated with Monte Carlo analysis in which a wider initial position interval is used than the one defined in the training phase, providing preliminary insight into the generalization ability of the RL agent.
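    The described evaluation, rolling out the trained policy many times with initial positions drawn from an interval wider than the training one, is easy to express concretely. Below is a minimal sketch; the environment API (`make_env`, `reset`, `step`, an `info["mission_success"]` flag) and all names are assumptions for illustration, not the authors' interface:

```python
import random

def monte_carlo_success(policy, make_env, init_interval, n_runs=1000):
    """Estimate the fleet's mission-success rate when initial positions are
    sampled uniformly from `init_interval`, a wider range than in training."""
    successes = 0
    for _ in range(n_runs):
        env = make_env(init_pos=random.uniform(*init_interval))
        obs, done, info = env.reset(), False, {}
        while not done:
            obs, reward, done, info = env.step(policy(obs))
        successes += bool(info.get("mission_success", False))
    return successes / n_runs

# E.g. train on positions in [-1 km, 1 km], then probe generalization with:
# rate = monte_carlo_success(policy, make_env, init_interval=(-2000.0, 2000.0))
```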