277 research outputs found
Hierarchical Multi-Agent Reinforcement Learning for Air Combat Maneuvering
The application of artificial intelligence to simulate air-to-air combat
scenarios is attracting increasing attention. To date, the high-dimensional
state and action spaces, the high complexity of situation information (such as
imperfect and filtered information, stochasticity, and incomplete knowledge about
mission targets), and the nonlinear flight dynamics pose significant challenges
for accurate air combat decision-making. These challenges are exacerbated when
multiple heterogeneous agents are involved. We propose a hierarchical
multi-agent reinforcement learning framework for air-to-air combat with
multiple heterogeneous agents. In our framework, the decision-making process is
divided into two stages of abstraction, where heterogeneous low-level policies
control the action of individual units, and a high-level commander policy
issues macro commands given the overall mission targets. Low-level policies are
trained for accurate unit combat control. Their training is organized in a
learning curriculum with increasingly complex training scenarios and
league-based self-play. The commander policy is trained on mission targets
given pre-trained low-level policies. Empirical validation supports the
advantages of our design choices. Comment: 22nd International Conference on Machine Learning and Applications
(ICMLA 23)
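The two-stage hierarchy described above can be sketched in code. This is a minimal illustration, not the paper's implementation: the macro-command set, the threat-based commander rule, and the unit control laws are all hypothetical placeholders for trained policy networks.

```python
import numpy as np

class CommanderPolicy:
    """High-level policy: maps the global state to a macro command.
    The macro-command vocabulary here is a hypothetical stand-in."""
    MACROS = ["engage", "support", "retreat"]

    def act(self, global_state):
        # Toy rule standing in for a trained network: threat level picks the macro.
        threat = float(np.clip(global_state["threat"], 0.0, 1.0))
        idx = min(int(threat * len(self.MACROS)), len(self.MACROS) - 1)
        return self.MACROS[idx]

class UnitPolicy:
    """Low-level policy: maps (local observation, macro command) to a unit action."""
    def __init__(self, unit_type):
        self.unit_type = unit_type  # heterogeneous units carry distinct policies

    def act(self, local_obs, macro):
        # Placeholder control law; a trained, curriculum-learned policy would go here.
        heading = local_obs["bearing_to_target"]
        if macro == "retreat":
            heading = -heading  # turn away from the target
        return {"heading": heading, "throttle": 1.0 if macro == "engage" else 0.6}

# One decision step of the two-level hierarchy
commander = CommanderPolicy()
units = [UnitPolicy("fighter"), UnitPolicy("interceptor")]
macro = commander.act({"threat": 0.1})
actions = [u.act({"bearing_to_target": 0.5}, macro) for u in units]
```

The key structural point is the information split: the commander sees only mission-level state and emits a macro command, while each heterogeneous unit policy conditions its control output on both its local observation and that command.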
TempFuser: Learning Tactical and Agile Flight Maneuvers in Aerial Dogfights using a Long Short-Term Temporal Fusion Transformer
In aerial combat, dogfighting poses intricate challenges that demand an
understanding of both strategic maneuvers and the aerodynamics of agile fighter
aircraft. In this paper, we introduce TempFuser, a novel long short-term
temporal fusion transformer designed to learn tactical and agile flight
maneuvers in aerial dogfights. Our approach employs two distinct LSTM-based
input embeddings to encode long-term sparse and short-term dense state
representations. By integrating these embeddings through a transformer encoder,
our model captures the tactics and agility of fighter jets, enabling it to
generate end-to-end flight commands that secure dominant positions and
outmaneuver the opponent. After extensive training against various types of
opponent aircraft in a high-fidelity flight simulator, our model successfully
learns to perform complex fighter maneuvers, consistently outperforming several
baseline models. Notably, our model exhibits human-like strategic maneuvers
even when facing adversaries with superior specifications, all without relying
on explicit prior knowledge. Moreover, it demonstrates robust pursuit
performance in challenging supersonic and low-altitude environments. Demo
videos are available at https://sites.google.com/view/tempfuser. Comment: 7 pages, 8 figures
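The dual-embedding fusion described in the abstract can be sketched at the level of tensor shapes. This is a shape sketch under stated assumptions, not TempFuser itself: a tanh RNN stands in for the LSTM embeddings, a single attention head stands in for the transformer encoder, and all dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_embed(seq, W):
    # Minimal tanh RNN standing in for the paper's LSTM-based input embeddings.
    h = np.zeros(W.shape[0])
    out = []
    for x in seq:
        h = np.tanh(W @ np.concatenate([h, x]))
        out.append(h)
    return np.stack(out)  # (T, d_model)

def self_attention(tokens):
    # Single-head scaled dot-product attention as a stand-in for a transformer encoder layer.
    d = tokens.shape[1]
    scores = tokens @ tokens.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ tokens

d_model, obs_dim = 8, 4
W = rng.standard_normal((d_model, d_model + obs_dim)) * 0.1

long_sparse = rng.standard_normal((20, obs_dim))[::4]  # long-term, sparsely sampled states
short_dense = rng.standard_normal((5, obs_dim))        # short-term, densely sampled states

# Embed both streams, concatenate the token sequences, and fuse with attention.
tokens = np.concatenate([rnn_embed(long_sparse, W), rnn_embed(short_dense, W)])
fused = self_attention(tokens)          # (10, d_model)
command = np.tanh(fused.mean(axis=0))   # pooled into an end-to-end flight command vector
```

The design choice the abstract emphasizes is the two time horizons: the long-term sparse stream carries strategic context while the short-term dense stream carries agile-maneuver dynamics, and the transformer encoder lets every token attend across both.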
Intelligent Aircraft Maneuvering Decision Based on CNN
© 2019 Association for Computing Machinery. Aiming at the maneuvering decision of aircraft in air combat, an intelligent maneuvering decision model based on a convolutional neural network (CNN) is proposed in this paper. Firstly, the situation data, maneuvering decision variables, and evaluation indexes are given, and a CNN model that can realize intelligent maneuvering decisions is established. Then, according to the evaluation indexes, the structure and parameters of the CNN model are adjusted through simulation experiments to improve the accuracy and robustness of the maneuvering decisions. After that, comparative experiments verify the validity of the proposed model, showing that the CNN can make stable maneuvering decisions with high accuracy. Finally, the flight path in an air combat process is presented.
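The pipeline this abstract describes, encoded situation data in, discrete maneuver out, can be illustrated with a tiny forward pass. This is a hypothetical sketch: the maneuver set, the single conv layer, and the random weights are illustrative, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical discrete maneuver vocabulary; the paper's decision variables may differ.
MANEUVERS = ["level", "climb", "dive", "left_turn", "right_turn"]

def conv2d(x, k):
    # Valid-mode 2-D cross-correlation over a single-channel situation map.
    H, W = x.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def decide(situation_map, kernel, head):
    # conv -> ReLU -> global average pool -> linear head -> argmax maneuver
    feat = np.maximum(conv2d(situation_map, kernel), 0.0)
    pooled = feat.mean()
    logits = head * pooled
    return MANEUVERS[int(np.argmax(logits))]

situation = rng.standard_normal((8, 8))  # encoded air-combat situation data
kernel = rng.standard_normal((3, 3))
head = rng.standard_normal(len(MANEUVERS))
choice = decide(situation, kernel, head)
```

The evaluation indexes mentioned in the abstract would then score such decisions over simulated engagements, driving the adjustment of the network structure and parameters.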
Communication and Control in Collaborative UAVs: Recent Advances and Future Trends
The recent progress in unmanned aerial vehicle (UAV) technology has
significantly advanced UAV-based applications for military, civil, and
commercial domains. Nevertheless, the challenges of establishing high-speed
communication links, flexible control strategies, and developing efficient
collaborative decision-making algorithms for a swarm of UAVs limit their
autonomy, robustness, and reliability. Thus, a growing focus has been witnessed
on collaborative communication to allow a swarm of UAVs to coordinate and
communicate autonomously for the cooperative completion of tasks in a short
time with improved efficiency and reliability. This work presents a
comprehensive review of collaborative communication in a multi-UAV system. We
thoroughly discuss the characteristics of intelligent UAVs and their
communication and control requirements for autonomous collaboration and
coordination. Moreover, we review various UAV collaboration tasks, summarize
the applications of UAV swarm networks for dense urban environments and present
the use case scenarios to highlight the current developments of UAV-based
applications in various domains. Finally, we identify several exciting future
research directions that need attention to advance research in
collaborative UAVs.
Single- and multiobjective reinforcement learning in dynamic adversarial games
This thesis uses reinforcement learning (RL) to address dynamic adversarial games in the context of air combat manoeuvring simulation. A sequential decision problem commonly encountered in the field of operations research, air combat manoeuvring simulation conventionally relied on agent programming methods that required significant domain knowledge to be manually encoded into the simulation environment. These methods are appropriate for determining the effectiveness of existing tactics in different simulated scenarios. However, in order to maximise the advantages provided by new technologies (such as autonomous aircraft), new tactics will need to be discovered. A proven technique for solving sequential decision problems, RL has the potential to discover these new tactics. This thesis explores four RL approaches (tabular, deep, discrete-to-deep, and multiobjective) as mechanisms for discovering new behaviours in simulations of air combat manoeuvring. It implements and tests several methods for each approach and compares those methods in terms of learning time, baseline and comparative performance, and implementation complexity. In addition to evaluating the utility of existing approaches to the specific task of air combat manoeuvring, this thesis proposes and investigates two novel methods, discrete-to-deep supervised policy learning (D2D-SPL) and discrete-to-deep supervised Q-value learning (D2D-SQL), which can be applied more generally. D2D-SPL and D2D-SQL offer the generalisability of deep RL at a cost closer to the tabular approach. Doctor of Philosophy
Intelligent Autonomous Decision-Making and Cooperative Control Technology of High-Speed Vehicle Swarms
This book is a reprint of the Special Issue “Intelligent Autonomous Decision-Making and Cooperative Control Technology of High-Speed Vehicle Swarms”, which was published in Applied Sciences.
DRL-RNP: deep reinforcement learning-based optimized RNP flight procedure execution.
The required navigation performance (RNP) procedure is one of the two basic navigation specifications for the performance-based navigation (PBN) procedure as proposed by the International Civil Aviation Organization (ICAO), which integrates global navigation infrastructures to improve the utilization efficiency of airspace and reduce flight delays and the dependence on ground navigation facilities. The approach stage is one of the most important and difficult stages of the whole flight. In this study, we propose a deep reinforcement learning (DRL)-based RNP procedure execution method, DRL-RNP. By conducting an RNP approach procedure, the DRL algorithm was implemented with a fixed-wing aircraft to explore a path of minimum fuel consumption, guided by a reward function, under windy conditions and in compliance with the RNP safety specifications. The experimental results demonstrate that the six-degree-of-freedom aircraft controlled by the DRL algorithm can successfully complete the RNP procedure whilst meeting the safety specifications for protection areas and obstruction clearance altitude throughout the whole procedure. In addition, the potential path with minimum fuel consumption can be explored effectively. Hence, the DRL method can be used not only to implement the RNP procedure with a simulated aircraft but also to help verify and evaluate the RNP procedure.
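The trade-off the abstract describes, minimizing fuel while respecting the protection area and obstruction clearance altitude, is typically expressed in the reward function. The sketch below is a hypothetical per-step reward, not the paper's actual formulation; the function name, penalty weights, and the 2x-RNP containment threshold are assumptions.

```python
def rnp_step_reward(fuel_burn, lateral_dev_nm, rnp_nm, altitude_ft, oca_ft):
    """Hypothetical per-step reward for DRL-RNP-style training: penalize fuel
    consumption, and hard-penalize leaving the protection area (assumed here
    as 2x the RNP value) or descending below the obstruction clearance
    altitude (OCA)."""
    reward = -fuel_burn                    # minimize fuel consumption
    if lateral_dev_nm > 2.0 * rnp_nm:      # outside the assumed protection area
        reward -= 100.0
    if altitude_ft < oca_ft:               # below obstruction clearance altitude
        reward -= 100.0
    return reward

# A compliant step is charged only its fuel burn; a violating step is heavily penalized.
ok = rnp_step_reward(fuel_burn=1.2, lateral_dev_nm=0.3, rnp_nm=0.3,
                     altitude_ft=3000, oca_ft=2500)
bad = rnp_step_reward(fuel_burn=1.0, lateral_dev_nm=1.0, rnp_nm=0.3,
                      altitude_ft=2000, oca_ft=2500)
```

With such shaping, the minimum-fuel path emerges from maximizing cumulative reward subject to the safety penalties staying inactive.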