5,190 research outputs found
Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games
Many artificial intelligence (AI) applications require multiple
intelligent agents to work in a collaborative effort. Efficient learning for
intra-agent communication and coordination is an indispensable step towards
general AI. In this paper, we take StarCraft combat game as a case study, where
the task is to coordinate multiple agents as a team to defeat their enemies. To
maintain a scalable yet effective communication protocol, we introduce a
Multiagent Bidirectionally-Coordinated Network (BiCNet ['bIknet]) with a
vectorised extension of actor-critic formulation. We show that BiCNet can
handle different types of combats with arbitrary numbers of AI agents for both
sides. Our analysis demonstrates that without any supervision such as human
demonstrations or labelled data, BiCNet could learn various types of advanced
coordination strategies that have been commonly used by experienced game
players. In our experiments, we evaluate our approach against multiple
baselines under different scenarios; it shows state-of-the-art performance and
possesses potential value for large-scale real-world applications.
Comment: 10 pages, 10 figures. Previously as title: "Multiagent
Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat
Games", Mar 201
Optimal pilot decisions and flight trajectories in air combat
The thesis concerns the analysis and synthesis of pilot decision-making and the design of optimal flight trajectories. In the synthesis framework, the methodology of influence diagrams is applied for modeling and simulating the maneuvering decision process of the pilot in one-on-one air combat. The influence diagram representations describing the maneuvering decision in a one-sided optimization setting and in a game setting are constructed. The synthesis of team decision-making in multiplayer air combat is tackled by formulating a decision-theoretical information prioritization approach based on a value function and interval analysis. It gives the team-optimal sequence of tactical data that is transmitted between cooperating air units for improving the situation awareness of the friendly pilots in the best possible way. In the optimal trajectory planning framework, an approach towards the interactive automated solution of deterministic aircraft trajectory optimization problems is presented. It offers design principles for trajectory optimization software that can be operated automatically by a nonexpert user. In addition, the representation of preferences and uncertainties in trajectory optimization is considered by developing a multistage influence diagram that describes a series of maneuvering decisions in a one-on-one air combat setting. This influence diagram representation as well as the synthesis elaborations provide seminal ways to treat uncertainties in air combat modeling. The work on influence diagrams can also be seen as the extension of the methodology to dynamically evolving decision situations involving possibly multiple actors with conflicting objectives. From the practical point of view, all the synthesis models can be utilized in decision-making systems of air combat simulators. The information prioritization approach can also be implemented in an onboard data link system. reviewe
Agent-based Modeling Methodology for Analyzing Weapons Systems
Getting as much information as possible to make decisions about acquisition of new weapons systems, through analysis of the weapons systems' benefits and costs, yields better decisions. This study has twin goals. The first is to demonstrate a sound methodology to yield the most information about benefits of a particular weapon system. The second is to provide some baseline analysis of the benefits of a new type of missile, the Small Advanced Capability Missile (SACM) concept, in an unclassified general sense that will help improve further, more detailed, classified investigations into the benefits of this missile. In a simplified, unclassified scenario, we show that the SACM provides several advantages and we demonstrate a basis for further investigation into which tactics should be used in conjunction with the SACM. Furthermore, we discuss how each of the chosen factors influences the air combat scenario. Ultimately, we establish the usefulness of a designed experimental approach to analysis of agent-based simulation models and how agent-based models yield a great amount of information about the complex interactions of different actors on the battlefield
Operations Research in the High Tech Military Environment: A Survey
The use of operations research as a technology to solve many of the problems of government and industry has become a major field of study within the very short span of the last fifty years. In the paper entitled, Operations Research in the High Tech Military Environment: A Survey, the reader is provided with a better understanding of the tenets of operations research through an examination of a representative sample of the latest operations research applications developed for the high tech environment.
Initially, this involves providing the reader with some fundamental insights into what operations research is, what its practitioners do, and how the state-of-the-art has evolved to its present form. It then involves providing a brief description of what is meant by the term, high tech military environment. A survey, which constitutes the bulk of the material presented, focuses on how various operations research methodologies are being used within that environment. The paper concludes with a discussion of the possible directions operations research will take in the future, based on the present state-of-the-art
SCALING REINFORCEMENT LEARNING THROUGH FEUDAL MULTI-AGENT HIERARCHY
Militaries conduct wargames for training, planning, and research purposes. Artificial intelligence (AI) can improve military wargaming by reducing costs, speeding up the decision-making process, and offering new insights. Previous researchers explored using reinforcement learning (RL) for wargaming based on the successful use of RL for other human competitive games. While previous research has demonstrated that an RL agent can generate combat behavior, those experiments have been limited to small-scale wargames. This thesis investigates the feasibility and acceptability of scaling hierarchical reinforcement learning (HRL) to support integrating AI into large military wargames. Additionally, this thesis investigates potential complications that arise when replacing the opposing force with an intelligent agent by exploring the ways in which an intelligent agent can cause a wargame to fail. The resources required to train a feudal multi-agent hierarchy (FMH) and a standard RL agent and their effectiveness are compared in increasingly complicated wargames. While FMH fails to demonstrate the performance required for large wargames, it offers insight for future HRL research. Finally, the Department of Defense verification, validation, and accreditation process is proposed as a method to ensure that any future AI application applied to wargames is suitable. Lieutenant Colonel, United States Army. Approved for public release. Distribution is unlimited
A Methodological Framework for Parametric Combat Analysis
This work presents a taxonomic structure for understanding the tension between certain factors of stability for game-theoretic outcomes such as Nash optimality, Pareto optimality, and balance optimality and then applies such game-theoretic concepts to the advancement of strategic thought on spacepower. This work successfully adapts and applies combat modeling theory to the evaluation of cislunar space conflict. This work provides evidence that the reliability characteristics of small spacecraft share similarities to the reliability characteristics of large spacecraft. Using these novel foundational concepts, this dissertation develops and presents a parametric methodological framework capable of analyzing the efficacy of heterogeneous force compositions in the context of space warfare. This framework is shown to be capable of predicting a stochastic distribution of numerical outcomes associated with various modes of conflict and parameter values. Furthermore, this work demonstrates a general alignment in results between the game-theoretic concepts of the framework and Media Interaction Warfare Theory in terms of evaluating force efficacy, providing strong evidence for the validity of the methodological framework presented in this dissertation
Single- and multiobjective reinforcement learning in dynamic adversarial games
This thesis uses reinforcement learning (RL) to address dynamic adversarial games in the context of air combat manoeuvring simulation. A sequential decision problem commonly encountered in the field of operations research, air combat manoeuvring simulation conventionally relied on agent programming methods that required significant domain knowledge to be manually encoded into the simulation environment. These methods are appropriate for determining the effectiveness of existing tactics in different simulated scenarios. However, in order to maximise the advantages provided by new technologies (such as autonomous aircraft), new tactics will need to be discovered. A proven technique for solving sequential decision problems, RL has the potential to discover these new tactics. This thesis explores four RL approaches (tabular, deep, discrete-to-deep and multiobjective) as mechanisms for discovering new behaviours in simulations of air combat manoeuvring. It implements and tests several methods for each approach and compares those methods in terms of the learning time, baseline and comparative performances, and implementation complexity. In addition to evaluating the utility of existing approaches to the specific task of air combat manoeuvring, this thesis proposes and investigates two novel methods, discrete-to-deep supervised policy learning (D2D-SPL) and discrete-to-deep supervised Q-value learning (D2D-SQL), which can be applied more generally. D2D-SPL and D2D-SQL offer the generalisability of deep RL at a cost closer to the tabular approach. Doctor of Philosoph
21st Century Simulation: Exploiting High Performance Computing and Data Analysis
This paper identifies, defines, and analyzes the limitations imposed on Modeling and Simulation by outmoded
paradigms in computer utilization and data analysis. The authors then discuss two emerging capabilities to
overcome these limitations: High Performance Parallel Computing and Advanced Data Analysis. First, parallel
computing, in supercomputers and Linux clusters, has proven effective by providing users an advantage in
computing power. This has been characterized as a ten-year lead over the use of single-processor computers.
Second, advanced data analysis techniques are both necessitated and enabled by this leap in computing power.
JFCOM's JESPP project is one of the few simulation initiatives to effectively embrace these concepts. The
challenges facing the defense analyst today have grown to include the need to consider operations among non-combatant
populations, to focus on impacts to civilian infrastructure, to differentiate combatants from non-combatants,
and to understand non-linear, asymmetric warfare. These requirements stretch both current
computational techniques and data analysis methodologies. In this paper, documented examples and potential
solutions will be advanced. The authors discuss the paths to successful implementation based on their experience.
Reviewed technologies include parallel computing, cluster computing, grid computing, data logging, OpsResearch,
database advances, data mining, evolutionary computing, genetic algorithms, and Monte Carlo sensitivity analyses.
The modeling and simulation community has significant potential to provide more opportunities for training and
analysis. Simulations must include increasingly sophisticated environments, better emulations of foes, and more
realistic civilian populations. Overcoming the implementation challenges will produce dramatically better insights,
for trainees and analysts. High Performance Parallel Computing and Advanced Data Analysis promise increased
understanding of future vulnerabilities to help avoid unneeded mission failures and unacceptable personnel losses.
The authors set forth road maps for rapid prototyping and adoption of advanced capabilities. They discuss the
beneficial impact of embracing these technologies, as well as risk mitigation required to ensure success