5,190 research outputs found

    Multiagent Bidirectionally-Coordinated Nets: Emergence of Human-level Coordination in Learning to Play StarCraft Combat Games

    Many artificial intelligence (AI) applications require multiple intelligent agents to work in a collaborative effort. Efficient learning for intra-agent communication and coordination is an indispensable step towards general AI. In this paper, we take the StarCraft combat game as a case study, where the task is to coordinate multiple agents as a team to defeat their enemies. To maintain a scalable yet effective communication protocol, we introduce a Multiagent Bidirectionally-Coordinated Network (BiCNet ['bIknet]) with a vectorised extension of the actor-critic formulation. We show that BiCNet can handle different types of combat with arbitrary numbers of AI agents for both sides. Our analysis demonstrates that without any supervision such as human demonstrations or labelled data, BiCNet could learn various types of advanced coordination strategies that have been commonly used by experienced game players. In our experiments, we evaluate our approach against multiple baselines under different scenarios; it shows state-of-the-art performance and has potential value for large-scale real-world applications. Comment: 10 pages, 10 figures. Previously as title: "Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games", Mar 201

    Optimal pilot decisions and flight trajectories in air combat

    The thesis concerns the analysis and synthesis of pilot decision-making and the design of optimal flight trajectories. In the synthesis framework, the methodology of influence diagrams is applied for modeling and simulating the maneuvering decision process of the pilot in one-on-one air combat. The influence diagram representations describing the maneuvering decision in a one-sided optimization setting and in a game setting are constructed. The synthesis of team decision-making in multiplayer air combat is tackled by formulating a decision-theoretical information prioritization approach based on a value function and interval analysis. It gives the team-optimal sequence of tactical data that is transmitted between cooperating air units for improving the situation awareness of the friendly pilots in the best possible way. In the optimal trajectory planning framework, an approach towards the interactive automated solution of deterministic aircraft trajectory optimization problems is presented. It offers design principles for trajectory optimization software that can be operated automatically by a nonexpert user. In addition, the representation of preferences and uncertainties in trajectory optimization is considered by developing a multistage influence diagram that describes a series of maneuvering decisions in a one-on-one air combat setting. This influence diagram representation as well as the synthesis elaborations provide seminal ways to treat uncertainties in air combat modeling. The work on influence diagrams can also be seen as the extension of the methodology to dynamically evolving decision situations involving possibly multiple actors with conflicting objectives. From the practical point of view, all the synthesis models can be utilized in decision-making systems of air combat simulators. The information prioritization approach can also be implemented in an onboard data link system.

    Agent-based Modeling Methodology for Analyzing Weapons Systems

    Getting as much information as possible to make decisions about acquisition of new weapons systems, through analysis of the weapons systems' benefits and costs, yields better decisions. This study has twin goals. The first is to demonstrate a sound methodology to yield the most information about benefits of a particular weapon system. The second is to provide some baseline analysis of the benefits of a new type of missile, the Small Advanced Capability Missile (SACM) concept, in an unclassified general sense that will help improve further, more detailed, classified investigations into the benefits of this missile. In a simplified, unclassified scenario, we show that the SACM provides several advantages and we demonstrate a basis for further investigation into which tactics should be used in conjunction with the SACM. Furthermore, we discuss how each of the chosen factors influences the air combat scenario. Ultimately, we establish the usefulness of a designed experimental approach to analysis of agent-based simulation models and how agent-based models yield a great amount of information about the complex interactions of different actors on the battlefield.

    Operations Research in the High Tech Military Environment: A Survey

    The use of operations research as a technology to solve many of the problems of government and industry has become a major field of study within the very short span of the last fifty years. In the paper entitled Operations Research in the High Tech Military Environment: A Survey, the reader is provided with a better understanding of the tenets of operations research through an examination of a representative sample of the latest operations research applications developed for the high tech environment. Initially, this involves providing the reader with some fundamental insights into what operations research is, what its practitioners do, and how the state-of-the-art has evolved to its present form. It then involves providing a brief description of what is meant by the term high tech military environment. A survey, which constitutes the bulk of the material presented, focuses on how various operations research methodologies are being used within that environment. The paper concludes with a discussion of the possible directions operations research will take in the future, based on the present state-of-the-art.

    SCALING REINFORCEMENT LEARNING THROUGH FEUDAL MULTI-AGENT HIERARCHY

    Militaries conduct wargames for training, planning, and research purposes. Artificial intelligence (AI) can improve military wargaming by reducing costs, speeding up the decision-making process, and offering new insights. Previous researchers explored using reinforcement learning (RL) for wargaming based on the successful use of RL for other human competitive games. While previous research has demonstrated that an RL agent can generate combat behavior, those experiments have been limited to small-scale wargames. This thesis investigates the feasibility and acceptability of scaling hierarchical reinforcement learning (HRL) to support integrating AI into large military wargames. Additionally, this thesis investigates potential complications that arise when replacing the opposing force with an intelligent agent by exploring the ways in which an intelligent agent can cause a wargame to fail. The resources required to train a feudal multi-agent hierarchy (FMH) and a standard RL agent and their effectiveness are compared in increasingly complicated wargames. While FMH fails to demonstrate the performance required for large wargames, it offers insight for future HRL research. Finally, the Department of Defense verification, validation, and accreditation process is proposed as a method to ensure that any future AI application applied to wargames is suitable. Lieutenant Colonel, United States Army. Approved for public release. Distribution is unlimited.

    A Methodological Framework for Parametric Combat Analysis

    This work presents a taxonomic structure for understanding the tension between certain factors of stability for game-theoretic outcomes such as Nash optimality, Pareto optimality, and balance optimality, and then applies such game-theoretic concepts to the advancement of strategic thought on spacepower. This work successfully adapts and applies combat modeling theory to the evaluation of cislunar space conflict. This work provides evidence that the reliability characteristics of small spacecraft share similarities with the reliability characteristics of large spacecraft. Using these novel foundational concepts, this dissertation develops and presents a parametric methodological framework capable of analyzing the efficacy of heterogeneous force compositions in the context of space warfare. This framework is shown to be capable of predicting a stochastic distribution of numerical outcomes associated with various modes of conflict and parameter values. Furthermore, this work demonstrates a general alignment in results between the game-theoretic concepts of the framework and Media Interaction Warfare Theory in terms of evaluating force efficacy, providing strong evidence for the validity of the methodological framework presented in this dissertation.

    Single- and multiobjective reinforcement learning in dynamic adversarial games

    This thesis uses reinforcement learning (RL) to address dynamic adversarial games in the context of air combat manoeuvring simulation. A sequential decision problem commonly encountered in the field of operations research, air combat manoeuvring simulation conventionally relied on agent programming methods that required significant domain knowledge to be manually encoded into the simulation environment. These methods are appropriate for determining the effectiveness of existing tactics in different simulated scenarios. However, in order to maximise the advantages provided by new technologies (such as autonomous aircraft), new tactics will need to be discovered. A proven technique for solving sequential decision problems, RL has the potential to discover these new tactics. This thesis explores four RL approaches (tabular, deep, discrete-to-deep and multiobjective) as mechanisms for discovering new behaviours in simulations of air combat manoeuvring. It implements and tests several methods for each approach and compares those methods in terms of the learning time, baseline and comparative performances, and implementation complexity. In addition to evaluating the utility of existing approaches to the specific task of air combat manoeuvring, this thesis proposes and investigates two novel methods, discrete-to-deep supervised policy learning (D2D-SPL) and discrete-to-deep supervised Q-value learning (D2D-SQL), which can be applied more generally. D2D-SPL and D2D-SQL offer the generalisability of deep RL at a cost closer to the tabular approach. Doctor of Philosoph

    21st Century Simulation: Exploiting High Performance Computing and Data Analysis

    This paper identifies, defines, and analyzes the limitations imposed on Modeling and Simulation by outmoded paradigms in computer utilization and data analysis. The authors then discuss two emerging capabilities to overcome these limitations: High Performance Parallel Computing and Advanced Data Analysis. First, parallel computing, in supercomputers and Linux clusters, has proven effective by providing users an advantage in computing power. This has been characterized as a ten-year lead over the use of single-processor computers. Second, advanced data analysis techniques are both necessitated and enabled by this leap in computing power. JFCOM's JESPP project is one of the few simulation initiatives to effectively embrace these concepts. The challenges facing the defense analyst today have grown to include the need to consider operations among non-combatant populations, to focus on impacts to civilian infrastructure, to differentiate combatants from non-combatants, and to understand non-linear, asymmetric warfare. These requirements stretch both current computational techniques and data analysis methodologies. In this paper, documented examples and potential solutions will be advanced. The authors discuss the paths to successful implementation based on their experience. Reviewed technologies include parallel computing, cluster computing, grid computing, data logging, OpsResearch, database advances, data mining, evolutionary computing, genetic algorithms, and Monte Carlo sensitivity analyses. The modeling and simulation community has significant potential to provide more opportunities for training and analysis. Simulations must include increasingly sophisticated environments, better emulations of foes, and more realistic civilian populations. Overcoming the implementation challenges will produce dramatically better insights for trainees and analysts. High Performance Parallel Computing and Advanced Data Analysis promise increased understanding of future vulnerabilities to help avoid unneeded mission failures and unacceptable personnel losses. The authors set forth road maps for rapid prototyping and adoption of advanced capabilities. They discuss the beneficial impact of embracing these technologies, as well as risk mitigation required to ensure success.