337 research outputs found

    Exploiting Opponent Modeling For Learning In Multi-agent Adversarial Games

    Get PDF
    An issue with learning effective policies in multi-agent adversarial games is that the size of the search space can be prohibitively large when the actions of both teammates and opponents are considered simultaneously. Opponent modeling, predicting an opponent’s actions in advance of execution, is one approach for selecting actions in adversarial settings, but it is often performed in an ad hoc way. In this dissertation, we introduce several methods for using opponent modeling, in the form of predictions about the players’ physical movements, to learn team policies. To explore the problem of decision-making in multi-agent adversarial scenarios, we use our approach for both offline play generation and real-time team response in the Rush 2008 American football simulator. Simultaneously predicting the movement trajectories, future reward, and play strategies of multiple players in real-time is a daunting task but we illustrate how it is possible to divide and conquer this problem with an assortment of data-driven models. By leveraging spatio-temporal traces of player movements, we learn discriminative models of defensive play for opponent modeling. With the reward information from previous play matchups, we use a modified version of UCT (Upper Conference Bounds applied to Trees) to create new offensive plays and to learn play repairs to counter predicted opponent actions. iii In team games, players must coordinate effectively to accomplish tasks while foiling their opponents either in a preplanned or emergent manner. An effective team policy must generate the necessary coordination, yet considering all possibilities for creating coordinating subgroups is computationally infeasible. Automatically identifying and preserving the coordination between key subgroups of teammates can make search more productive by pruning policies that disrupt these relationships. We demonstrate that combining opponent modeling with automatic subgroup identification can be used to create team policies with a higher average yardage than either the baseline game or domain-specific heuristics

    Multiagent Learning Through Indirect Encoding

    Get PDF
    Designing a system of multiple, heterogeneous agents that cooperate to achieve a common goal is a difficult task, but it is also a common real-world problem. Multiagent learning addresses this problem by training the team to cooperate through a learning algorithm. However, most traditional approaches treat multiagent learning as a combination of multiple single-agent learning problems. This perspective leads to many inefficiencies in learning such as the problem of reinvention, whereby fundamental skills and policies that all agents should possess must be rediscovered independently for each team member. For example, in soccer, all the players know how to pass and kick the ball, but a traditional algorithm has no way to share such vital information because it has no way to relate the policies of agents to each other. In this dissertation a new approach to multiagent learning that seeks to address these issues is presented. This approach, called multiagent HyperNEAT, represents teams as a pattern of policies rather than individual agents. The main idea is that an agent’s location within a canonical team layout (such as a soccer team at the start of a game) tends to dictate its role within that team, called the policy geometry. For example, as soccer positions move from goal to center they become more offensive and less defensive, a concept that is compactly represented as a pattern. iii The first major contribution of this dissertation is a new method for evolving neural network controllers called HyperNEAT, which forms the foundation of the second contribution and primary focus of this work, multiagent HyperNEAT. Multiagent learning in this dissertation is investigated in predator-prey, room-clearing, and patrol domains, providing a real-world context for the approach. Interestingly, because the teams in multiagent HyperNEAT are represented as patterns they can scale up to an infinite number of multiagent policies that can be sampled from the policy geometry as needed. Thus the third contribution is a method for teams trained with multiagent HyperNEAT to dynamically scale their size without further learning. Fourth, the capabilities to both learn and scale in multiagent HyperNEAT are compared to the traditional multiagent SARSA(λ) approach in a comprehensive study. The fifth contribution is a method for efficiently learning and encoding multiple policies for each agent on a team to facilitate learning in multi-task domains. Finally, because there is significant interest in practical applications of multiagent learning, multiagent HyperNEAT is tested in a real-world military patrolling application with actual Khepera III robots. The ultimate goal is to provide a new perspective on multiagent learning and to demonstrate the practical benefits of training heterogeneous, scalable multiagent teams through generative encoding

    Cooperative agent-based software architecture for distributed simulation

    Get PDF
    This paper proposes a cooperative multiagent model using distributed object-based systems for supporting distributed virtual environment and distributed simulation technologies for military and government applications. The agent model will use the condition-event driven rule based system as the basis for representing knowledge. In this model, the updates and revision of beliefs of agents corresponds to modifying the knowledge base. These agents are reactive and respond to stimulus as well as the environment in which they are embedded. Further, these agents are smart and can learn from their actions. The distributed agent-based software architecture will enable us to realise human behaviour model environment and computer-generated forces (also called computer-generated actor (CGA)) architectures. The design of the cooperative agent-based architecture will be based on mobile agents, interactive distributed computing models, and advanced logical modes of programming. This cooperative architecture will be developed using Java based tools and distributed databases

    Cooperative agent-based software architecture for distributed simulation

    Get PDF

    TESTING DECEPTION WITH A COMMERCIAL TOOL SIMULATING CYBERSPACE

    Get PDF
    Deception methods have been applied to the traditional domains of war (air, land, sea, and space). In the newest domain of cyber, deception can be studied to see how it can be best used. Cyberspace operations are an essential warfighting domain within the Department of Defense (DOD). Many training exercises and courses have been developed to aid leadership with planning and to execute cyberspace effects that support operations. However, only a few simulations train cyber operators about how to respond to cyberspace threats. This work tested a commercial product from Soar Technologies (Soar Tech) that simulates conflict in cyberspace. The Cyberspace Course of Action Tool (CCAT) is a decision-support tool that evaluates defensive deception in a wargame simulating a local-area network being attacked. Results showed that defensive deception methods of decoys and bait could be effective in cyberspace. This could help military cyber defenses since their digital infrastructure is threatened daily with cyberattacks.Marine Forces Cyberspace CommandChief Petty Officer, United States NavyChief Petty Officer, United States NavyApproved for public release. Distribution is unlimited

    Optimal Blends of History and Intelligence for Robust Antiterrorism Policy

    Get PDF
    Abstract Antiterrorism analysis requires that security agencies blend evidence on historical patterns of terrorist behavior with incomplete intelligence on terrorist adversaries to predict possible terrorist operations and devise appropriate countermeasures. We model interactions between reactive, adaptive and intelligent adversaries embedded in minimally sufficient organizational settings to study the optimal analytic mixture, expressed as historical memory reach-back and the number of anticipatory scenarios, that should be used to design antiterrorism policy. We show that history is a valuable source of information when the terrorist organization evolves and acquires new capabilities at such a rapid pace that makes optimal strategies advocated by game-theoretic reasoning unlikely to succeed
    • …
    corecore