337 research outputs found
Exploiting Opponent Modeling For Learning In Multi-agent Adversarial Games
An issue with learning effective policies in multi-agent adversarial games is that the size of the search space can be prohibitively large when the actions of both teammates and opponents are considered simultaneously. Opponent modeling, predicting an opponent’s actions in advance of execution, is one approach for selecting actions in adversarial settings, but it is often performed in an ad hoc way. In this dissertation, we introduce several methods for using opponent modeling, in the form of predictions about the players’ physical movements, to learn team policies. To explore the problem of decision-making in multi-agent adversarial scenarios, we use our approach for both offline play generation and real-time team response in the Rush 2008 American football simulator. Simultaneously predicting the movement trajectories, future reward, and play strategies of multiple players in real time is a daunting task, but we illustrate how it is possible to divide and conquer this problem with an assortment of data-driven models. By leveraging spatio-temporal traces of player movements, we learn discriminative models of defensive play for opponent modeling. With the reward information from previous play matchups, we use a modified version of UCT (Upper Confidence Bounds applied to Trees) to create new offensive plays and to learn play repairs to counter predicted opponent actions. In team games, players must coordinate effectively to accomplish tasks while foiling their opponents, either in a preplanned or emergent manner. An effective team policy must generate the necessary coordination, yet considering all possibilities for creating coordinating subgroups is computationally infeasible. Automatically identifying and preserving the coordination between key subgroups of teammates can make search more productive by pruning policies that disrupt these relationships.
We demonstrate that combining opponent modeling with automatic subgroup identification can be used to create team policies with a higher average yardage than either the baseline game or domain-specific heuristics.
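The UCT rule named in this abstract can be illustrated independently of the Rush 2008 setting. The following is a minimal sketch of standard UCT child selection, assuming a generic game tree with per-node visit counts and accumulated rewards; the `Node` class and exploration constant are illustrative, not the dissertation's modified variant:

```python
import math

class Node:
    """A generic search-tree node with the statistics UCT needs."""
    def __init__(self):
        self.visits = 0
        self.total_reward = 0.0
        self.children = []

def uct_select(node, c=math.sqrt(2)):
    """Pick the child maximizing the UCB1 score: mean reward plus an
    exploration bonus that shrinks as that child is visited more often."""
    def score(child):
        if child.visits == 0:
            return float("inf")  # always try unvisited children first
        exploit = child.total_reward / child.visits
        explore = c * math.sqrt(math.log(node.visits) / child.visits)
        return exploit + explore
    return max(node.children, key=score)
```

Applied to play generation, each child would correspond to a candidate play choice, with rewards drawn from simulated matchup outcomes.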
Multiagent Learning Through Indirect Encoding
Designing a system of multiple, heterogeneous agents that cooperate to achieve a common goal is a difficult task, but it is also a common real-world problem. Multiagent learning addresses this problem by training the team to cooperate through a learning algorithm. However, most traditional approaches treat multiagent learning as a combination of multiple single-agent learning problems. This perspective leads to many inefficiencies in learning, such as the problem of reinvention, whereby fundamental skills and policies that all agents should possess must be rediscovered independently for each team member. For example, in soccer, all the players know how to pass and kick the ball, but a traditional algorithm has no way to share such vital information because it has no way to relate the policies of agents to each other. In this dissertation, a new approach to multiagent learning that seeks to address these issues is presented. This approach, called multiagent HyperNEAT, represents teams as a pattern of policies rather than as individual agents. The main idea is that an agent’s location within a canonical team layout (such as a soccer team at the start of a game) tends to dictate its role within that team, a relationship called the policy geometry. For example, as soccer positions move from goal to center they become more offensive and less defensive, a concept that is compactly represented as a pattern. The first major contribution of this dissertation is a new method for evolving neural network controllers called HyperNEAT, which forms the foundation of the second contribution and primary focus of this work, multiagent HyperNEAT. Multiagent learning in this dissertation is investigated in predator-prey, room-clearing, and patrol domains, providing a real-world context for the approach.
Interestingly, because the teams in multiagent HyperNEAT are represented as patterns, they can scale up to an infinite number of multiagent policies that can be sampled from the policy geometry as needed. Thus the third contribution is a method for teams trained with multiagent HyperNEAT to dynamically scale their size without further learning. Fourth, the capabilities to both learn and scale in multiagent HyperNEAT are compared to the traditional multiagent SARSA(λ) approach in a comprehensive study. The fifth contribution is a method for efficiently learning and encoding multiple policies for each agent on a team to facilitate learning in multi-task domains. Finally, because there is significant interest in practical applications of multiagent learning, multiagent HyperNEAT is tested in a real-world military patrolling application with actual Khepera III robots. The ultimate goal is to provide a new perspective on multiagent learning and to demonstrate the practical benefits of training heterogeneous, scalable multiagent teams through generative encoding.
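The scaling property described above follows from indirect encoding: every agent's policy is queried from one generative pattern rather than stored per agent. A toy sketch of that idea, assuming a fixed smooth function in place of the evolved CPPN that real HyperNEAT would use (all names and the formula are illustrative):

```python
import math

def weight_pattern(agent_x, i, j):
    """Toy generative encoding: one smooth function yields every
    connection weight for every agent, indexed by the agent's slot in
    the canonical layout (agent_x in [0, 1]) and the weight's
    coordinates (i, j). Real HyperNEAT evolves this function."""
    return math.sin(3.0 * agent_x + i) * math.cos(j - agent_x)

def sample_team(team_size, n_in=2, n_out=2):
    """Sample one weight matrix per agent. Team size is free to vary
    without retraining because policies are queried from the pattern,
    not stored individually."""
    team = []
    for k in range(team_size):
        x = k / max(team_size - 1, 1)  # position in the canonical layout
        team.append([[weight_pattern(x, i, j) for j in range(n_out)]
                     for i in range(n_in)])
    return team
```

Note that agents occupying the same relative position (e.g. the first slot) receive the same policy regardless of team size, which is what lets a trained pattern be resampled at new sizes.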
Cooperative agent-based software architecture for distributed simulation
This paper proposes a cooperative multiagent model using distributed object-based systems for supporting distributed virtual environment and distributed simulation technologies for military and government applications. The agent model will use a condition-event driven rule-based system as the basis for representing knowledge. In this model, the update and revision of agents' beliefs correspond to modifying the knowledge base. These agents are reactive and respond to stimuli as well as to the environment in which they are embedded. Further, these agents are smart and can learn from their actions. The distributed agent-based software architecture will enable us to realise human behaviour modelling environments and computer-generated forces (also called computer-generated actor (CGA)) architectures. The design of the cooperative agent-based architecture will be based on mobile agents, interactive distributed computing models, and advanced logical modes of programming. This cooperative architecture will be developed using Java-based tools and distributed databases.
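The condition-event rule mechanism described here, where belief updates amount to modifying the knowledge base, can be sketched compactly. A minimal illustration (the class and rule format are assumptions for the sketch, not the paper's architecture, which targets Java-based tools):

```python
class RuleAgent:
    """Minimal condition-event rule agent: beliefs are a set of facts
    (the knowledge base), rules fire when their condition holds over
    the beliefs, and firing both emits an action and revises beliefs."""
    def __init__(self):
        self.beliefs = set()
        self.rules = []  # list of (condition, action, new_facts)

    def add_rule(self, condition, action, new_facts=()):
        self.rules.append((condition, action, set(new_facts)))

    def perceive(self, event):
        self.beliefs.add(event)  # belief update = modifying the KB

    def act(self):
        for condition, action, new_facts in self.rules:
            if condition(self.beliefs):
                self.beliefs |= new_facts  # belief revision on firing
                return action
        return None
```

The reactive quality the abstract describes falls out directly: the agent's next action is whatever rule the current stimulus-derived beliefs enable.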
Swarm intelligence for autonomous cooperative agents in battles for real-time strategy games
This paper investigates the use of the swarm intelligence of honey bees to create groups of cooperative AI agents for an RTS game, in order to create and re-enact battle simulations. The behaviour of the agents is based on the foraging and defensive behaviours of honey bees, adapted to a human environment. The groups consist of multiple model-based reflex agents with individual blackboards for working memory, plus a colony-level blackboard to mimic foraging patterns. An agent architecture and environment are proposed that allow for the creation of autonomous cooperative agents. The behaviour of the agents is then evaluated and their intelligence is tested using an adaptation of the Anytime Universal Intelligence Test.
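The two-level blackboard design described here can be sketched briefly. A minimal illustration of a reflex agent with a private working memory and a shared colony blackboard, loosely mimicking how a returning bee advertises a food source; all names and the recruit/scout logic are assumptions for the sketch, not the paper's architecture:

```python
class ForagerAgent:
    """Model-based reflex agent: a private blackboard holds working
    memory, and finds are posted to a shared colony-level blackboard
    so idle agents can be recruited to them."""
    def __init__(self, colony_board):
        self.board = {}             # individual working memory
        self.colony = colony_board  # shared colony-level blackboard

    def sense(self, percept):
        self.board.update(percept)

    def step(self):
        target = self.board.get("target")
        if target is not None:
            # advertise the find, as a bee's dance advertises a source
            self.colony.setdefault("targets", []).append(target)
            return ("forage", target)
        if self.colony.get("targets"):
            return ("recruit", self.colony["targets"][-1])
        return ("scout", None)
```

Because the colony blackboard is shared state, one agent's discovery changes the next decision of every other agent without direct messaging.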
Testing Deception with a Commercial Tool Simulating Cyberspace
Deception methods have been applied in the traditional domains of war (air, land, sea, and space); in the newest domain, cyber, deception can be studied to see how it can best be used. Cyberspace operations are an essential warfighting domain within the Department of Defense (DOD). Many training exercises and courses have been developed to aid leadership with planning and executing cyberspace effects that support operations. However, only a few simulations train cyber operators in how to respond to cyberspace threats. This work tested a commercial product from Soar Technologies (Soar Tech) that simulates conflict in cyberspace. The Cyberspace Course of Action Tool (CCAT) is a decision-support tool that evaluates defensive deception in a wargame simulating an attack on a local-area network. Results showed that the defensive deception methods of decoys and bait could be effective in cyberspace. This could help military cyber defenses, since their digital infrastructure is threatened daily with cyberattacks. Marine Forces Cyberspace Command. Chief Petty Officer, United States Navy. Approved for public release. Distribution is unlimited.
Optimal Blends of History and Intelligence for Robust Antiterrorism Policy
Antiterrorism analysis requires that security agencies blend evidence on historical patterns of terrorist behavior with incomplete intelligence on terrorist adversaries to predict possible terrorist operations and devise appropriate countermeasures. We model interactions between reactive, adaptive, and intelligent adversaries embedded in minimally sufficient organizational settings to study the optimal analytic mixture, expressed as historical memory reach-back and the number of anticipatory scenarios, that should be used to design antiterrorism policy. We show that history is a valuable source of information when the terrorist organization evolves and acquires new capabilities at such a rapid pace that optimal strategies advocated by game-theoretic reasoning are unlikely to succeed.
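The analytic mixture described in this abstract, historical memory reach-back blended with anticipatory scenarios, can be illustrated with a toy forecaster. The mixing weight, the frequency model, and all names below are assumptions for the sketch, not the paper's model:

```python
def blend_forecast(history, memory_reach, scenario_priors, w_history=0.5):
    """Blend two evidence sources: empirical frequencies over the last
    `memory_reach` observed attack types (the reach-back window), and
    prior probabilities supplied by anticipatory scenarios."""
    window = history[-memory_reach:] if memory_reach else []
    counts = {k: window.count(k) for k in scenario_priors}
    total = sum(counts.values()) or 1  # avoid divide-by-zero on empty window
    return {
        k: w_history * counts[k] / total + (1 - w_history) * scenario_priors[k]
        for k in scenario_priors
    }
```

Widening `memory_reach` corresponds to trusting history more broadly, while enlarging `scenario_priors` corresponds to investing in more anticipatory scenarios; the paper's question of the optimal mixture maps onto choosing these knobs.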