
    Multiagent Cooperative Learning Strategies for Pursuit-Evasion Games

    This study examines the pursuit-evasion problem of coordinating multiple robotic pursuers to locate and track a nonadversarial mobile evader in a dynamic environment. Two kinds of pursuit strategies are proposed: one for agents that cooperate with each other and one for agents that operate independently. The work further employs probability theory to analyze the uncertain state information about the pursuers and the evader, and uses case-based reasoning to equip agents with memory and learning abilities. Following the concepts of assimilation and accommodation, both positive-angle and bevel-angle strategies are developed to help agents adapt to their environment effectively. The case study uses the Recursive Porous Agent Simulation Toolkit (REPAST) to implement a multiagent system and demonstrates the superior performance of the proposed approaches in the pursuit-evasion game.
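
    The contrast between cooperative and independent pursuers lends itself to a small simulation. The sketch below is a minimal stand-in rather than the study's REPAST implementation: it replaces the probabilistic and case-based machinery with a plain set-based belief over evader locations, and the grid size, sensing radius, and evader drift are invented for illustration. It shows the core effect, namely that pursuers sharing one belief rule out candidate locations faster than pursuers reasoning alone.

        import random

        GRID = 10  # 10x10 world, a stand-in for the study's dynamic environment

        def move_toward(pos, target):
            """Step one cell toward the target (diagonal moves allowed)."""
            return tuple(p + (t > p) - (t < p) for p, t in zip(pos, target))

        def pursue(cooperative, steps=200):
            evader = (random.randrange(GRID), random.randrange(GRID))
            pursuers = [(0, 0), (GRID - 1, GRID - 1)]
            full = {(x, y) for x in range(GRID) for y in range(GRID)}
            # One shared belief for cooperative agents, a private copy each otherwise.
            beliefs = [set(full)] if cooperative else [set(full) for _ in pursuers]
            for t in range(steps):
                for i, p in enumerate(pursuers):
                    b = beliefs[0] if cooperative else beliefs[i]
                    in_range = {c for c in full
                                if abs(c[0] - p[0]) <= 2 and abs(c[1] - p[1]) <= 2}
                    if evader in in_range:
                        return t                  # evader located by direct sensing
                    b -= in_range                 # rule out the observed empty cells
                    if not b:                     # evader slipped away; reset belief
                        b |= full
                    target = min(b, key=lambda c: abs(c[0] - p[0]) + abs(c[1] - p[1]))
                    pursuers[i] = move_toward(p, target)
                # The nonadversarial evader drifts randomly.
                evader = tuple(min(GRID - 1, max(0, v + random.choice((-1, 0, 1))))
                               for v in evader)
            return steps

        random.seed(0)
        for mode in (True, False):
            mean = sum(pursue(mode) for _ in range(100)) / 100
            print("cooperative" if mode else "independent", "mean time to locate:", mean)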

    Model-Predictive Strategy Generation for Multi-Agent Pursuit-Evasion Games

    Multi-agent pursuit-evasion games can be used to model a variety of real-world problems, including surveillance, search-and-rescue, and defense-related scenarios. However, many pursuit-evasion problems are computationally difficult, which can be problematic for domains with complex geometry or large numbers of agents. To compound matters, practical applications often require planning methods to operate under high levels of uncertainty or to meet strict running-time requirements. These challenges strongly suggest that heuristic methods are needed to address pursuit-evasion problems in the real world. In this dissertation I present heuristic planning techniques for three related problem domains: visibility-based pursuit-evasion, target following with differential motion constraints, and distributed asset guarding with unmanned sea-surface vehicles. For these domains, I demonstrate that heuristic techniques based on problem relaxation and model-predictive simulation can be used to efficiently perform low-level control action selection, motion goal selection, and high-level task allocation. In particular, I introduce a polynomial-time algorithm for control action selection in visibility-based pursuit-evasion games, where a team of pursuers must minimize uncertainty about the location of an evader. The algorithm uses problem relaxation to estimate future states of the game. I also show how to incorporate into the algorithm a probabilistic opponent model learned from interaction traces of prior games. I verify experimentally that by performing Monte Carlo sampling over the learned model to estimate the location of the evader, the algorithm performs better than existing planning approaches based on worst-case analysis. Next, I introduce an algorithm for motion goal selection in pursuit-evasion scenarios with unmanned boats. I show how a probabilistic model accounting for differential motion constraints can be used to project the future positions of the target boat; motion goals for the pursuer boat can then be selected based on those projections. I verify experimentally that motion goals selected with this technique are better optimized for travel time and proximity to the target boat than motion goals selected based on the current position of the target boat. Finally, I introduce a task-allocation technique for a team of unmanned sea-surface vehicles (USVs) responsible for guarding a high-value asset. The team of USVs must intercept and block a set of hostile intruder boats before they reach the asset. The algorithm uses model-predictive simulation to estimate the value of high-level task assignments, which are then realized by a set of learned low-level behaviors. I show experimentally that using model-predictive simulations based on Monte Carlo sampling is more effective than hand-coded evaluation heuristics.
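
    The action-selection loop described here can be illustrated compactly. Below is a minimal sketch, not the dissertation's algorithm: the five-action grid world, the hand-written (position, weight) opponent model, and the expected-distance objective are invented stand-ins for the learned model and the uncertainty-minimization objective, but the Monte Carlo structure (sample evader states from the model, then score each candidate action against the samples) is the same.

        import random

        ACTIONS = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0), "stay": (0, 0)}

        def sample_evader(model, n=100):
            """Draw candidate evader positions from the opponent model; here the
            'model' is a list of (position, weight) pairs, standing in for a
            distribution fitted to interaction traces of prior games."""
            positions, weights = zip(*model)
            return random.choices(positions, weights=weights, k=n)

        def select_action(pursuer, model, n=100):
            """Pick the control action minimizing the expected distance to the
            evader, estimated by Monte Carlo sampling over the opponent model."""
            samples = sample_evader(model, n)

            def expected_dist(a):
                nx, ny = pursuer[0] + ACTIONS[a][0], pursuer[1] + ACTIONS[a][1]
                return sum(abs(nx - ex) + abs(ny - ey) for ex, ey in samples) / n

            return min(ACTIONS, key=expected_dist)

        random.seed(1)
        # Hypothetical learned model: the evader favours the north-east corridor.
        model = [((8, 9), 0.5), ((9, 8), 0.3), ((2, 1), 0.2)]
        print(select_action((5, 5), model))  # moves the pursuer toward the likely region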

    Robot Planning in Adversarial Environments Using Tree Search Techniques

    One of the main advantages of robots is that they can be used in environments that are dangerous for humans. Robots can be used not only for tasks in known, safe areas but also in environments that may contain adversaries. When planning the robot's actions in such scenarios, we have to consider the outcomes of the robot's actions given the actions taken by the adversary, as well as the information available to the robot and the adversary. The goal of this dissertation is to design planning strategies that improve the robot's performance in adversarial environments. Specifically, we study how the availability of information affects the planning process and the outcome. We also study how to improve computational efficiency by exploiting the structural properties of the underlying setting. We adopt a game-theoretic formulation and study two scenarios: adversarial active target tracking and reconnaissance in environments with adversaries. A conservative approach is to plan the robot's actions assuming a worst-case adversary with complete knowledge of the robot's state and objective. We start with such a "symmetric" information game for the adversarial target tracking scenario with noisy sensing. Using the properties of the Kalman filter, we design a pruning strategy that improves the efficiency of a tree search algorithm. We also investigate the performance limits of the asymmetric version, in which the adversary can inject false sensing data. We then study a reconnaissance scenario where the robot and the adversary have symmetric information, and design an algorithm that allows a robot to scan more area while avoiding detection by the adversary. The symmetric adversarial model may yield overly conservative plans when the adversary does not have the same information as the robot; furthermore, the information available to the adversary may change during execution. We therefore investigate the dynamic version of this asymmetric information game, in which the information available to the adversary changes during execution, and show how much the robot can exploit the asymmetry in information using tree search techniques. We devise a new algorithm for this asymmetric information game with theoretical performance guarantees and evaluate these approaches through experiments. We use qualitative examples to show how the new algorithm can outperform symmetric minimax, and quantitative experiments to measure how large the improvement is.
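
    The tree-search-with-pruning idea can be sketched in one dimension. The following is illustrative only: the dissertation's pruning exploits properties of the Kalman filter, whereas this sketch substitutes plain alpha-beta pruning, and the scalar filter, move set, and range-dependent measurement noise are invented. The robot (minimizing tracking uncertainty) and the adversarial target (maximizing it) alternate moves, with the leaf value being the final covariance.

        def kalman_update(P, r):
            """Scalar Kalman covariance update with process noise q = 1 and
            measurement noise r (H = 1): predict, then correct."""
            P = P + 1.0                    # predict
            return P - P * P / (P + r)     # correct

        MOVES = (-1, 0, 1)

        def minimax(robot, target, P, depth, alpha=-1e9, beta=1e9, robot_turn=True):
            """Minimax over robot and adversary moves with alpha-beta pruning,
            an illustrative stand-in for the Kalman-based pruning strategy."""
            if depth == 0:
                return P
            if robot_turn:
                best = 1e9
                for m in MOVES:
                    r_noise = 0.5 + abs((robot + m) - target)  # noise grows with range
                    v = minimax(robot + m, target, kalman_update(P, r_noise),
                                depth, alpha, beta, robot_turn=False)
                    best = min(best, v)
                    beta = min(beta, v)
                    if beta <= alpha:
                        break                                  # prune this subtree
                return best
            best = -1e9
            for m in MOVES:
                v = minimax(robot, target + m, P, depth - 1, alpha, beta,
                            robot_turn=True)
                best = max(best, v)
                alpha = max(alpha, v)
                if beta <= alpha:
                    break
            return best

        print(round(minimax(robot=0, target=3, P=5.0, depth=3), 3))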

    Evolving Effective Micro Behaviors for Real-Time Strategy Games

    Real-time strategy games have become a new frontier of artificial intelligence research. As with chess and checkers before, advances in real-time strategy game AI will significantly advance the state of the art in AI research. This thesis investigates the use of heuristic search algorithms to generate effective micro behaviors in combat scenarios for real-time strategy games. Macro and micro management are two key aspects of real-time strategy games: good macro helps a player collect more resources and build more units, while good micro helps a player win skirmishes against equal numbers of opponent units, or win even when outnumbered. In this research, we use influence maps and potential fields as a base representation to evolve micro behaviors. We first compare genetic algorithms against two types of hill climbers for generating competitive unit micro management. Second, we investigate the use of case-injected genetic algorithms to quickly and reliably generate high-quality micro behaviors. Third, we compactly encode micro behaviors, including influence maps, potential fields, and reactive control, into fourteen parameters and use genetic algorithms to search for a complete micro bot, ECSLBot. We compare the performance of ECSLBot with two state-of-the-art bots, UAlbertaBot and Nova, on several skirmish scenarios in the popular real-time strategy game StarCraft. The results show that the ECSLBot tuned by genetic algorithms outperforms UAlbertaBot and Nova in kiting efficiency, target selection, and fleeing. In addition, the same approach creates competitive micro behaviors in another game, SeaCraft. Using parallelized genetic algorithms to evolve parameters in SeaCraft, we are able to speed up the evolutionary process from twenty-one hours to nine minutes. We believe this work provides evidence that genetic algorithms and our representation are a viable approach to creating effective micro behaviors for winning skirmishes in real-time strategy games.
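
    The evolutionary search itself is standard and easy to sketch. The snippet below mirrors the fourteen-parameter encoding, but the fitness function is a placeholder: in the thesis it would be a StarCraft or SeaCraft skirmish score covering kiting, target selection, and fleeing, which cannot be reproduced here, so a made-up function stands in to keep the example runnable.

        import random

        N_PARAMS = 14          # matches the thesis's compact micro-behavior encoding

        def fitness(params):
            """Placeholder for a skirmish simulation score; a simple unimodal
            function is used here so the example runs without a game engine."""
            return -sum((p - 0.6) ** 2 for p in params)

        def evolve(pop_size=50, generations=40, mut_sigma=0.1):
            pop = [[random.random() for _ in range(N_PARAMS)] for _ in range(pop_size)]
            for _ in range(generations):
                pop.sort(key=fitness, reverse=True)
                parents = pop[: pop_size // 2]              # truncation selection
                children = []
                while len(children) < pop_size - len(parents):
                    a, b = random.sample(parents, 2)
                    cut = random.randrange(1, N_PARAMS)     # one-point crossover
                    child = a[:cut] + b[cut:]
                    child = [min(1.0, max(0.0, g + random.gauss(0, mut_sigma)))
                             for g in child]                # Gaussian mutation
                    children.append(child)
                pop = parents + children
            return max(pop, key=fitness)

        random.seed(2)
        best = evolve()
        print("best fitness:", round(fitness(best), 4))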

    Multi-agent persistent surveillance under temporal logic constraints

    This thesis proposes algorithms for deploying multiple autonomous agents on persistent surveillance missions requiring repeated, periodic visits to regions of interest. Such problems arise in a variety of domains, such as monitoring ocean conditions like temperature and algae content, providing crowd security during public events, tracking wildlife in remote or dangerous areas, or watching traffic patterns and road conditions. Using robots for surveillance is an attractive solution for scenarios in which fixed sensors are not sufficient to maintain situational awareness. Multi-agent solutions are particularly promising because they allow for improved spatial and temporal resolution of sensor information. In this work, we consider persistent monitoring by teams of agents that are tasked with satisfying missions specified using temporal logic (TL) formulas. Such formulas allow rich, complex tasks to be specified, such as "visit regions A and B infinitely often, and if region C is visited then go to region D, and always avoid obstacles." The agents must determine how to satisfy such missions subject to fuel, communication, and other constraints. These problems are inherently difficult due to the typically infinite horizon, the state-space explosion from planning for multiple agents, communication constraints, and other issues. Computing an optimal solution is therefore often infeasible, and a balance must be struck between computational complexity and optimality. This thesis describes solution methods for two main classes of multi-agent persistent surveillance problems. The first is the class of problems in which the persistent surveillance goals are captured entirely by TL constraints; such problems require agents to repeatedly visit a set of surveillance regions in order to satisfy their mission. We present results for agents solving such missions with charging constraints, with noisy observations, and in the presence of adversaries. The second class of problems includes an additional optimality criterion, such as minimizing uncertainty about the location of a target or maximizing sensor information among the team of agents. We present solution methods and results for such missions under a variety of optimality criteria based on information metrics. For both classes of problems, the proposed algorithms are implemented and evaluated via simulation, via experiments with robots in a motion capture environment, or both.
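
    The "infinitely often" surveillance obligation can be made concrete with a toy monitor. The sketch below is an invented single-agent illustration, far simpler than the thesis's multi-agent machinery: the formula GF A and GF B ("visit regions A and B infinitely often") is tracked by a small acceptance monitor that resets once both regions have been seen, and a greedy agent cycles between made-up region locations to keep discharging the obligation.

        REGIONS = {"A": (0, 0), "B": (8, 5)}

        def monitor():
            """Tracks the obligation of GF A and GF B: yields the regions still
            owed a visit, and resets the obligation once both have been seen,
            which produces the 'infinitely often' behaviour."""
            pending = set(REGIONS)
            while True:
                visited = yield pending
                pending.discard(visited)
                if not pending:
                    pending = set(REGIONS) - {visited}

        def step(pos, target):
            """Move one cell toward the target (diagonal moves allowed)."""
            return tuple(p + (t > p) - (t < p) for p, t in zip(pos, target))

        mon = monitor()
        pending = next(mon)
        pos, trace = (4, 4), []
        for _ in range(40):
            goal = min(pending,
                       key=lambda r: sum(abs(a - b) for a, b in zip(pos, REGIONS[r])))
            pos = step(pos, REGIONS[goal])
            if pos == REGIONS[goal]:
                trace.append(goal)
                pending = mon.send(goal)
        print(trace)  # ['B', 'A', 'B', 'A', 'B']: both regions are revisited forever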

    Affinity-Based Reinforcement Learning: A New Paradigm for Agent Interpretability

    The steady increase in the complexity of reinforcement learning (RL) algorithms is accompanied by a corresponding increase in opacity that obfuscates insights into their devised strategies. Methods in explainable artificial intelligence seek to mitigate this opacity by either creating transparent algorithms or extracting explanations post hoc. A third category exists that allows the developer to affect what agents learn: constrained RL has been used in safety-critical applications and prohibits agents from visiting certain states, while preference-based RL agents have been used in robotics applications and learn state-action preferences instead of traditional reward functions. We propose a new affinity-based RL paradigm in which agents learn strategies that are partially decoupled from reward functions. Unlike entropy regularisation, we regularise the objective function with a distinct action distribution that represents a desired behaviour; we encourage the agent to act according to a prior while learning to maximise rewards. The result is an inherently interpretable agent that solves problems with an intrinsic affinity for certain actions. We demonstrate the utility of our method in a financial application: we learn continuous time-variant compositions of prototypical policies, each interpretable by its action affinities, that are globally interpretable according to customers' financial personalities. Our method combines advantages from both constrained RL and preference-based RL: it retains the reward function but generalises the policy to match a defined behaviour, thus avoiding problems such as reward shaping and hacking. Unlike Boolean task composition, our method is a fuzzy superposition of different prototypical strategies that arrives at a more complex, yet interpretable, strategy.
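
    The regularised objective is easy to state concretely. As a minimal sketch, assuming a three-armed bandit with invented rewards, an invented affinity prior, and an invented penalty weight lam, the snippet below runs REINFORCE with an added KL(pi || prior) term, so the learned policy compromises between maximising reward and matching the desired behaviour; the paper itself applies the idea to continuous financial policies rather than this toy problem.

        import numpy as np

        rng = np.random.default_rng(0)
        mean_reward = np.array([1.0, 1.2, 0.2])    # arm 1 pays best
        affinity = np.array([0.7, 0.2, 0.1])       # desired behaviour favours arm 0
        lam = 0.5                                  # strength of the affinity term

        theta = np.zeros(3)                        # softmax policy logits
        for _ in range(5000):
            pi = np.exp(theta - theta.max()); pi /= pi.sum()
            a = rng.choice(3, p=pi)
            r = mean_reward[a] + rng.normal(0, 0.1)
            # REINFORCE gradient of E[r] minus lam * gradient of KL(pi || affinity),
            # both taken with respect to the softmax logits.
            grad_logp = -pi.copy(); grad_logp[a] += 1.0
            log_ratio = np.log(pi / affinity)
            kl_grad = pi * (log_ratio - (pi * log_ratio).sum())
            theta += 0.05 * (r * grad_logp - lam * kl_grad)

        pi = np.exp(theta - theta.max()); pi /= pi.sum()
        print("learned policy:", pi.round(2))  # compromise between reward and affinity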

    A Methodology for Technology-Tuned Decision Behavior Algorithms for Tactics Exploration

    In 2016, the USAF found that current development and acquisition methods may be inadequate to achieve air superiority in 2030. The airspace is expected to be highly contested by 2030 due to the Anti-Access/Area Denial strategies being employed by adversaries, and capability gaps must be addressed in order to maintain air superiority. The USAF identified new development and acquisition paradigms as the number one non-material capability development area. The idea of a new development and acquisition paradigm is not new: such a shift occurred in the transition from threat-based acquisition during the Cold War to capability-based acquisition during the war on terror. Investigation into current US development and acquisition methods found several notional methodologies; Effectiveness-Based Design and Technology Identification, Evaluation, and Selection for Systems-of-Systems have been proposed as notional solutions. Both methodologies seek to evaluate the means (the technologies used to perform a mission) and the ways (the tactics used to complete a mission) of the technology design space. Proper evaluation of the ways would provide critical information to the decision-maker during technology selection. These findings suggest that a new paradigm focused on effectiveness-based acquisition is needed to improve current development and acquisition methods. To evaluate the ways design space, current methods must move away from a fixed or constrained mission model to one that is minimally defined and capable of exploring tactics for each unique technology. The proposed Technology-tuned Decision Behavior Algorithms for Tactics Exploration (Tech-DEBATE) methodology enables the exploration of the ways, or more formally, the mission action design space. The methodology enables further exploration of the technology design space by improving the quantification of mission effectiveness through deep reinforcement learning in a minimally defined mission environment. The resulting data are grounded in traceable tactical alternatives, which increases confidence in the measures of effectiveness for each technology-tactic alternative. The methodology thus enables more informed decisions for technology investment, reducing risk in the development and acquisition of new technologies; the reduction in risk in turn reduces the costs and development time associated with investment in new technologies. The Tech-DEBATE methodology provides a new approach to technology evaluation through its emphasis on quantifying mission effectiveness in a minimally defined mission to inform technology investment decisions.
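
    The core evaluation loop can be caricatured in a few lines. Everything below is invented for illustration (the corridor mission, the threat cell, the "speed" technology knob, and tabular Q-learning in place of deep RL), but the shape matches the proposal: for each technology alternative, learn tactics in a minimally defined mission, then report the learned policy's mean return as its measure of effectiveness.

        import random

        def run_mission(speed, policy=None, q=None, episodes=2000, eps=0.2):
            """Tabular Q-learning on a 10-cell corridor. 'speed' is the technology
            knob (cells covered by the fast action). Reaching cell 9 scores +10;
            stepping on the threat at cell 5 costs -5; every other step costs -1."""
            if q is None:
                q = {(s, a): 0.0 for s in range(10) for a in (1, speed)}
            total = 0.0
            for _ in range(episodes):
                s, ret = 0, 0.0
                while s < 9:
                    if policy is None and random.random() < eps:
                        a = random.choice((1, speed))       # explore while learning
                    else:
                        a = max((1, speed), key=lambda x: q[(s, x)])
                    s2 = min(9, s + a)
                    r = 10.0 if s2 == 9 else (-5.0 if s2 == 5 else -1.0)
                    if policy is None:                      # learning phase only
                        q[(s, a)] += 0.1 * (r + max(q[(s2, b)] for b in (1, speed))
                                            - q[(s, a)])
                    s, ret = s2, ret + r
                total += ret
            return q, total / episodes

        random.seed(3)
        for speed in (2, 3):        # two hypothetical technology alternatives
            q, _ = run_mission(speed)                               # learn tactics
            _, eff = run_mission(speed, policy="greedy", q=q, episodes=200)
            print("speed", speed, "-> mission effectiveness:", round(eff, 2))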