
    Dual-Mandate Patrols: Multi-Armed Bandits for Green Security

    Conservation efforts in green security domains to protect wildlife and forests are constrained by the limited availability of defenders (i.e., patrollers), who must patrol vast areas to protect against attackers (e.g., poachers or illegal loggers). Defenders must choose how much time to spend in each region of the protected area, balancing exploration of infrequently visited regions against exploitation of known hotspots. We formulate the problem as a stochastic multi-armed bandit, where each action represents a patrol strategy, enabling us to guarantee the rate of convergence of the patrolling policy. However, a naive bandit approach would compromise short-term performance for long-term optimality, resulting in animals poached and forests destroyed. To speed up convergence, we leverage smoothness in the reward function and decomposability of actions. We show a synergy between Lipschitz continuity and decomposition, as each aids the convergence of the other. In doing so, we bridge the gap between combinatorial and Lipschitz bandits, presenting a no-regret approach that tightens existing guarantees while optimizing for short-term performance. We demonstrate that our algorithm, LIZARD, improves performance on real-world poaching data from Cambodia. Comment: Published at AAAI 2021. 9 pages (paper and references), 3-page appendix. 6 figures and 1 table.
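    The explore/exploit tradeoff this abstract describes can be illustrated with plain UCB1 (a standard bandit algorithm, not the paper's LIZARD, which additionally exploits Lipschitz smoothness and action decomposition). The three arms and their Bernoulli detection probabilities below are hypothetical patrol regions, not data from the paper:

```python
import math
import random

def ucb1(reward_probs, horizon, seed=0):
    """UCB1 on Bernoulli arms; returns how often each arm was pulled."""
    rng = random.Random(seed)
    n = len(reward_probs)
    counts = [0] * n          # pulls per arm
    sums = [0.0] * n          # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= n:            # initialization: pull each arm once
            arm = t - 1
        else:                 # pull the arm with the highest UCB index
            arm = max(range(n),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts

# three hypothetical patrol regions; region 2 has the highest detection rate
counts = ucb1([0.2, 0.5, 0.8], horizon=2000)
```

    The abstract's point is visible in the pull counts: UCB1 concentrates on the best region only after spending rounds exploring the others, and it is exactly those exploratory rounds that LIZARD's Lipschitz and decomposition structure is meant to shorten.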

    On the Inducibility of Stackelberg Equilibrium for Security Games

    Strong Stackelberg equilibrium (SSE) is the standard solution concept of Stackelberg security games. As opposed to the weak Stackelberg equilibrium (WSE), the SSE assumes that the follower breaks ties in favor of the leader; this assumption is widely acknowledged and justified by the assertion that the defender can often induce the attacker to choose a preferred action by making an infinitesimal adjustment to her strategy. Unfortunately, in security games with resource assignment constraints, this assertion might not be valid: it is possible that the defender cannot induce the desired outcome. As a result, many results claimed in the literature may be overly optimistic. To remedy this, we first formally define the utility guarantee of a defender strategy and provide examples showing that the utility of the SSE can be higher than its utility guarantee. Second, inspired by the analysis of the leader's payoff by Von Stengel and Zamir (2004), we propose a solution concept called the inducible Stackelberg equilibrium (ISE), which attains the highest utility guarantee and always exists. Third, we characterize the conditions under which the ISE coincides with the SSE, and show that in the general case the SSE can be arbitrarily worse with respect to the utility guarantee. Moreover, introducing the ISE does not invalidate existing algorithmic results, as the problem of computing an ISE reduces in polynomial time to that of computing an SSE. We also provide an algorithmic implementation for computing the ISE, with which our experiments unveil the empirical advantage of the ISE over the SSE. Comment: The Thirty-Third AAAI Conference on Artificial Intelligence.
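    The SSE tie-breaking assumption the abstract questions can be made concrete with a brute-force sketch for a two-row leader. The grid search and the payoff matrices in the usage example are illustrative assumptions, not the paper's algorithm or instances:

```python
def stackelberg_sse(leader_payoff, follower_payoff, grid=1000):
    """Brute-force a strong Stackelberg equilibrium for a 2-row leader.

    leader_payoff[i][j], follower_payoff[i][j]: payoffs when the leader
    commits to row i and the follower responds with column j.  The
    follower best-responds, breaking ties in the leader's favor (SSE).
    """
    best = None
    cols = range(len(follower_payoff[0]))
    for k in range(grid + 1):
        p = k / grid                  # probability of leader row 0
        f = [p * follower_payoff[0][j] + (1 - p) * follower_payoff[1][j]
             for j in cols]
        top = max(f)                  # follower's best-response payoff
        brs = [j for j in cols if abs(f[j] - top) < 1e-12]
        u = max(p * leader_payoff[0][j] + (1 - p) * leader_payoff[1][j]
                for j in brs)         # ties broken in the leader's favor
        if best is None or u > best[0]:
            best = (u, p)
    return best

# at p = 0.5 the follower is indifferent between columns, and only the
# SSE tie-breaking in the leader's favor yields leader utility 3.5;
# any p slightly below 0.5 strictly attains less (3 + p)
value, p = stackelberg_sse([[2, 4], [1, 3]], [[1, 0], [0, 1]])
```

    This discontinuity at the tie is precisely what the ISE concept addresses: the SSE utility of 3.5 is not a guarantee if the follower cannot actually be induced to break the tie favorably.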

    AI for Social Impact: Learning and Planning in the Data-to-Deployment Pipeline

    Full text link
    With the maturing of AI and multiagent systems research, we have a tremendous opportunity to direct these advances towards addressing complex societal problems. In pursuit of this goal of AI for Social Impact, we as AI researchers must go beyond improvements in computational methodology; it is important to step out in the field to demonstrate social impact. To this end, we focus on the problems of public safety and security, wildlife conservation, and public health in low-resource communities, and present research advances in multiagent systems to address one key cross-cutting challenge: how to effectively deploy our limited intervention resources in these problem domains. We present case studies from our deployments around the world as well as lessons learned that we hope are of use to researchers who are interested in AI for Social Impact. In pushing this research agenda, we believe AI can indeed play an important role in fighting social injustice and improving society. Comment: To appear, AI Magazine.

    Allocating patrolling resources to effectively thwart intelligent attackers

    This thesis considers the allocation of patrolling resources deployed in an effort to thwart intelligent attackers, who commit malicious acts at unknown locations that take a specified length of time to complete. The thesis considers patrolling games that depend on three parameters: a graph, a game length, and an attack length. The graph models the locations and how they are connected, the game length corresponds to the time horizon in which the two players, known as the patroller and the attacker, act, and the attack length is the time it takes the attacker to complete a malicious act. The thesis defines patrolling games (as first seen in [16]) and explains their known properties and how such games are solved. While any patrolling game can be solved by a linear program (LP) when the number of locations or the game length is small, this becomes infeasible when either of these parameters is of moderate size. Therefore, strategies are often evaluated by knowing an opponent's best response, and with this, patroller and attacker strategies give lower and upper bounds on the optimal value. Moreover, when the bounds given by a pair of strategies are tight, those strategies are optimal. The thesis states known strategies giving these bounds and the classes for which patrolling games have been solved. Firstly, this thesis introduces new techniques that can be used to evaluate strategies by reducing the strategy space for best responses from an opponent. Extensions to known strategies are developed and their respective bounds are given using known results. In addition, we develop a patroller improvement program (PIP), which improves current patroller strategies by considering which locations are currently underperforming. Secondly, these general techniques and strategies are applied to find solutions to certain classes of patrolling games that had not previously been solved.
In particular, classes of the patrolling game are solved when the graph is multipartite or is an extension of a star graph. Thirdly, this thesis conjectures that a developed patroller strategy known as the random minimal full-node cycle is optimal for a large class of patrolling games when the graph is a tree. Intuitive reasoning behind the conjecture is given along with computational evidence showing that the conjecture holds when the number of locations in the graph is less than 9. Finally, this thesis looks at three extensions to the scenario modelled by the patrolling game. One extension models varying distances between locations rather than assuming locations are a unit distance apart. Another extension allows the time needed for an attacker to complete a malicious act to vary depending on the vulnerability of the location. For the final extension, to multiple players, we look at four variants depending on how multiple attackers succeed. In each extension we find some properties of the game and show that it is possible to relate these extensions to the classic patrolling game in order to find the value and optimal strategies for certain classes of such games.
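    The abstract's lower/upper-bound view of patrolling games can be sketched for a zero-sum matrix game. The code below uses fictitious play to sandwich the game value between bounds certified by the two players' empirical strategies (a standard method, used here in place of the LP the thesis describes); the one-period, three-location game in the usage example is a deliberately degenerate illustration, not an instance from the thesis:

```python
def fictitious_play(payoff, iters=20000):
    """Bound the value of a zero-sum game (row player maximizes) via
    fictitious play: each side repeatedly best-responds to the
    opponent's empirical strategy so far."""
    m, n = len(payoff), len(payoff[0])
    row_counts, col_counts = [0] * m, [0] * n
    r, c = 0, 0
    for _ in range(iters):
        row_counts[r] += 1
        col_counts[c] += 1
        r = max(range(m), key=lambda i:
                sum(payoff[i][j] * col_counts[j] for j in range(n)))
        c = min(range(n), key=lambda j:
                sum(payoff[i][j] * row_counts[i] for i in range(m)))
    # each empirical mixture certifies a bound that sandwiches the value
    lower = min(sum(payoff[i][j] * row_counts[i] for i in range(m)) / iters
                for j in range(n))
    upper = max(sum(payoff[i][j] * col_counts[j] for j in range(n)) / iters
                for i in range(m))
    return lower, upper

# degenerate one-period patrolling game on 3 locations: the patroller
# catches the attacker only by guarding the attacked location (value 1/3)
lower, upper = fictitious_play(
    [[1 if i == j else 0 for j in range(3)] for i in range(3)])
```

    The `lower` and `upper` returned here play the same role as the patroller and attacker strategy bounds in the thesis: whenever they coincide, the certifying strategies are optimal.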

    Coevolutionary algorithms for the optimization of strategies for red teaming applications

    Red teaming (RT) is a process that assists an organization in finding vulnerabilities in a system, whereby the organization itself takes on the role of an "attacker" to test the system. It is used in various domains, including military operations. Traditionally, it is a manual process with some obvious weaknesses: it is expensive, time-consuming, and limited by humans "thinking inside the box". Automated RT is an approach with the potential to overcome these weaknesses. In this approach, both the red team (enemy forces) and the blue team (friendly forces) are modelled as intelligent agents in a multi-agent system, and the idea is to run many computer simulations pitting the plan of the red team against the plan of the blue team. This research project investigated techniques that can support automated red teaming by conducting a systematic study involving a genetic algorithm (GA), a basic coevolutionary algorithm, and three variants of the coevolutionary algorithm. An initial pilot study involving the GA showed some limitations, as GAs only support the optimization of a single population at a time against a fixed strategy. However, in red teaming it is not sufficient to consider just one, or even a few, opponent strategies, as in reality each team needs to adjust its strategy to account for the different strategies that competing teams may adopt at different points. Coevolutionary algorithms (CEAs) were identified as suitable algorithms capable of optimizing two teams simultaneously for red teaming. The subsequent investigation of CEAs examined their performance in addressing characteristics of red teaming problems, such as intransitive relationships and multimodality, before employing them to optimize two red teaming scenarios.
A number of measures were used to evaluate the performance of CEAs; for multimodality, this study introduced a novel n-peak problem and a new performance measure based on the Circular Earth Mover's Distance. Results from the investigations involving an intransitive number problem, a multimodal problem, and two red teaming scenarios showed that, in terms of the performance measures used, no single algorithm consistently outperforms the others across the four test problems. Applications of CEAs to the red teaming scenarios showed that all four variants produced interesting evolved strategies at the end of the optimization process, providing evidence of the potential of CEAs for future application in red teaming. The developed techniques can potentially be used for red teaming in military operations or in analysis for the protection of critical infrastructure. The benefits include the modelling of more realistic interactions between the teams, the ability to anticipate and counteract potentially new types of attacks, and a cost-effective solution.
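    A minimal sketch of the core idea behind the thesis's approach: two-population competitive coevolution, where each population's fitness is evaluated against the entire opposing population. The toy saddle-point payoff in the usage example stands in for a red-vs-blue simulation and is an assumption for illustration, not a scenario from the thesis:

```python
import random

def coevolve(payoff, generations=300, pop_size=20, seed=1):
    """Minimal two-population competitive coevolution on [0, 1].

    `payoff(x, y)` is what red (the maximizer) earns against blue (the
    minimizer).  Each generation, individuals are scored against the
    whole opposing population; the better half survives and spawns
    mutated offspring.
    """
    rng = random.Random(seed)
    red = [rng.random() for _ in range(pop_size)]
    blue = [rng.random() for _ in range(pop_size)]

    def evolve(pop, fitness):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]          # truncation selection
        children = [min(1.0, max(0.0, p + rng.gauss(0, 0.05)))
                    for p in parents]              # Gaussian mutation
        return parents + children

    for _ in range(generations):
        red = evolve(red, lambda x: sum(payoff(x, y) for y in blue))
        blue = evolve(blue, lambda y: -sum(payoff(x, y) for x in red))
    return red, blue

# toy saddle-point game: each side's best strategy is 0.5, so both
# populations should concentrate there
red, blue = coevolve(lambda x, y: -(x - 0.5) ** 2 + (y - 0.5) ** 2)
```

    On intransitive or multimodal payoffs, this basic scheme can cycle or lose gains between generations, which is exactly why the thesis compares several CEA variants rather than relying on the basic algorithm.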