Cooperative Evasion by Translating Targets with Variable Speeds
We consider a problem of cooperative evasion between a single pursuer and
multiple evaders in which the evaders are constrained to move in the positive Y
direction. The evaders are slower than the pursuer and can choose their speeds
from a bounded interval. The pursuer aims to intercept all evaders in a given
sequence by executing a Manhattan pursuit strategy of moving parallel to the X
axis, followed by moving parallel to the Y axis. The aim of the evaders is to
cooperatively pick their individual speeds so that the total time to intercept
all evaders is maximized. We first obtain conditions under which evaders should
cooperate in order to maximize the total time to intercept as opposed to each
moving greedily to optimize its own intercept time. Then, we propose and
analyze an algorithm that assigns evasive strategies to the evaders in two
iterations as opposed to performing an exponential search over the choice of
evader speeds. We also characterize a fundamental limit on the total time taken
by the pursuer to capture all evaders when the number of evaders is large.
Finally, we provide numerical comparisons against random sampling heuristics.
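The Manhattan pursuit strategy described above admits a closed-form intercept time for a single evader. The sketch below is illustrative only, assuming point agents, constant speeds, and capture by position coincidence; the function name and arguments are not from the paper.

```python
def manhattan_intercept_time(pursuer, evader, u, v):
    """Time for a pursuer with speed u to capture an evader moving only in
    the +Y direction at constant speed v < u, under the Manhattan strategy:
    close the X gap first, then chase along Y.

    A minimal sketch; point agents, constant speeds, capture = coincidence.
    """
    px, py = pursuer
    ex, ey = evader
    t_x = abs(ex - px) / u          # evader's x-coordinate never changes
    y_gap = (ey + v * t_x) - py     # evader drifts up during the X leg
    # Chase along Y: closing speed is u - v when the evader is ahead of the
    # pursuer in +Y, and u + v when the pursuer starts above the evader.
    t_y = y_gap / (u - v) if y_gap >= 0 else -y_gap / (u + v)
    return t_x + t_y
```

With this primitive, the total time for a capture sequence is just the sum of per-evader intercept times, each evaluated from the pursuer's position after the previous capture; the evaders' cooperative problem is to pick speeds in the bounded interval that maximize that sum.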
Min-Max Q-Learning for Multi-Player Pursuit-Evasion Games
In this paper, we address a pursuit-evasion game involving multiple players
by utilizing tools and techniques from reinforcement learning and matrix game
theory. In particular, we consider the problem of steering an evader to a goal
destination while avoiding capture by multiple pursuers, which is a
high-dimensional and computationally intractable problem in general. In our
proposed approach, we first formulate the multi-agent pursuit-evasion game as a
sequence of discrete matrix games. Next, in order to simplify the solution
process, we transform the high-dimensional state space into a low-dimensional
manifold and the continuous action space into a feature-based space, which is a
discrete abstraction of the original space. Based on these transformed state
and action spaces, we then employ min-max Q-learning to generate the
entries of the payoff matrix of the game and obtain the optimal
action for the evader at each stage. Finally, we present extensive numerical
simulations to evaluate the performance of the proposed learning-based evading
strategy in terms of the evader's ability to reach the desired target location
without being captured, as well as its computational efficiency.
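The core update in this kind of scheme can be sketched as follows. This is an illustrative simplification, not the paper's implementation: the full Minimax-Q algorithm solves a small linear program at each state to find the evader's optimal mixed strategy, whereas the sketch below uses the pure-strategy maximin value; all names and parameters are assumptions.

```python
import numpy as np

def minimax_value(Q_s):
    """Pure-strategy maximin value of the stage payoff matrix Q_s
    (rows: evader actions, columns: pursuer actions). Minimax-Q proper
    would solve an LP for a mixed strategy; this is a simplification."""
    return np.max(np.min(Q_s, axis=1))

def minimax_q_update(Q, s, a, o, reward, s_next, alpha=0.1, gamma=0.95):
    """One min-max Q-learning update for the evader on the abstract
    (low-dimensional) state space. Q has shape (states, a, o)."""
    target = reward + gamma * minimax_value(Q[s_next])
    Q[s, a, o] += alpha * (target - Q[s, a, o])
    return Q
```

Each entry Q[s, a, o] plays the role of one cell of the stage payoff matrix; once learning converges, the evader's action at a state is read off from the (mixed) maximin solution of that matrix.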