Cooperative Pursuit with Multi-Pursuer and One Faster Free-moving Evader
This paper addresses a multi-pursuer single-evader pursuit-evasion game where
the free-moving evader moves faster than the pursuers. Most existing works
impose constraints on the faster evader, such as a limited moving area or
restricted moving directions. When the faster evader is allowed to move freely without any
constraint, the main issues are how to form an encirclement to trap the evader
into the capture domain, how to balance between forming an encirclement and
approaching the faster evader, and what conditions make the capture possible.
In this paper, a distributed pursuit algorithm is proposed that enables the
pursuers to form an encirclement and approach the faster evader, together with
an algorithm that balances these two objectives.
Moreover, sufficient capture conditions are derived based on the
initial spatial distribution and the speed ratios of the pursuers and the
evader. Simulation and experimental results on ground robots validate the
effectiveness and practicability of the proposed method.
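The encircle-versus-approach trade-off described above can be illustrated with a minimal heuristic sketch. This is not the paper's algorithm: the blend weight `w` and the evenly spaced ring targets are assumptions made purely for illustration.

```python
import numpy as np

def pursuit_step(pursuers, evader, v_p, w=0.5):
    """One step of a toy encircle-and-approach heuristic (illustrative,
    not the paper's distributed algorithm).

    pursuers: (N, 2) array of pursuer positions
    evader:   (2,) evader position
    v_p:      pursuer speed per step
    w:        blend weight, 0 = pure encirclement, 1 = pure approach
    """
    n = len(pursuers)
    rel = pursuers - evader
    radius = np.linalg.norm(rel, axis=1)
    angles = np.arctan2(rel[:, 1], rel[:, 0])
    order = np.argsort(angles)
    # Assign each pursuer an evenly spaced bearing around the evader,
    # preserving their current angular order to avoid crossings.
    bearings = angles[order[0]] + 2 * np.pi * np.arange(n) / n
    new_pos = pursuers.astype(float).copy()
    for rank, i in enumerate(order):
        ring_point = evader + radius[i] * np.array(
            [np.cos(bearings[rank]), np.sin(bearings[rank])])
        # Blend the encirclement slot with a direct approach to the evader.
        goal = (1 - w) * ring_point + w * evader
        d = goal - pursuers[i]
        dist = np.linalg.norm(d)
        if dist > 1e-9:
            new_pos[i] = pursuers[i] + min(v_p, dist) * d / dist
    return new_pos
```

Setting `w` near 0 spreads the pursuers around the evader; setting it near 1 closes the distance, which is the balance the abstract refers to.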
The Barrier Surface in the Cooperative Football Differential Game
This paper considers the blocking or football pursuit-evasion differential
game. Two pursuers cooperate and try to capture the ball-carrying evader as far
as possible from the goal line. The evader wishes to be as close as possible to
the goal line at the time of capture and, if possible, reach the line. In this
paper the solution of the game of kind is provided: The Barrier surface that
partitions the state space into two winning sets, one for the pursuer team and
one for the evader, is constructed. Under optimal play, the winning team is
determined by evaluating the associated Barrier function.
Comment: 5 pages, 1 figure
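The game-of-kind decision can be illustrated with a toy surrogate. The sketch below is not the paper's Barrier construction: it assumes unit speeds and an evader that runs straight toward the goal line y = 0, and merely shows how the sign of a barrier-like value picks the winning team.

```python
import numpy as np

def barrier_surrogate(pursuers, evader, samples=200):
    """Illustrative surrogate for a game-of-kind test (not the paper's
    Barrier surface): with unit speeds and a straight-line evader path
    to the goal line y = 0, a value >= 0 means some pursuer can reach a
    point on that path no later than the evader does."""
    xe, ye = evader
    ts = np.linspace(0.0, ye, samples)            # evader reaches y = 0 at t = ye
    path = np.stack([np.full_like(ts, xe), ye - ts], axis=1)
    dists = np.linalg.norm(pursuers[:, None, :] - path[None, :, :], axis=2)
    return float(np.max(ts[None, :] - dists))

def winner(pursuers, evader):
    # Game of kind: the sign of the (surrogate) barrier value at the
    # initial state selects the winning team under this fixed strategy.
    return "pursuers" if barrier_surrogate(pursuers, evader) >= 0 else "evader"
```

The real Barrier surface partitions the full state space under optimal play by both sides; this surrogate only checks one fixed evader strategy, which is why it is a sketch rather than the game's solution.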
Capturing an Evader Using Multiple Pursuers with Sensing Limitations in Convex Environment
A modified continuous-time pursuit-evasion game with multiple pursuers and a single evader is studied. The game is played in an obstacle-free convex environment that contains an exit gate through which the evader may escape. The geometry of the environment is unknown to all players, except that the pursuers know the location of the exit gate and can communicate with each other. All players have equal maximum velocities and identical sensing ranges. The evader navigates inside the environment and seeks the exit gate to win the game. A novel sweep-pursuit-capture strategy is presented that enables the pursuers to search for and capture the evader under certain necessary and sufficient conditions, and three pursuers are shown to be sufficient to complete the operation. Non-holonomic wheeled mobile robots of the same configuration are used as the pursuers and the evader. Simulation studies demonstrate the performance of the proposed strategy in terms of interception time and the distance traveled by the players.
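The phase structure of a sweep-pursuit-capture strategy can be sketched as a simple state switch. The function below is illustrative only: the phase names match the abstract, but the thresholds and switching rule are assumptions, not the paper's conditions.

```python
import math

def pursuer_phase(pursuer, evader, sensing_range, capture_radius):
    """Hypothetical phase logic for a sweep-pursuit-capture strategy.

    sweep:   evader not yet detected; search the environment (in the
             paper's setting, while covering the known exit gate).
    pursuit: evader inside the sensing range but not yet caught.
    capture: evader within the capture radius.
    """
    if evader is None or math.dist(pursuer, evader) > sensing_range:
        return "sweep"
    if math.dist(pursuer, evader) <= capture_radius:
        return "capture"
    return "pursuit"
```

With equal maximum velocities, a lone pursuer cannot close the gap by speed alone, which is why the paper's result that three cooperating pursuers suffice is the interesting part; this sketch only shows the per-robot mode switching.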
Nowhere to Go: Benchmarking Multi-robot Collaboration in Target Trapping Environment
Collaboration is one of the most important factors in multi-robot systems.
Considering certain real-world applications and to further promote its
development, we propose a new benchmark to evaluate multi-robot collaboration
in the Target Trapping Environment (T2E). In T2E, two kinds of robots (called
captors and targets) share the same space. The captors aim to catch
the target collaboratively, while the target tries to escape from the trap.
Both the trapping and escaping process can use the environment layout to help
achieve the corresponding objective, which requires high collaboration between
robots and the utilization of the environment. For the benchmark, we present
and evaluate multiple learning-based baselines in T2E, and provide insights
into regimes of multi-robot collaboration. We also make our benchmark publicly
available and encourage researchers from related robotics disciplines to
propose, evaluate, and compare their solutions in this benchmark. Our project
is released at https://github.com/Dr-Xiaogaren/T2E
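A minimal, assumed version of a T2E-style termination check (not the benchmark's actual API; the function name and semantics are illustrative) might look like:

```python
import numpy as np

def episode_done(captors, target, capture_radius):
    """Toy termination check in the spirit of a target-trapping task:
    the captors win once any of them is within the capture radius of
    the target. The real benchmark also involves the environment
    layout, which this sketch ignores."""
    d = np.linalg.norm(np.asarray(captors, dtype=float)
                       - np.asarray(target, dtype=float), axis=1)
    return bool((d <= capture_radius).any())
```

In the actual benchmark the layout matters on both sides (captors can corner the target against walls, and the target can use passages to escape), so a faithful check would also reason about blocked escape directions rather than distance alone.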
DACOOP-A: Decentralized Adaptive Cooperative Pursuit via Attention
Integrating rule-based policies into reinforcement learning promises to
improve data efficiency and generalization in cooperative pursuit problems.
However, most implementations do not properly distinguish the influence of
neighboring robots in observation embedding or inter-robot interaction rules,
leading to information loss and inefficient cooperation. This paper proposes a
cooperative pursuit algorithm named Decentralized Adaptive COOperative Pursuit
via Attention (DACOOP-A) by empowering reinforcement learning with artificial
potential field and attention mechanisms. An attention-based framework is
developed to emphasize important neighbors by concurrently integrating the
learned attention scores into observation embedding and inter-robot interaction
rules. A KL divergence regularization is introduced to alleviate the resultant
learning stability issue. Improvements in data efficiency and generalization
are demonstrated through numerical simulations. Extensive quantitative analysis
and ablation studies are performed to illustrate the advantages of the proposed
modules. Real-world experiments are performed to justify the feasibility of
deploying DACOOP-A in physical systems.
Comment: 8 pages; this manuscript has been accepted by IEEE Robotics and Automation Letters
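The core idea — reusing one set of attention scores both to pool neighbor features into the observation embedding and to reweight the potential-field interaction terms — can be sketched as follows. The shapes and the dot-product scoring below are assumptions for illustration, not DACOOP-A's actual architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend_neighbors(query, neighbor_feats, apf_forces):
    """Sketch of attention shared between embedding and interaction.

    query:          (D,) learned query vector (assumed)
    neighbor_feats: (N, D) per-neighbor feature vectors
    apf_forces:     (N, 2) artificial-potential-field force from each neighbor
    """
    scores = softmax(neighbor_feats @ query)   # (N,) attention over neighbors
    obs_embedding = scores @ neighbor_feats    # attention-weighted feature pooling
    weighted_force = scores @ apf_forces       # the same scores reweight APF terms
    return obs_embedding, weighted_force, scores
```

Using a single score vector in both places is what couples the learned observation embedding to the rule-based interaction, which is the distinction the abstract draws against implementations that treat the two separately.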
Deep Reinforcement Learning for Swarm Systems
Recently, deep reinforcement learning (RL) methods have been applied
successfully to multi-agent scenarios. Typically, these methods rely on a
concatenation of agent states to represent the information content required for
decentralized decision making. However, concatenation scales poorly to swarm
systems with a large number of homogeneous agents as it does not exploit the
fundamental properties inherent to these systems: (i) the agents in the swarm
are interchangeable and (ii) the exact number of agents in the swarm is
irrelevant. Therefore, we propose a new state representation for deep
multi-agent RL based on mean embeddings of distributions. We treat the agents
as samples of a distribution and use the empirical mean embedding as input for
a decentralized policy. We define different feature spaces of the mean
embedding using histograms, radial basis functions and a neural network learned
end-to-end. We evaluate the representation on two well known problems from the
swarm literature (rendezvous and pursuit evasion), in a globally and locally
observable setup. For the local setup we furthermore introduce simple
communication protocols. Of all approaches, the mean embedding representation
using neural network features enables the richest information exchange between
neighboring agents facilitating the development of more complex collective
strategies.
Comment: 31 pages, 12 figures, version 3 (published in JMLR Volume 20)
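The mean-embedding representation with RBF features can be sketched as follows (the centers and bandwidth below are arbitrary illustrative choices). By construction the result is permutation-invariant and independent of the number of agents, which are exactly the two swarm properties the abstract highlights.

```python
import numpy as np

def rbf_mean_embedding(agent_states, centers, bandwidth=1.0):
    """Empirical mean embedding of agent states under RBF features.

    agent_states: (N, D) states of the observed agents
    centers:      (K, D) RBF centers defining the feature space (assumed)
    Returns a (K,) vector usable as input to a decentralized policy.
    """
    diffs = agent_states[:, None, :] - centers[None, :, :]          # (N, K, D)
    feats = np.exp(-np.sum(diffs**2, axis=2) / (2 * bandwidth**2))  # (N, K)
    # Treat agents as samples of a distribution: average their features.
    return feats.mean(axis=0)                                       # (K,)
```

Because the embedding is a mean over feature vectors, shuffling the agents or duplicating the whole swarm leaves it unchanged, unlike a concatenation of agent states.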