Understanding how local interactions give rise to global collective
behavior is of utmost importance in both biological and physical research.
Traditional agent-based models often rely on static rules that fail to capture
the dynamic strategies of the biological world. Reinforcement learning has been
proposed as a solution, but most previous methods adopt handcrafted reward
functions that implicitly or explicitly encourage the emergence of swarming
behaviors. In this study, we propose a minimal predator-prey coevolution
framework based on mixed cooperative-competitive multiagent reinforcement
learning, and adopt a reward function based solely on the fundamental
survival pressure: prey receive a reward of −1 if caught by
predators, while predators receive a reward of +1. Surprisingly, our analysis
of this approach reveals a rich diversity of emergent behaviors
for both prey and predators, including flocking and swirling behaviors for
prey, as well as dispersion tactics, confusion, and marginal predation
phenomena for predators. Overall, our study provides novel insights into the
collective behavior of organisms and highlights potential applications in
swarm robotics.
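
The survival-pressure reward described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `survival_rewards`, the agent identifiers, and the representation of catches as (predator, prey) pairs are all hypothetical, and crediting the +1 to the individual predator that makes the catch is an assumption the text does not settle.

```python
from typing import Dict, Iterable, Tuple


def survival_rewards(
    prey_ids: Iterable[str],
    predator_ids: Iterable[str],
    catches: Iterable[Tuple[str, str]],
) -> Dict[str, float]:
    """Per-step rewards derived from catch events only (no shaping terms).

    `catches` holds hypothetical (predator_id, prey_id) pairs for this step.
    Assumption: the predator that makes the catch is the one credited.
    """
    # Every agent starts the step with zero reward.
    rewards = {agent: 0.0 for agent in list(prey_ids) + list(predator_ids)}
    for predator, prey in catches:
        rewards[predator] += 1.0  # predator is rewarded for the catch
        rewards[prey] -= 1.0      # prey is penalized for being caught
    return rewards


if __name__ == "__main__":
    step_rewards = survival_rewards(
        prey_ids=["prey_0", "prey_1"],
        predator_ids=["pred_0"],
        catches=[("pred_0", "prey_1")],
    )
    print(step_rewards)
```

Nothing in this sketch rewards proximity to neighbors, alignment, or group cohesion; under these assumptions, any swarming would have to emerge from learning rather than from the reward signal itself.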