Multi-Agent Game Abstraction via Graph Attention Neural Network
In large-scale multi-agent systems, the large number of agents and their complex game relationships make policy learning difficult, so simplifying the learning process is an important research issue. In many multi-agent systems, interactions between agents happen locally: an agent need not coordinate with all other agents, nor with any of them all the time. Traditional methods attempt to capture the interaction relationships between agents with pre-defined rules. However, such methods cannot be applied directly in large-scale environments, because the complex interactions between agents are hard to transform into rules. In this paper, we model the relationships between agents as a complete graph and propose a novel game abstraction mechanism based on a two-stage attention network (G2ANet), which indicates both whether two agents interact and how important that interaction is. We integrate this detection mechanism into graph neural network-based multi-agent reinforcement learning to conduct game abstraction, and propose two novel learning algorithms, GA-Comm and GA-AC. We conduct experiments in Traffic Junction and Predator-Prey. The results indicate that the proposed methods simplify the learning process while achieving better asymptotic performance than state-of-the-art algorithms.

Comment: Accepted by AAAI202
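The two-stage idea, a hard gate that prunes edges of the complete agent graph followed by a soft weighting of the surviving edges, can be sketched as follows. This is a minimal illustration, not the paper's method: the thresholded dot-product gate stands in for G2ANet's learned hard attention, and all names and shapes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, d = 4, 8
h = rng.standard_normal((n_agents, d))  # toy agent embeddings

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Stage 1: hard attention -- decide whether an edge (i, j) exists at all.
# A simple sign threshold stands in for a learned, differentiable gate.
scores = h @ h.T
np.fill_diagonal(scores, -np.inf)        # no self-interaction
hard = (scores > 0).astype(float)        # 1 = keep the interaction, 0 = prune

# Stage 2: soft attention -- weight the surviving edges by importance.
weights = np.zeros_like(scores)
for i in range(n_agents):
    idx = np.flatnonzero(hard[i])
    if idx.size:
        weights[i, idx] = softmax(scores[i, idx])

# Aggregate neighbour features along the abstracted (sparse) graph.
agg = weights @ h
```

The payoff of the abstraction is visible in `weights`: pruned pairs contribute exactly zero, so each agent aggregates information only from the neighbours the hard stage retained.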
Model-based Dynamic Shielding for Safe and Efficient Multi-Agent Reinforcement Learning
Multi-Agent Reinforcement Learning (MARL) discovers policies that maximize reward but offers no safety guarantees during the learning and deployment phases. Although shielding with Linear Temporal Logic (LTL) is a promising formal method for ensuring safety in single-agent Reinforcement Learning (RL), it results in conservative behaviors when scaled to multi-agent scenarios, and it poses computational challenges for synthesizing shields in complex multi-agent environments. This work introduces Model-based Dynamic Shielding (MBDS) to support MARL algorithm design. Our algorithm synthesizes distributive shields, reactive systems that run in parallel with each MARL agent, to monitor and rectify unsafe behaviors. The shields can dynamically split, merge, and recompute based on the agents' states. This design enables efficient synthesis of shields that monitor agents in complex environments without coordination overhead. We also propose an algorithm that synthesizes shields without prior knowledge of the dynamics model: it obtains an approximate world model by interacting with the environment during the early stage of exploration, so MBDS enjoys formal safety guarantees with high probability. We demonstrate in simulations that our framework surpasses existing baselines in both safety guarantees and learning performance.

Comment: Accepted in AAMAS 202
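The core shielding loop, check a proposed action against a world model and substitute a verified-safe one before it reaches the environment, can be sketched as below. This is a toy illustration under strong assumptions (a known 1-D model with a single hazard cell and a fixed fallback order), not the paper's LTL-based synthesis; `GridModel` and `shielded_step` are hypothetical names.

```python
class GridModel:
    """Toy world model: an agent on a 1-D track with one hazard cell."""

    def __init__(self, hazard=3):
        self.hazard = hazard

    def next_state(self, pos, action):
        # actions are -1 (left), 0 (stay), +1 (right)
        return pos + action

    def is_safe(self, pos):
        return pos != self.hazard


def shielded_step(state, proposed, model, fallback_actions=(0, -1, 1)):
    """Return the proposed action if the model predicts it is safe,
    otherwise rectify it with the first safe fallback action."""
    if model.is_safe(model.next_state(state, proposed)):
        return proposed
    for a in fallback_actions:
        if model.is_safe(model.next_state(state, a)):
            return a
    raise RuntimeError("no safe action available from this state")


model = GridModel(hazard=3)
safe_move = shielded_step(1, +1, model)      # cell 2 is safe: pass through
rectified = shielded_step(2, +1, model)      # +1 would enter the hazard: rectified
```

In MBDS the model itself is learned from early exploration, so the safety check above would hold only with high probability rather than deterministically.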