10,245 research outputs found

    Robustness to Multi-Modal Environment Uncertainty in MARL using Curriculum Learning

    Multi-agent reinforcement learning (MARL) plays a pivotal role in tackling real-world challenges. However, the seamless transition of trained policies from simulation to the real world requires them to be robust to various environmental uncertainties. Existing works focus on finding a Nash equilibrium or the optimal policy under uncertainty in a single environment variable (i.e., action, state, or reward), because a multi-agent system is itself highly complex and non-stationary. In real-world situations, however, uncertainty can occur in multiple environment variables simultaneously. This work is the first to formulate the generalised problem of robustness to multi-modal environment uncertainty in MARL. To this end, we propose a general robust training approach for multi-modal uncertainty based on curriculum learning techniques. We handle two distinct environmental uncertainties simultaneously and present extensive results across both cooperative and competitive MARL environments, demonstrating that our approach achieves state-of-the-art levels of robustness.
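    The abstract does not include code, but the core idea of a curriculum over multiple uncertainty modes can be illustrated with a minimal sketch. The snippet below is a hypothetical example, not the authors' implementation: the class and function names (UncertaintyCurriculum, perturb), the choice of Gaussian noise on observations and rewards as the two modes, and the linear ramp schedule are all assumptions made for illustration.

```python
# Illustrative sketch (not the paper's code): a curriculum that gradually
# increases two uncertainty modes -- observation noise and reward noise --
# so agents train against mild perturbations first and harder ones later.
import numpy as np

class UncertaintyCurriculum:
    def __init__(self, n_stages=5, max_obs_noise=0.5, max_rew_noise=0.5):
        self.n_stages = n_stages
        self.max_obs_noise = max_obs_noise
        self.max_rew_noise = max_rew_noise

    def noise_levels(self, stage):
        # Linearly ramp both uncertainty modes with the curriculum stage.
        frac = stage / max(self.n_stages - 1, 1)
        return frac * self.max_obs_noise, frac * self.max_rew_noise

def perturb(obs, rew, obs_sigma, rew_sigma, rng):
    # Gaussian perturbations applied to each agent's observation and reward.
    noisy_obs = {a: o + rng.normal(0.0, obs_sigma, size=np.shape(o)) for a, o in obs.items()}
    noisy_rew = {a: r + rng.normal(0.0, rew_sigma) for a, r in rew.items()}
    return noisy_obs, noisy_rew

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    curriculum = UncertaintyCurriculum()
    obs = {"agent_0": np.zeros(4), "agent_1": np.zeros(4)}
    rew = {"agent_0": 1.0, "agent_1": 1.0}
    for stage in range(curriculum.n_stages):
        obs_sigma, rew_sigma = curriculum.noise_levels(stage)
        noisy_obs, noisy_rew = perturb(obs, rew, obs_sigma, rew_sigma, rng)
        # In a real training loop, the MARL update would run here on the
        # perturbed observations and rewards before advancing the stage.
        print(f"stage {stage}: obs_sigma={obs_sigma:.2f}, rew_sigma={rew_sigma:.2f}")
```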

    Decentralized Language Learning Through Acting

    This paper presents an algorithm for learning the meaning of messages communicated between agents that interact while acting optimally towards a cooperative goal. Our reinforcement-learning method is based on Bayesian filtering and has been adapted for a decentralized control process. Empirical results shed light on the complexity of the learning problem and on factors affecting the speed of convergence. Designing intelligent agents able to adapt their mutual interpretation of the messages they exchange, in order to improve overall task-oriented performance, introduces an essential cognitive capability that can take the current state of the art in multi-agent and human-machine systems to the next level. Learning to communicate while acting will add to the robustness and flexibility of these systems, and hence to more efficient and productive performance.
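    As an illustration of the Bayesian-filtering idea described above, the sketch below maintains a belief over candidate meanings of a message symbol and updates it from the sender's observed behaviour. The meaning space, message alphabet, and likelihood table are made-up placeholders, not the paper's model; the point is only the Bayes update over interpretations.

```python
# Illustrative sketch (assumed toy model, not the paper's algorithm): a
# Bayesian filter over what a message symbol means, updated from how the
# sending agent acts whenever that symbol is used.
import numpy as np

MEANINGS = ["goal_left", "goal_right"]   # hypothetical meaning space
# P(observed action | true meaning): a hand-made likelihood model.
LIKELIHOOD = {
    "goal_left":  {"move_left": 0.8, "move_right": 0.2},
    "goal_right": {"move_left": 0.2, "move_right": 0.8},
}

def bayes_update(belief, observed_action):
    # belief: dict meaning -> probability; multiply by the likelihood of the
    # observed action under each meaning, then renormalise.
    posterior = {m: belief[m] * LIKELIHOOD[m][observed_action] for m in belief}
    total = sum(posterior.values())
    return {m: p / total for m, p in posterior.items()}

if __name__ == "__main__":
    # Start from a uniform belief over meanings for one message symbol.
    belief = {m: 1.0 / len(MEANINGS) for m in MEANINGS}
    for action in ["move_left", "move_left", "move_right", "move_left"]:
        belief = bayes_update(belief, action)
        print(belief)
```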

    Maximum Entropy Heterogeneous-Agent Mirror Learning

    Multi-agent reinforcement learning (MARL) has been shown to be effective for cooperative games in recent years. However, existing state-of-the-art methods face challenges related to sample inefficiency, brittleness with respect to hyperparameters, and the risk of converging to a suboptimal Nash equilibrium. To resolve these issues, we propose a novel theoretical framework, named Maximum Entropy Heterogeneous-Agent Mirror Learning (MEHAML), that leverages the maximum entropy principle to design maximum entropy MARL actor-critic algorithms. We prove that algorithms derived from the MEHAML framework enjoy the desired properties of monotonic improvement of the joint maximum entropy objective and convergence to a quantal response equilibrium (QRE). The practicality of MEHAML is demonstrated by developing a MEHAML extension of the widely used soft actor-critic (SAC) algorithm, termed HASAC, which shows significant improvements in exploration and robustness on three challenging benchmarks: Multi-Agent MuJoCo, StarCraft II, and Google Research Football. Our results show that HASAC outperforms strong baselines such as HATD3, HAPPO, QMIX, and MAPPO, thereby establishing a new state of the art. See our project page at https://sites.google.com/view/mehaml
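    To make the maximum-entropy principle referenced in this abstract concrete, here is a simplified, single-agent, discrete-action sketch of the entropy-regularised value and the softmax policy-improvement step that SAC-style methods build on. It is not the MEHAML/HASAC algorithm, which extends these ideas to the heterogeneous multi-agent setting; the Q-values and temperature alpha below are arbitrary toy inputs.

```python
# Illustrative sketch of the maximum-entropy RL objective (simplified,
# single-agent, discrete actions; not the MEHAML/HASAC implementation).
import numpy as np

def soft_value(q_values, policy, alpha):
    # V(s) = E_{a~pi}[ Q(s,a) - alpha * log pi(a|s) ]: expected return plus
    # an entropy bonus weighted by the temperature alpha.
    return float(np.sum(policy * (q_values - alpha * np.log(policy + 1e-12))))

def max_entropy_policy(q_values, alpha):
    # pi(a|s) proportional to exp(Q(s,a) / alpha): the soft (max-entropy)
    # policy-improvement step, i.e. a softmax over Q-values.
    logits = q_values / alpha
    logits = logits - logits.max()          # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

if __name__ == "__main__":
    q = np.array([1.0, 0.5, 0.0])           # toy Q-values for one state
    uniform = np.ones(3) / 3
    improved = max_entropy_policy(q, alpha=0.5)
    # The improved policy should not decrease the entropy-regularised value,
    # mirroring the monotonic-improvement property discussed in the abstract.
    print(soft_value(q, uniform, alpha=0.5), soft_value(q, improved, alpha=0.5))
```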
    • …