
    The StarCraft Multi-Agent Challenge

    In the last few years, deep multi-agent reinforcement learning (RL) has become a highly active area of research. A particularly challenging class of problems in this area is partially observable, cooperative, multi-agent learning, in which teams of agents must learn to coordinate their behaviour while conditioning only on their private observations. This is an attractive research area since such problems are relevant to a large number of real-world systems and are also more amenable to evaluation than general-sum problems. Standardised environments such as the ALE and MuJoCo have allowed single-agent RL to move beyond toy domains, such as grid worlds. However, there is no comparable benchmark for cooperative multi-agent RL. As a result, most papers in this field use one-off toy problems, making it difficult to measure real progress. In this paper, we propose the StarCraft Multi-Agent Challenge (SMAC) as a benchmark problem to fill this gap. SMAC is based on the popular real-time strategy game StarCraft II and focuses on micromanagement challenges where each unit is controlled by an independent agent that must act based on local observations. We offer a diverse set of challenge maps and recommendations for best practices in benchmarking and evaluations. We also open-source a deep multi-agent RL framework including state-of-the-art algorithms. We believe that SMAC can provide a standard benchmark environment for years to come. Videos of our best agents for several SMAC scenarios are available at: https://youtu.be/VZ7zmQ_obZ0
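    The open-sourced environment can be exercised with a short random-action rollout. The sketch below assumes the released smac Python package and its documented StarCraft2Env interface, plus a local StarCraft II installation with the SMAC maps; the map name "8m" is just one of the challenge scenarios.

        # A minimal random-action rollout against SMAC, assuming the open-source
        # "smac" package and a local StarCraft II installation with the SMAC maps.
        import numpy as np
        from smac.env import StarCraft2Env

        def run_episode(map_name="8m"):
            env = StarCraft2Env(map_name=map_name)
            n_agents = env.get_env_info()["n_agents"]

            env.reset()
            terminated = False
            episode_return = 0.0

            while not terminated:
                # Each unit is an independent agent acting on its local observation.
                obs = env.get_obs()      # list of per-agent observation vectors
                state = env.get_state()  # global state (useful for centralised training)

                actions = []
                for agent_id in range(n_agents):
                    avail = env.get_avail_agent_actions(agent_id)
                    avail_ids = np.nonzero(avail)[0]
                    actions.append(np.random.choice(avail_ids))  # random valid action

                reward, terminated, info = env.step(actions)
                episode_return += reward

            env.close()
            return episode_return

        if __name__ == "__main__":
            print("episode return:", run_episode())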

    Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots

    We present Habitat 3.0: a simulation platform for studying collaborative human-robot tasks in home environments. Habitat 3.0 offers contributions across three dimensions: (1) Accurate humanoid simulation: addressing challenges in modeling complex deformable bodies and diversity in appearance and motion, all while ensuring high simulation speed. (2) Human-in-the-loop infrastructure: enabling real human interaction with simulated robots via mouse/keyboard or a VR interface, facilitating evaluation of robot policies with human input. (3) Collaborative tasks: studying two collaborative tasks, Social Navigation and Social Rearrangement. Social Navigation investigates a robot's ability to locate and follow humanoid avatars in unseen environments, whereas Social Rearrangement addresses collaboration between a humanoid and robot while rearranging a scene. These contributions allow us to study end-to-end learned and heuristic baselines for human-robot collaboration in-depth, as well as evaluate them with humans in the loop. Our experiments demonstrate that learned robot policies lead to efficient task completion when collaborating with unseen humanoid agents and human partners that might exhibit behaviors that the robot has not seen before. Additionally, we observe emergent behaviors during collaborative task execution, such as the robot yielding space when obstructing a humanoid agent, thereby allowing the effective completion of the task by the humanoid agent. Furthermore, our experiments using the human-in-the-loop tool demonstrate that our automated evaluation with humanoids can provide an indication of the relative ordering of different policies when evaluated with real human collaborators. Habitat 3.0 unlocks interesting new features in simulators for Embodied AI, and we hope it paves the way for a new frontier of embodied human-AI interaction capabilities. Project page: http://aihabitat.org/habitat
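    To make the Social Navigation task concrete, the sketch below walks through a find-and-follow evaluation loop. DummySocialNavEnv, random_policy, and the metric names are placeholders invented for illustration; they are not the Habitat 3.0 / habitat-lab API, which is documented on the project page.

        # Conceptual sketch of evaluating a find-and-follow (Social Navigation) policy.
        # The environment and policy below are dummy stand-ins, NOT the habitat-lab API.
        import random
        from dataclasses import dataclass

        @dataclass
        class StepResult:
            observation: dict   # robot's egocentric sensors in a real simulator
            following: bool     # robot within follow distance of the humanoid avatar
            collided: bool      # robot collided with the humanoid
            done: bool

        class DummySocialNavEnv:
            """Stand-in for a simulated episode with one humanoid avatar."""
            def __init__(self, length: int = 200):
                self.length, self.t = length, 0

            def reset(self) -> dict:
                self.t = 0
                return {}

            def step(self, action: str) -> StepResult:
                self.t += 1
                return StepResult(observation={},
                                  following=random.random() < 0.5,
                                  collided=random.random() < 0.02,
                                  done=self.t >= self.length)

        def random_policy(observation: dict) -> str:
            return random.choice(["move_forward", "turn_left", "turn_right"])

        def evaluate(env, policy, max_steps: int = 500) -> dict:
            obs, follow_steps, collisions, steps = env.reset(), 0, 0, 0
            for _ in range(max_steps):
                result = env.step(policy(obs))
                obs, steps = result.observation, steps + 1
                follow_steps += int(result.following)
                collisions += int(result.collided)
                if result.done:
                    break
            # Social-navigation style metrics: fraction of steps spent following,
            # plus a raw collision count.
            return {"follow_rate": follow_steps / max(steps, 1), "collisions": collisions}

        if __name__ == "__main__":
            print(evaluate(DummySocialNavEnv(), random_policy))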

    Training Intelligent Red Team Agents Via Reinforcement Deep Learning

    NPS NRP Technical Report. Wargames are an essential tool for education, training and formulation of strategy. They are especially important in the evaluation of threats from, and strategies against, trained adversaries who present significant risk to friendly forces. We propose to develop a wargame adversary trained to defeat the current strategy of friendly forces, thereby allowing the evaluation of alternate strategies against an intelligent, simulated opponent. We will investigate the use of deep neural network (DNN) algorithms to solve a constrained stochastic reward-collecting path problem. Agents from a friendly (blue) team and an adversarial (red) team will be placed within a discrete environment. The blue team will be challenged to obtain a reward by achieving a fixed goal using a pre-determined strategy. Then, reinforcement learning will be used to train the red team to overcome the blue team's current strategy. Having thus trained a competent red team, the blue team's strategy can be altered to evaluate the efficacy of new strategies. This research will seek to evaluate the ability of different DNN algorithms to train the red team against various blue team strategies, in terms of both efficacy and efficiency, and the resiliency of the trained red team to subsequent changes in blue team strategy. We anticipate the results of this research to be summarized in a research poster and executive summary, in addition to a presentation and full technical report deliverable to the Topic Sponsor. Sponsors: Marine Corps Systems Command (MARCORSYSCOM); Chief of Naval Operations (CNO). This research is supported by funding from the Naval Postgraduate School, Naval Research Program (PE 0605853N/2098). https://nps.edu/nrp Approved for public release. Distribution is unlimited.
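    The proposed setup, a red team trained by reinforcement learning against a fixed blue strategy in a discrete environment, can be illustrated with a toy tabular sketch. The grid, reward scheme, and both policies below are invented for illustration and stand in for the report's unspecified environment and DNN algorithms.

        # Toy illustration: a tabular Q-learning "red" agent trained against a fixed
        # "blue" policy on a small grid. All details here are invented for illustration.
        import random
        from collections import defaultdict

        SIZE = 5
        GOAL = (4, 4)
        ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # four grid moves

        def blue_policy(pos):
            """Fixed blue strategy: move greedily toward the goal, x first, then y."""
            x, y = pos
            if x < GOAL[0]:
                return (1, 0)
            if y < GOAL[1]:
                return (0, 1)
            return (0, 0)

        def clamp(pos):
            return (min(max(pos[0], 0), SIZE - 1), min(max(pos[1], 0), SIZE - 1))

        def train_red(episodes=5000, alpha=0.1, gamma=0.95, eps=0.1):
            """Tabular Q-learning for the red agent against the fixed blue policy."""
            Q = defaultdict(float)  # Q[(state, action)], state = (red_pos, blue_pos)
            for _ in range(episodes):
                red, blue = (0, SIZE - 1), (0, 0)
                for _ in range(2 * SIZE):
                    state = (red, blue)
                    if random.random() < eps:                      # epsilon-greedy exploration
                        action = random.choice(ACTIONS)
                    else:
                        action = max(ACTIONS, key=lambda a: Q[(state, a)])
                    red = clamp((red[0] + action[0], red[1] + action[1]))
                    dx, dy = blue_policy(blue)
                    blue = clamp((blue[0] + dx, blue[1] + dy))
                    if red == blue:          # red intercepts blue
                        reward, done = 1.0, True
                    elif blue == GOAL:       # blue reaches its objective unopposed
                        reward, done = -1.0, True
                    else:
                        reward, done = 0.0, False
                    next_state = (red, blue)
                    best_next = 0.0 if done else max(Q[(next_state, a)] for a in ACTIONS)
                    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
                    if done:
                        break
            return Q

        if __name__ == "__main__":
            print("learned state-action values:", len(train_red()))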

    Accelerating learning in multiagent domains through experience sharing

    Master's dissertation, Universidade de Brasília, Instituto de Ciências Exatas, Departamento de Ciência da Computação, 2019. This dissertation contributes to the burgeoning field of artificial intelligence and machine learning. Learning is a core component of human behaviour, the faculty behind our ability to adapt. It is the single characteristic that differentiates humans from other species, and it has allowed us to persevere and dominate the world as we know it. Through learning algorithms, we seek to imbue artificial agents with the same capacity, so that they too can learn and adapt by interacting with the environment, enhancing their potential to achieve their goals. In this work, we address the hard problem of how multiple cooperative agents learning concurrently to achieve a goal can benefit from sharing knowledge with each other. Key to our evolution is our ability to share learned knowledge, both instantaneously and across generations. It follows that knowledge sharing between autonomous and independent agents could likewise become the key to accelerating learning in cooperative multiagent settings. Pursuing this line of inquiry, we investigate methods of knowledge sharing that can effectively lead to faster learning, focusing on the approach of transferring knowledge by experience sharing. The proposed MultiAgent Cooperative Experience Sharing (MACES) model defines an architecture that allows experience sharing between concurrently learning cooperative agents. Within MACES, we investigate different methods of experience sharing that can lead to accelerated learning. The model is validated in two different reinforcement learning settings: a classical control problem and a navigation problem. The results show that MACES is able to reduce by more than half the number of episodes required to complete a task through the cooperation of only two agents, compared to a single-agent baseline. The model is applicable to agents that implement deep reinforcement learning methods.
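    The core idea of experience sharing between concurrently learning agents can be sketched as a shared replay buffer that every agent both writes to and samples from. The interfaces below are hypothetical placeholders, not the MACES implementation described in the dissertation.

        # Minimal sketch of experience sharing via a shared replay buffer, in the
        # spirit of MACES. All interfaces are hypothetical placeholders.
        import random
        from collections import deque
        from typing import List, NamedTuple

        class Transition(NamedTuple):
            state: list
            action: int
            reward: float
            next_state: list
            done: bool

        class SharedReplayBuffer:
            """Experiences contributed by any agent become available to every agent."""
            def __init__(self, capacity: int = 100_000):
                self.buffer = deque(maxlen=capacity)

            def add(self, transition: Transition) -> None:
                self.buffer.append(transition)

            def sample(self, batch_size: int) -> List[Transition]:
                return random.sample(self.buffer, min(batch_size, len(self.buffer)))

        class Agent:
            """Placeholder learner; learn_from() would perform one update step."""
            def act(self, state) -> int: ...
            def learn_from(self, batch: List[Transition]) -> None: ...

        def train(env_factory, agents: List[Agent], episodes: int, batch_size: int = 64):
            shared = SharedReplayBuffer()
            for _ in range(episodes):
                envs = [env_factory() for _ in agents]    # each agent runs its own copy of the task
                states = [env.reset() for env in envs]
                dones = [False] * len(agents)
                while not all(dones):
                    for i, (agent, env) in enumerate(zip(agents, envs)):
                        if dones[i]:
                            continue
                        action = agent.act(states[i])
                        next_state, reward, dones[i], _ = env.step(action)   # Gym-style step
                        shared.add(Transition(states[i], action, reward, next_state, dones[i]))
                        states[i] = next_state
                        agent.learn_from(shared.sample(batch_size))          # learn from pooled experience

    In a concrete instantiation, env_factory would build a Gym-style task and each Agent would wrap a deep RL learner such as a DQN.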