49 research outputs found

    A Survey and Critique of Multiagent Deep Reinforcement Learning

    Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has led to a dramatic increase in the number of applications and methods. Recent works have explored learning beyond single-agent scenarios and have considered multiagent learning (MAL) scenarios. Initial results report successes in complex multiagent domains, although there are several challenges to be addressed. The primary goal of this article is to provide a clear overview of current multiagent deep reinforcement learning (MDRL) literature. Additionally, we complement the overview with a broader analysis: (i) we revisit previous key components, originally presented in MAL and RL, and highlight how they have been adapted to multiagent deep reinforcement learning settings; (ii) we provide general guidelines to new practitioners in the area, describing lessons learned from MDRL works, pointing to recent benchmarks, and outlining open avenues of research; (iii) we take a more critical tone, raising practical challenges of MDRL (e.g., implementation and computational demands). We expect this article will help unify and motivate future research to take advantage of the abundant literature that exists (e.g., RL and MAL) in a joint effort to promote fruitful research in the multiagent community.
    Comment: Under review since Oct 2018. Earlier versions of this work had the title: "Is multiagent deep reinforcement learning the answer or the question? A brief survey"

    Suicide attacks in the context of radicalization

    Suicide attacks can harm innocent people and disrupt social order. Radical groups often target vulnerable individuals, subjecting them to ideological manipulation and directing them towards suicide attacks. These actions can be influenced by internal factors such as mental health issues, identity crises, and emotional conflicts. Suicide attacks impose social, economic, and human rights costs. Therefore, suicide attacks carried out within the context of radicalization are considered a serious problem. The study investigates the relationship between suicide attacks and psychological, social, and cultural factors, and provides information that can assist policymakers and practitioners in developing effective prevention and intervention strategies. The study used qualitative research methods, specifically action research, together with document analysis. The aim of the study is to understand how radicalization processes are associated with suicide attacks, to uncover the mechanisms underlying this relationship, and to develop effective strategies for preventing such acts. The study examines how individuals with radical ideologies come to perceive suicide attacks as legitimate and the impact of such actions on societies.

    Terminal Prediction as an Auxiliary Task for Deep Reinforcement Learning

    Deep reinforcement learning has achieved great successes in recent years, but there are still open challenges, such as convergence to locally optimal policies and sample inefficiency. In this paper, we contribute a novel self-supervised auxiliary task, i.e., Terminal Prediction (TP), estimating temporal closeness to terminal states for episodic tasks. The intuition is to help representation learning by letting the agent predict how close it is to a terminal state, while learning its control policy. Although TP could be integrated with multiple algorithms, this paper focuses on Asynchronous Advantage Actor-Critic (A3C) and demonstrates the advantages of A3C-TP. Our extensive evaluation includes a set of Atari games, the BipedalWalker domain, and a mini version of the recently proposed multi-agent Pommerman game. Our results on Atari games and the BipedalWalker domain suggest that A3C-TP outperforms standard A3C in most of the tested domains and performs similarly in the others. In Pommerman, our proposed method provides significant improvement both in learning efficiency and in converging to better policies against different opponents.
    Comment: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE'19). arXiv admin note: text overlap with arXiv:1812.0004

    Action Guidance with MCTS for Deep Reinforcement Learning

    Deep reinforcement learning has achieved great successes in recent years; however, one main challenge is sample inefficiency. In this paper, we focus on how to use action guidance by means of a non-expert demonstrator to improve sample efficiency in a domain with sparse, delayed, and possibly deceptive rewards: the recently proposed multi-agent benchmark of Pommerman. We propose a new framework where even a non-expert simulated demonstrator, e.g., a planning algorithm such as Monte Carlo tree search with a small number of rollouts, can be integrated within asynchronous distributed deep reinforcement learning methods. Compared to a vanilla deep RL algorithm, our proposed methods both learn faster and converge to better policies on a two-player mini version of the Pommerman game.
    Comment: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE'19). arXiv admin note: substantial text overlap with arXiv:1904.05759, arXiv:1812.0004

    Agent Modeling as Auxiliary Task for Deep Reinforcement Learning

    In this paper we explore how actor-critic methods in deep reinforcement learning, in particular Asynchronous Advantage Actor-Critic (A3C), can be extended with agent modeling. Inspired by recent works on representation learning and multiagent deep reinforcement learning, we propose two architectures to perform agent modeling: the first one based on parameter sharing, and the second one based on agent policy features. Both architectures aim to learn other agents' policies as auxiliary tasks, besides the standard actor (policy) and critic (values). We performed experiments in both cooperative and competitive domains. The former is a problem of coordinated multiagent object transportation and the latter is a two-player mini version of the Pommerman game. Our results show that the proposed architectures stabilize learning and outperform the standard A3C architecture when learning a best response in terms of expected rewards.
    Comment: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE'19)

    On Hard Exploration for Reinforcement Learning: a Case Study in Pommerman

    How to best explore in domains with sparse, delayed, and deceptive rewards is an important open problem for reinforcement learning (RL). This paper considers one such domain, the recently proposed multi-agent benchmark of Pommerman. This domain is very challenging for RL: past work has shown that model-free RL algorithms fail to achieve significant learning without artificially reducing the environment's complexity. In this paper, we illuminate reasons behind this failure by providing a thorough analysis of the hardness of random exploration in Pommerman. While model-free random exploration is typically futile, we develop a model-based automatic reasoning module that can be used for safer exploration by pruning actions that will surely lead the agent to death. We empirically demonstrate that this module can significantly improve learning.
    Comment: AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE) 201