Learning with Opponent-Learning Awareness
Multi-agent settings are quickly gathering importance in machine learning.
This includes a plethora of recent work on deep multi-agent reinforcement
learning, but also extends to hierarchical RL, generative adversarial
networks, and decentralised optimisation. In all these settings the presence of
multiple learning agents renders the training problem non-stationary and often
leads to unstable training or undesired final results. We present Learning with
Opponent-Learning Awareness (LOLA), a method in which each agent shapes the
anticipated learning of the other agents in the environment. The LOLA learning
rule includes a term that accounts for the impact of one agent's policy on the
anticipated parameter update of the other agents. Results show that the
encounter of two LOLA agents leads to the emergence of tit-for-tat and
therefore cooperation in the iterated prisoner's dilemma (IPD), while independent
learning does not. In this domain, LOLA also receives higher payouts compared
to a naive learner, and is robust against exploitation by higher order
gradient-based methods. Applied to repeated matching pennies, LOLA agents
converge to the Nash equilibrium. In a round-robin tournament we show that LOLA
agents successfully shape the learning of a range of multi-agent learning
algorithms from the literature, resulting in the highest average returns on the
IPD. We also show that the LOLA update rule can be efficiently calculated using
an extension of the policy gradient estimator, making the method suitable for
model-free RL. The method thus scales to large parameter and input spaces and
nonlinear function approximators. We apply LOLA to a grid world task with an
embedded social dilemma using recurrent policies and opponent modelling. By
explicitly considering the learning of the other agent, LOLA agents learn to
cooperate out of self-interest. The code is at github.com/alshedivat/lola
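
As an illustration of the update this abstract describes, below is a minimal
sketch of the exact-gradient form of LOLA on differentiable matching pennies:
each agent ascends the value it would receive after the opponent's anticipated
naive gradient step, i.e. theta1 <- theta1 + alpha * d/dtheta1 V1(theta1,
theta2 + eta * dV2/dtheta2). The one-parameter sigmoid policies, exact value
functions, and step sizes are illustrative assumptions; the paper's model-free
variant replaces the exact gradients with an extended policy gradient estimator.

```python
# Minimal sketch of exact-gradient LOLA on differentiable matching pennies.
# Assumptions (not from the abstract): one-parameter sigmoid policies,
# exact value functions, step sizes alpha (own) and eta (opponent's
# anticipated naive step).
import jax
import jax.numpy as jnp

def prob(theta):
    return jax.nn.sigmoid(theta)  # probability of playing heads

def V1(t1, t2):
    # Agent 1's expected payoff: +1 when the pennies match, -1 otherwise.
    return (2 * prob(t1) - 1) * (2 * prob(t2) - 1)

def V2(t1, t2):
    return -V1(t1, t2)  # zero-sum game

alpha, eta = 0.1, 0.5

@jax.jit
def lola_step(t1, t2):
    # Each agent maximises its value AFTER the opponent's anticipated
    # naive update, differentiating through that update (the LOLA term).
    def shaped_V1(x):
        dt2 = eta * jax.grad(V2, argnums=1)(x, t2)
        return V1(x, t2 + dt2)
    def shaped_V2(y):
        dt1 = eta * jax.grad(V1, argnums=0)(t1, y)
        return V2(t1 + dt1, y)
    return t1 + alpha * jax.grad(shaped_V1)(t1), t2 + alpha * jax.grad(shaped_V2)(t2)

t1, t2 = jnp.array(1.0), jnp.array(-0.5)
for _ in range(500):
    t1, t2 = lola_step(t1, t2)
print(prob(t1), prob(t2))  # both should approach 0.5, the Nash equilibrium
```

Naive simultaneous gradient ascent cycles around the equilibrium of this game;
differentiating through the opponent's update is what damps the oscillation.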
An Evolutionary Learning Approach for Adaptive Negotiation Agents
Developing effective and efficient negotiation mechanisms for real-world applications such as e-Business is challenging since negotiations in such a context are characterised by combinatorially complex negotiation spaces, tough deadlines, very limited information about the opponents, and volatile negotiator preferences. Accordingly, practical negotiation systems should be empowered by effective learning mechanisms to acquire dynamic domain knowledge from the possibly changing negotiation contexts. This paper illustrates our adaptive negotiation agents which are underpinned by robust evolutionary learning mechanisms to deal with complex and dynamic negotiation contexts. Our experimental results show that GA-based adaptive negotiation agents outperform a theoretically optimal negotiation mechanism which guarantees Pareto optimality. Our research work opens the door to the development of practical negotiation systems for real-world applications.
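
As a rough illustration of the evolutionary approach, here is a toy GA that
evolves a time-dependent concession strategy for a single-issue negotiation
under a deadline. The encoding (a single concession exponent), the simulated
opponent, the fitness function, and the operators are all illustrative
assumptions, not the paper's actual design.

```python
# Toy GA evolving a seller's concession exponent (assumed encoding).
import random

DEADLINE = 20   # maximum number of negotiation rounds (assumed)
RESERVE = 0.4   # seller's reservation price, normalised to [0, 1] (assumed)

def offer(beta, t):
    # Time-dependent concession: large beta holds firm until near the
    # deadline (Boulware), small beta concedes early (Conceder).
    return 1.0 - (1.0 - RESERVE) * (t / DEADLINE) ** beta

def fitness(beta, opponent_accepts):
    # Seller's utility is the first price the opponent accepts; 0 if no deal.
    for t in range(DEADLINE + 1):
        price = offer(beta, t)
        if opponent_accepts(price, t):
            return price
    return 0.0

def evolve(opponent_accepts, pop_size=30, generations=50):
    pop = [random.uniform(0.1, 5.0) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda b: fitness(b, opponent_accepts), reverse=True)
        parents = pop[: pop_size // 2]                  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            child = (a + b) / 2 + random.gauss(0, 0.1)  # crossover + mutation
            children.append(max(0.1, child))
        pop = parents + children
    return max(pop, key=lambda b: fitness(b, opponent_accepts))

# Example run against a buyer who accepts any price under a rising threshold.
best = evolve(lambda price, t: price <= 0.5 + 0.02 * t)
print("evolved concession exponent:", round(best, 3))
```

Against a fixed opponent the GA simply tunes the concession curve to track
just under the acceptance threshold; the adaptivity claimed in the abstract
comes from re-running this learning loop as the negotiation context changes.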
Human-Agent Decision-making: Combining Theory and Practice
Extensive work has been conducted in both game theory and logic to model
strategic interaction. An important question is whether we can use these
theories to design agents that interact with people. On the one hand, they
provide a formal design specification for agent strategies. On the other hand,
people do not necessarily adhere to playing in accordance with these
strategies, and their behavior is affected by a multitude of social and
psychological factors. In this paper we will consider the question of whether
strategies implied by theories of strategic behavior can be used by automated
agents that interact proficiently with people. We will focus on automated
agents that we built that need to interact with people in two negotiation
settings: bargaining and deliberation. For bargaining we will study
game-theory-based equilibrium agents, and for deliberation we will discuss
logic-based argumentation theory. We will also consider security games and
persuasion games and will discuss the benefits of using equilibrium-based
agents.
Comment: In Proceedings TARK 2015, arXiv:1606.0729
Partner Selection for the Emergence of Cooperation in Multi-Agent Systems Using Reinforcement Learning
Social dilemmas have been widely studied to explain how humans are able to
cooperate in society. Considerable effort has been invested in designing
artificial agents for social dilemmas that incorporate explicit agent
motivations that are chosen to favor coordinated or cooperative responses. The
prevalence of this general approach points to the importance of understanding
both an agent's internal design and the external environment dynamics that
facilitate cooperative behavior. In this paper, we investigate
how partner selection can promote cooperative behavior between agents who are
trained to maximize a purely selfish objective function. Our experiments reveal
that agents trained with this dynamic learn a strategy that retaliates against
defectors while promoting cooperation with other agents, resulting in a
prosocial society.
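
To make the mechanism concrete, below is a tabular toy version of the idea:
purely selfish Q-learners repeatedly play a one-shot prisoner's dilemma, and a
selection rule preferentially pairs agents whose last action (their public
"reputation") was cooperation. The selection rule, state design, and constants
are illustrative assumptions; the paper's agents are deep RL policies.

```python
# Toy partner selection with selfish tabular Q-learners on the prisoner's
# dilemma. State = the current partner's reputation (its last action).
import random

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
N, STEPS, ALPHA, GAMMA, EPS = 10, 50000, 0.05, 0.95, 0.1

Q = [{(s, a): 0.0 for s in "CD" for a in "CD"} for _ in range(N)]
rep = [random.choice("CD") for _ in range(N)]  # each agent's last action
pending = [None] * N  # last (state, action, reward), awaiting its next state

def act(i, s):
    if random.random() < EPS:
        return random.choice("CD")
    return max("CD", key=lambda a: Q[i][(s, a)])

for _ in range(STEPS):
    # Partner selection: shuffle, then stably sort by reputation so that
    # agents with a cooperative record are paired with each other first.
    pool = list(range(N))
    random.shuffle(pool)
    pool.sort(key=lambda k: rep[k])  # "C" sorts before "D"
    pairs = list(zip(pool[0::2], pool[1::2]))
    states = {}
    for i, j in pairs:
        states[i], states[j] = rep[j], rep[i]  # observe the partner's reputation
    for i, s in states.items():
        if pending[i] is not None:   # one-step Q update, delayed until the
            ps, pa, pr = pending[i]  # next state (next partner) is known
            nxt = max(Q[i][(s, "C")], Q[i][(s, "D")])
            Q[i][(ps, pa)] += ALPHA * (pr + GAMMA * nxt - Q[i][(ps, pa)])
    actions = {i: act(i, s) for i, s in states.items()}
    for i, j in pairs:
        ri, rj = PAYOFF[(actions[i], actions[j])]
        pending[i] = (states[i], actions[i], ri)  # rewards are purely selfish
        pending[j] = (states[j], actions[j], rj)
        rep[i], rep[j] = actions[i], actions[j]

# Greedy action per agent vs a cooperator ("C" state) and a defector ("D" state).
print(["".join(max("CD", key=lambda a: Q[i][(s, a)]) for s in "CD") for i in range(N)])
```

Defection here buys one temptation payoff but moves the agent into the
defector pool on the next round, so a discounted selfish learner can find
cooperation profitable; in this toy, exclusion by the selection rule plays the
retaliatory role that learned policies play in the paper's setting.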