Learning with Opponent-Learning Awareness
Multi-agent settings are quickly gathering importance in machine learning.
This includes a plethora of recent work on deep multi-agent reinforcement
learning, but also extends to hierarchical RL, generative adversarial
networks, and decentralised optimisation. In all these settings the presence of
multiple learning agents renders the training problem non-stationary and often
leads to unstable training or undesired final results. We present Learning with
Opponent-Learning Awareness (LOLA), a method in which each agent shapes the
anticipated learning of the other agents in the environment. The LOLA learning
rule includes a term that accounts for the impact of one agent's policy on the
anticipated parameter update of the other agents. Results show that the
encounter of two LOLA agents leads to the emergence of tit-for-tat and
therefore cooperation in the iterated prisoner's dilemma (IPD), while independent
learning does not. In this domain, LOLA also receives higher payouts compared
to a naive learner, and is robust against exploitation by higher order
gradient-based methods. Applied to repeated matching pennies, LOLA agents
converge to the Nash equilibrium. In a round-robin tournament we show that LOLA
agents successfully shape the learning of a range of multi-agent learning
algorithms from the literature, resulting in the highest average returns on the
IPD. We also show that the LOLA update rule can be efficiently calculated using
an extension of the policy gradient estimator, making the method suitable for
model-free RL. The method thus scales to large parameter and input spaces and
nonlinear function approximators. We apply LOLA to a grid world task with an
embedded social dilemma using recurrent policies and opponent modelling. By
explicitly considering the learning of the other agent, LOLA agents learn to
cooperate out of self-interest. The code is at github.com/alshedivat/lola
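
As an illustration of the update this abstract describes, below is a minimal
sketch of the exact-gradient form of LOLA on differentiable matching pennies:
each agent ascends the value it would receive after the opponent's anticipated
naive gradient step, i.e. theta1 <- theta1 + alpha * d/dtheta1 V1(theta1,
theta2 + eta * dV2/dtheta2). The one-parameter sigmoid policies, exact value
functions, and step sizes are illustrative assumptions; the paper's model-free
variant replaces the exact gradients with an extended policy gradient estimator.

```python
# Minimal sketch of exact-gradient LOLA on differentiable matching pennies.
# Assumptions (not from the abstract): one-parameter sigmoid policies,
# exact value functions, step sizes alpha (own) and eta (opponent's
# anticipated naive step).
import jax
import jax.numpy as jnp

def prob(theta):
    return jax.nn.sigmoid(theta)  # probability of playing heads

def V1(t1, t2):
    # Agent 1's expected payoff: +1 when the pennies match, -1 otherwise.
    return (2 * prob(t1) - 1) * (2 * prob(t2) - 1)

def V2(t1, t2):
    return -V1(t1, t2)  # zero-sum game

alpha, eta = 0.1, 0.5

@jax.jit
def lola_step(t1, t2):
    # Each agent maximises its value AFTER the opponent's anticipated
    # naive update, differentiating through that update (the LOLA term).
    def shaped_V1(x):
        dt2 = eta * jax.grad(V2, argnums=1)(x, t2)
        return V1(x, t2 + dt2)
    def shaped_V2(y):
        dt1 = eta * jax.grad(V1, argnums=0)(t1, y)
        return V2(t1 + dt1, y)
    return t1 + alpha * jax.grad(shaped_V1)(t1), t2 + alpha * jax.grad(shaped_V2)(t2)

t1, t2 = jnp.array(1.0), jnp.array(-0.5)
for _ in range(500):
    t1, t2 = lola_step(t1, t2)
print(prob(t1), prob(t2))  # both should approach 0.5, the Nash equilibrium
```

Naive simultaneous gradient ascent cycles around the equilibrium of this game;
differentiating through the opponent's update is what damps the oscillation.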
An Evolutionary Learning Approach for Adaptive Negotiation Agents
Developing effective and efficient negotiation mechanisms for real-world applications such as e-Business is challenging since negotiations in such a context are characterised by combinatorially complex negotiation spaces, tough deadlines, very limited information about the opponents, and volatile negotiator preferences. Accordingly, practical negotiation systems should be empowered by effective learning mechanisms to acquire dynamic domain knowledge from the possibly changing negotiation contexts. This paper illustrates our adaptive negotiation agents which are underpinned by robust evolutionary learning mechanisms to deal with complex and dynamic negotiation contexts. Our experimental results show that GA-based adaptive negotiation agents outperform a theoretically optimal negotiation mechanism which guarantees Pareto optimality. Our research work opens the door to the development of practical negotiation systems for real-world applications.
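
As a rough illustration of the evolutionary approach, here is a toy GA that
evolves a time-dependent concession strategy for a single-issue negotiation
under a deadline. The encoding (a single concession exponent), the simulated
opponent, the fitness function, and the operators are all illustrative
assumptions, not the paper's actual design.

```python
# Toy GA evolving a seller's concession exponent (assumed encoding).
import random

DEADLINE = 20   # maximum number of negotiation rounds (assumed)
RESERVE = 0.4   # seller's reservation price, normalised to [0, 1] (assumed)

def offer(beta, t):
    # Time-dependent concession: large beta holds firm until near the
    # deadline (Boulware), small beta concedes early (Conceder).
    return 1.0 - (1.0 - RESERVE) * (t / DEADLINE) ** beta

def fitness(beta, opponent_accepts):
    # Seller's utility is the first price the opponent accepts; 0 if no deal.
    for t in range(DEADLINE + 1):
        price = offer(beta, t)
        if opponent_accepts(price, t):
            return price
    return 0.0

def evolve(opponent_accepts, pop_size=30, generations=50):
    pop = [random.uniform(0.1, 5.0) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda b: fitness(b, opponent_accepts), reverse=True)
        parents = pop[: pop_size // 2]                  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            child = (a + b) / 2 + random.gauss(0, 0.1)  # crossover + mutation
            children.append(max(0.1, child))
        pop = parents + children
    return max(pop, key=lambda b: fitness(b, opponent_accepts))

# Example run against a buyer who accepts any price under a rising threshold.
best = evolve(lambda price, t: price <= 0.5 + 0.02 * t)
print("evolved concession exponent:", round(best, 3))
```

Against a fixed opponent the GA simply tunes the concession curve to track
just under the acceptance threshold; the adaptivity claimed in the abstract
comes from re-running this learning loop as the negotiation context changes.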
Human-Agent Decision-making: Combining Theory and Practice
Extensive work has been conducted in both game theory and logic to model
strategic interaction. An important question is whether we can use these
theories to design agents that interact with people. On the one hand, they
provide a formal design specification for agent strategies. On the other hand,
people do not necessarily adhere to playing in accordance with these
strategies, and their behavior is affected by a multitude of social and
psychological factors. In this paper we will consider the question of whether
strategies implied by theories of strategic behavior can be used by automated
agents that interact proficiently with people. We will focus on automated
agents that we built that need to interact with people in two negotiation
settings: bargaining and deliberation. For bargaining we will study
game-theory-based equilibrium agents, and for deliberation we will discuss
logic-based argumentation theory. We will also consider security games and
persuasion games and will discuss the benefits of using equilibrium-based
agents.
Comment: In Proceedings TARK 2015, arXiv:1606.0729
Partner Selection for the Emergence of Cooperation in Multi-Agent Systems Using Reinforcement Learning
Social dilemmas have been widely studied to explain how humans are able to
cooperate in society. Considerable effort has been invested in designing
artificial agents for social dilemmas that incorporate explicit agent
motivations that are chosen to favor coordinated or cooperative responses. The
prevalence of this general approach points to the importance of understanding
both an agent's internal design and the external environment dynamics that
facilitate cooperative behavior. In this paper, we investigate
how partner selection can promote cooperative behavior between agents who are
trained to maximize a purely selfish objective function. Our experiments reveal
that agents trained with this dynamic learn a strategy that retaliates against
defectors while promoting cooperation with other agents, resulting in a
prosocial society.
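
To make the mechanism concrete, below is a tabular toy version of the idea:
purely selfish Q-learners repeatedly play a one-shot prisoner's dilemma, and a
selection rule preferentially pairs agents whose last action (their public
"reputation") was cooperation. The selection rule, state design, and constants
are illustrative assumptions; the paper's agents are deep RL policies.

```python
# Toy partner selection with selfish tabular Q-learners on the prisoner's
# dilemma. State = the current partner's reputation (its last action).
import random

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
N, STEPS, ALPHA, GAMMA, EPS = 10, 50000, 0.05, 0.95, 0.1

Q = [{(s, a): 0.0 for s in "CD" for a in "CD"} for _ in range(N)]
rep = [random.choice("CD") for _ in range(N)]  # each agent's last action
pending = [None] * N  # last (state, action, reward), awaiting its next state

def act(i, s):
    if random.random() < EPS:
        return random.choice("CD")
    return max("CD", key=lambda a: Q[i][(s, a)])

for _ in range(STEPS):
    # Partner selection: shuffle, then stably sort by reputation so that
    # agents with a cooperative record are paired with each other first.
    pool = list(range(N))
    random.shuffle(pool)
    pool.sort(key=lambda k: rep[k])  # "C" sorts before "D"
    pairs = list(zip(pool[0::2], pool[1::2]))
    states = {}
    for i, j in pairs:
        states[i], states[j] = rep[j], rep[i]  # observe the partner's reputation
    for i, s in states.items():
        if pending[i] is not None:   # one-step Q update, delayed until the
            ps, pa, pr = pending[i]  # next state (next partner) is known
            nxt = max(Q[i][(s, "C")], Q[i][(s, "D")])
            Q[i][(ps, pa)] += ALPHA * (pr + GAMMA * nxt - Q[i][(ps, pa)])
    actions = {i: act(i, s) for i, s in states.items()}
    for i, j in pairs:
        ri, rj = PAYOFF[(actions[i], actions[j])]
        pending[i] = (states[i], actions[i], ri)  # rewards are purely selfish
        pending[j] = (states[j], actions[j], rj)
        rep[i], rep[j] = actions[i], actions[j]

# Greedy action per agent vs a cooperator ("C" state) and a defector ("D" state).
print(["".join(max("CD", key=lambda a: Q[i][(s, a)]) for s in "CD") for i in range(N)])
```

Defection here buys one temptation payoff but moves the agent into the
defector pool on the next round, so a discounted selfish learner can find
cooperation profitable; in this toy, exclusion by the selection rule plays the
retaliatory role that learned policies play in the paper's setting.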