3,093 research outputs found
Deep Reinforcement Learning from Self-Play in Imperfect-Information Games
Many real-world applications can be described as large-scale games of
imperfect information. To deal with these challenging domains, prior work has
focused on computing Nash equilibria in a handcrafted abstraction of the
domain. In this paper we introduce the first scalable end-to-end approach to
learning approximate Nash equilibria without prior domain knowledge. Our method
combines fictitious self-play with deep reinforcement learning. When applied to
Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium,
whereas common reinforcement learning methods diverged. In Limit Texas Holdem,
a poker game of real-world scale, NFSP learnt a strategy that approached the
performance of state-of-the-art, superhuman algorithms based on significant
domain expertise.Comment: updated version, incorporating conference feedbac
Human-Agent Decision-making: Combining Theory and Practice
Extensive work has been conducted both in game theory and logic to model
strategic interaction. An important question is whether we can use these
theories to design agents for interacting with people? On the one hand, they
provide a formal design specification for agent strategies. On the other hand,
people do not necessarily adhere to playing in accordance with these
strategies, and their behavior is affected by a multitude of social and
psychological factors. In this paper we will consider the question of whether
strategies implied by theories of strategic behavior can be used by automated
agents that interact proficiently with people. We will focus on automated
agents that we built that need to interact with people in two negotiation
settings: bargaining and deliberation. For bargaining we will study game-theory
based equilibrium agents and for argumentation we will discuss logic-based
argumentation theory. We will also consider security games and persuasion games
and will discuss the benefits of using equilibrium based agents.Comment: In Proceedings TARK 2015, arXiv:1606.0729
- …