Probabilistic inverse reinforcement learning in unknown environments
We consider the problem of learning by demonstration from agents acting in
unknown stochastic Markov environments or games. Our aim is to estimate agent
preferences in order to construct improved policies for the same task that the
agents are trying to solve. To do so, we extend previous probabilistic
approaches for inverse reinforcement learning in known MDPs to the case of
unknown dynamics or opponents. We do this by deriving two simplified
probabilistic models of the demonstrator's policy and utility. For
tractability, we use maximum a posteriori estimation rather than full Bayesian
inference. Under a flat prior, this results in a convex optimisation problem.
We find that the resulting algorithms are highly competitive against a variety
of other methods for inverse reinforcement learning that do have knowledge of
the dynamics.
Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty
in Artificial Intelligence (UAI2013)
Playing Atari with Deep Reinforcement Learning
We present the first deep learning model to successfully learn control
policies directly from high-dimensional sensory input using reinforcement
learning. The model is a convolutional neural network, trained with a variant
of Q-learning, whose input is raw pixels and whose output is a value function
estimating future rewards. We apply our method to seven Atari 2600 games from
the Arcade Learning Environment, with no adjustment of the architecture or
learning algorithm. We find that it outperforms all previous approaches on six
of the games and surpasses a human expert on three of them.
Comment: NIPS Deep Learning Workshop 201
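The abstract above describes a network trained with a variant of Q-learning whose output estimates future rewards. As a rough illustration of the underlying update only, here is a minimal tabular sketch. It is not the paper's method: the convolutional network and Atari environment are replaced by a lookup table and a hypothetical five-state chain, and the names `q_learning_update`, `train_chain`, and `greedy_policy`, as well as the values of `alpha` and `gamma`, are assumptions made for this example.

```python
import random


def q_learning_update(q, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the TD error."""
    # Bootstrap from the best next action unless the episode has ended.
    best_next = 0.0 if done else max(q[next_state])
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


def train_chain(n_states=5, episodes=500, max_steps=100, seed=0):
    """Learn Q-values on a hypothetical chain: start at state 0, reward 1 at the last state."""
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Q-learning is off-policy, so a uniformly random behaviour policy suffices here.
            action = rng.randrange(2)
            move = 1 if action == 1 else -1
            next_state = min(max(state + move, 0), goal)
            done = next_state == goal
            reward = 1.0 if done else 0.0
            q_learning_update(q, state, action, reward, next_state, done)
            if done:
                break
            state = next_state
    return q


def greedy_policy(q):
    """Return the greedy action for every non-goal state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train_chain()))
```

In the setting the abstract describes, the table `q` would be replaced by a function approximator that maps raw pixels to one value per action; the temporal-difference target is the part this sketch shares with that setting.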
Evolutionary games on graphs
Game theory is one of the key paradigms behind many scientific disciplines
from biology to behavioral sciences to economics. In its evolutionary form and
especially when the interacting agents are linked in a specific social network
the underlying solution concepts and methods are very similar to those applied
in non-equilibrium statistical physics. This review gives a tutorial-type
overview of the field for physicists. The first three sections introduce the
necessary background in classical and evolutionary game theory from the basic
definitions to the most important results. The fourth section surveys the
topological complications implied by non-mean-field-type social network
structures in general. The last three sections discuss in detail the dynamic
behavior of three prominent classes of models: the Prisoner's Dilemma, the
Rock-Scissors-Paper game, and Competing Associations. The major theme of the
review is in what sense and how the graph structure of interactions can modify
and enrich the picture of long term behavioral patterns emerging in
evolutionary games.
Comment: Review, final version, 133 pages, 65 figures
Hide and Seek in Arizona
Laboratory subjects repeatedly played one of two variations of a simple two-person zero-sum game of "hide and seek." Three puzzling departures from the prescriptions of equilibrium theory are found in the data: an asymmetry related to the player's role in the game; an asymmetry across the game variations; and positive serial correlation in subjects' play. Possible explanations for these departures are considered.
Keywords: Minimax, mixed strategy, experiment
Dynamics in atomic signaling games
We study an atomic signaling game under stochastic evolutionary dynamics.
There is a finite number of players who repeatedly update from a finite number
of available languages/signaling strategies. Players imitate the most fit
agents with high probability or mutate with low probability. We analyze the
long-run distribution of states and show that, for sufficiently small mutation
probability, its support is limited to efficient communication systems. We find
that this behavior is insensitive to the particular choice of evolutionary
dynamic, a property that is due to the game having a potential structure with a
potential function corresponding to average fitness. Consequently, the model
supports conclusions similar to those found in the literature on language
competition. That is, we show that efficient languages eventually predominate
in the society, while the model reproduces the empirical phenomenon of linguistic drift. The
emergence of efficiency in the atomic case can be contrasted with results for
non-atomic signaling games that establish the non-negligible possibility of
convergence, under replicator dynamics, to states of unbounded efficiency loss
- …
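The imitate-or-mutate update described in the abstract above can be sketched as follows. This is a minimal illustration under assumptions not stated in the abstract: a language is modelled as a pair of maps (state to signal, signal to act), fitness is the average symmetrised communication success against all other players, and one randomly chosen player updates per step. The names `payoff`, `fitness`, `random_language`, and `step` are hypothetical.

```python
import random


def payoff(lang_a, lang_b, n):
    """Symmetrised communication success between two languages, in [0, 1].

    A language is a pair (send, recv): send[state] gives a signal and
    recv[signal] gives the act chosen on hearing it.
    """
    send_a, recv_a = lang_a
    send_b, recv_b = lang_b
    a_to_b = sum(recv_b[send_a[s]] == s for s in range(n))
    b_to_a = sum(recv_a[send_b[s]] == s for s in range(n))
    return (a_to_b + b_to_a) / (2 * n)


def fitness(pop, i, n):
    """Average payoff of player i against every other player."""
    others = [j for j in range(len(pop)) if j != i]
    return sum(payoff(pop[i], pop[j], n) for j in others) / len(others)


def random_language(n, rng):
    """Draw arbitrary (not necessarily invertible) sender and receiver maps."""
    send = tuple(rng.randrange(n) for _ in range(n))
    recv = tuple(rng.randrange(n) for _ in range(n))
    return (send, recv)


def step(pop, n, mutation_prob, rng):
    """Update one random player: mutate with low probability, else imitate a fittest player."""
    i = rng.randrange(len(pop))
    if rng.random() < mutation_prob:
        pop[i] = random_language(n, rng)
    else:
        fits = [fitness(pop, j, n) for j in range(len(pop))]
        best = max(fits)
        fittest = [j for j, f in enumerate(fits) if f == best]
        pop[i] = pop[rng.choice(fittest)]


if __name__ == "__main__":
    rng = random.Random(0)
    n = 3
    pop = [random_language(n, rng) for _ in range(10)]
    for _ in range(2000):
        step(pop, n, mutation_prob=0.01, rng=rng)
    print(sum(fitness(pop, i, n) for i in range(len(pop))) / len(pop))
```

With `mutation_prob` set to zero the process can only copy languages already present, so the mutation term is what lets the population escape inefficient conventions; the abstract's result concerns the limit where that probability is small but positive.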