10,486 research outputs found

Benchmarking Hybrid Algorithms for Distributed Constraint Optimisation Games

Authors: Chapman Archie; Jennings N. R.; Rogers Alex
Publication date: 01/05/2008

Game-theoretical control with continuous action sets

Authors: Leslie David S.; Mertikopoulos Panayotis; Perkins Steven
Publication date: 01/01/2014

Motivated by the recent applications of game-theoretical learning techniques to the design of distributed control systems, we study a class of control problems that can be formulated as potential games with continuous action sets, and we propose an actor-critic reinforcement learning algorithm that provably converges to equilibrium in this class of problems. The method employed is to analyse the learning process under study through a mean-field dynamical system that evolves in an infinite-dimensional function space (the space of probability distributions over the players' continuous controls). To do so, we extend the theory of finite-dimensional two-timescale stochastic approximation to an infinite-dimensional, Banach space setting, and we prove that the continuous dynamics of the process converge to equilibrium in the case of potential games. These results combine to give a provably-convergent learning algorithm in which players do not need to keep track of the controls selected by the other agents.

Comment: 19 page

arXiv.org e-Print Archive

Lancaster E-Prints

Learning Equilibria with Partial Information in Decentralized Wireless Networks

Authors: Debbah Mérouane; Lasaulce Samson; Perlaza Samir M.; Rose Luca
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/06/2011

In this article, a survey of several important equilibrium concepts for decentralized networks is presented. The term decentralized is used here to refer to scenarios where decisions (e.g., choosing a power allocation policy) are taken autonomously by devices interacting with each other (e.g., through mutual interference). The iterative long-term interaction is characterized by stable points of the wireless network called equilibria. The interest in these equilibria stems from the relevance of network stability and the fact that they can be achieved by letting radio devices interact repeatedly over time. To achieve these equilibria, several learning techniques, namely, the best response dynamics, fictitious play, smoothed fictitious play, reinforcement learning algorithms, and regret matching, are discussed in terms of information requirements and convergence properties. Most of the notions introduced here, for both equilibria and learning schemes, are illustrated by a simple case study, namely, an interference channel with two transmitter-receiver pairs.

Comment: 16 pages, 5 figures, 1 table. To appear in IEEE Communication Magazine, special Issue on Game Theor

arXiv.org e-Print Archive

Two More Classes of Games with the Fictitious Play Property

Author: Ulrich Berger

Fictitious play is the oldest and most studied learning process for games. Since the already classical result for zero-sum games, convergence of beliefs to the set of Nash equilibria has been established for some important classes of games, including weighted potential games, supermodular games with diminishing returns, and 3x3 supermodular games. Extending these results, we establish convergence for ordinal potential games and quasi-supermodular games with diminishing returns. As a by-product we obtain convergence for 3xm and 4x4 quasi-supermodular games.

Keywords: Fictitious Play, Learning Process, Ordinal Potential Games, Quasi-Supermodular Games
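
For readers unfamiliar with the process named in this abstract, the following is a minimal sketch of discrete-time fictitious play in a two-player matrix game. It is not taken from the paper: the function names, the uniform initial counts, the lowest-index tie-breaking rule, and the 2x2 coordination game are all assumptions chosen for illustration.

```python
def best_response(payoffs, belief):
    """Return the index of the action with the highest expected payoff.

    payoffs[own][other] is the payoff for playing `own` when the opponent
    plays `other`; `belief` is a probability vector over opponent actions.
    Ties are broken in favour of the lowest index (an arbitrary choice).
    """
    expected = [sum(u * p for u, p in zip(row, belief)) for row in payoffs]
    return max(range(len(expected)), key=lambda k: expected[k])


def fictitious_play(payoff_a, payoff_b, steps=2000):
    """Run discrete-time fictitious play and return both empirical frequencies.

    Both payoff tables are indexed [own action][opponent action].
    """
    # Assumption: start every count at one so the initial beliefs are uniform.
    counts_a = [1] * len(payoff_a)
    counts_b = [1] * len(payoff_b)
    for _ in range(steps):
        belief_about_a = [c / sum(counts_a) for c in counts_a]
        belief_about_b = [c / sum(counts_b) for c in counts_b]
        # Each player best-responds to the opponent's empirical frequencies.
        action_a = best_response(payoff_a, belief_about_b)
        action_b = best_response(payoff_b, belief_about_a)
        counts_a[action_a] += 1
        counts_b[action_b] += 1
    freq_a = [c / sum(counts_a) for c in counts_a]
    freq_b = [c / sum(counts_b) for c in counts_b]
    return freq_a, freq_b


# Hypothetical 2x2 coordination game with identical payoffs (a potential game).
coordination = [[2.0, 0.0], [0.0, 1.0]]
freq_a, freq_b = fictitious_play(coordination, coordination)
```

In this particular instance both players' empirical frequencies concentrate on the first action, which is a pure Nash equilibrium of the assumed game. The sketch illustrates the update rule only and says nothing about the convergence proofs in the paper.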

Research Papers in Economics

Reinforcement learning with restrictions on the action set

Authors: Bravo Mario; Faure Mathieu
Publication date: 12/06/2013

Consider a 2-player normal-form game repeated over time. We introduce an adaptive learning procedure, where the players only observe their own realized payoff at each stage. We assume that agents do not know their own payoff function, and have no information on the other player. Furthermore, we assume that they have restrictions on their own action set such that, at each stage, their choice is limited to a subset of their action set. We prove that the empirical distributions of play converge to the set of Nash equilibria for zero-sum and potential games, and games where one player has two actions.

Comment: 28 page

arXiv.org e-Print Archive

HAL AMU

HAL Descartes