
    Penalty-regulated dynamics and robust learning procedures in games

    Starting from a heuristic learning scheme for N-person games, we derive a new class of continuous-time learning dynamics consisting of a replicator-like drift adjusted by a penalty term that renders the boundary of the game's strategy space repelling. These penalty-regulated dynamics are equivalent to players keeping an exponentially discounted aggregate of their ongoing payoffs and then using a smooth best response to pick an action based on these performance scores. Owing to this inherent duality, the proposed dynamics satisfy a variant of the folk theorem of evolutionary game theory, and they converge to (arbitrarily precise) approximations of Nash equilibria in potential games. Motivated by applications to traffic engineering, we exploit this duality further to design a discrete-time, payoff-based learning algorithm which retains these convergence properties and only requires players to observe their in-game payoffs. Moreover, the algorithm remains robust in the presence of stochastic perturbations and observation errors, and it does not require any synchronization between players.
    Comment: 33 pages, 3 figures
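    To make the score/smooth-best-response duality concrete, here is a minimal single-player sketch in the exponential-weights style: scores aggregate importance-weighted realized payoffs with exponential discounting, and actions are drawn through a logit smooth best response. The payoff oracle, discount rate, temperature, and step schedule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def logit_choice(scores, temperature=0.1):
    """Smooth best response: Gibbs/logit map over the score vector."""
    z = scores / temperature
    z -= z.max()                      # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def penalty_regulated_run(payoff, n_actions, steps=5000, discount=0.05, seed=0):
    """Payoff-based learning sketch: discounted scores + logit choice."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n_actions)           # exponentially discounted performance scores
    for k in range(1, steps + 1):
        x = logit_choice(y)           # mixed strategy from current scores
        a = rng.choice(n_actions, p=x)
        u = payoff(a)                 # only the realized in-game payoff is observed
        step = 1.0 / k                # vanishing step size
        y *= 1.0 - step * discount    # exponential discounting of past scores
        y[a] += step * u / x[a]      # importance-weighted payoff estimate
    return logit_choice(y)
```

    Because the logit map keeps every action probability strictly positive, the induced trajectory stays away from the boundary of the simplex, which is the discrete-time shadow of the repelling penalty term described above.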

    Distributed stochastic optimization via matrix exponential learning

    In this paper, we investigate a distributed learning scheme for a broad class of stochastic optimization problems and games that arise in signal processing and wireless communications. The proposed algorithm relies on the method of matrix exponential learning (MXL) and only requires locally computable gradient observations that are possibly imperfect and/or obsolete. To analyze it, we introduce the notion of a stable Nash equilibrium and show that the algorithm is globally convergent to such equilibria (or locally convergent when an equilibrium is only locally stable). We also derive an explicit linear bound for the algorithm's convergence speed, which remains valid under measurement errors and uncertainty of arbitrarily high variance. To validate our theoretical analysis, we test the algorithm in realistic multi-carrier/multiple-antenna wireless scenarios where several users seek to maximize their energy efficiency. Our results show that learning allows users to attain a net increase in energy efficiency of between 100% and 500%, even under very high uncertainty.
    Comment: 31 pages, 3 figures
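    As a point of reference, one MXL iteration can be sketched as a dual gradient step followed by the matrix-exponential mirror map. The sketch below assumes a unit-trace spectrahedron {X ⪰ 0, tr X = 1} as the feasible set and a Hermitian (possibly noisy or outdated) gradient matrix V; these are illustrative assumptions, not necessarily the paper's exact constraint set.

```python
import numpy as np
from scipy.linalg import expm

def mxl_step(Y, V, step):
    """One matrix exponential learning step on the unit-trace spectrahedron.

    Y: Hermitian dual score matrix; V: (imperfect) gradient observation.
    """
    Y = Y + step * V        # aggregate gradient feedback in the dual space
    E = expm(Y)             # matrix exponential of the score matrix
    X = E / np.trace(E)     # normalize back onto {X >= 0, tr X = 1}
    return Y, X
```

    When all matrices are diagonal this reduces to the familiar exponential-weights update over the simplex, which is why MXL is often described as the matrix analogue of logit learning.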

    Game-theoretical control with continuous action sets

    Motivated by the recent applications of game-theoretical learning techniques to the design of distributed control systems, we study a class of control problems that can be formulated as potential games with continuous action sets, and we propose an actor-critic reinforcement learning algorithm that provably converges to equilibrium in this class of problems. The method employed is to analyse the learning process under study through a mean-field dynamical system that evolves in an infinite-dimensional function space (the space of probability distributions over the players' continuous controls). To do so, we extend the theory of finite-dimensional two-timescale stochastic approximation to an infinite-dimensional, Banach space setting, and we prove that the continuous dynamics of the process converge to equilibrium in the case of potential games. These results combine to give a provably convergent learning algorithm in which players do not need to keep track of the controls selected by the other agents.
    Comment: 19 pages
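    A finite-grid caricature of the two-timescale structure may help fix ideas: a critic tracks payoff estimates on a fast timescale, while an actor integrates those estimates into a score function on a slow timescale and samples actions through a logit map. The discretization of [0, 1], the step-size exponents, and the payoff oracle are assumptions made purely for illustration; the paper itself works directly in the infinite-dimensional space of distributions.

```python
import numpy as np

def actor_critic(payoff, bins=50, steps=5000, temperature=0.1, seed=0):
    """Two-timescale actor-critic sketch on a discretized action set [0, 1]."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, bins)
    q = np.zeros(bins)                       # critic: running payoff estimates
    y = np.zeros(bins)                       # actor: scores over actions
    for k in range(1, steps + 1):
        z = (y - y.max()) / temperature
        p = np.exp(z)
        p /= p.sum()                         # logit choice over the grid
        a = rng.choice(bins, p=p)
        u = payoff(grid[a])
        q[a] += k ** -0.6 * (u - q[a])       # fast (critic) timescale
        y += (1.0 / k) * q                   # slow (actor) timescale
    return grid, p
```

    The key design point is the separation of step sizes: the critic's steps (of order k^-0.6) vanish more slowly than the actor's (of order 1/k), so the critic effectively equilibrates between actor updates.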

    Inertial game dynamics and applications to constrained optimization

    Aiming to provide a new class of game dynamics with good long-term rationality properties, we derive a second-order inertial system that builds on the widely studied "heavy ball with friction" optimization method. By exploiting a well-known link between the replicator dynamics and the Shahshahani geometry on the space of mixed strategies, the dynamics are stated in a Riemannian geometric framework where trajectories are accelerated by the players' unilateral payoff gradients and slow down near Nash equilibria. Surprisingly (and in stark contrast to another second-order variant of the replicator dynamics), the inertial replicator dynamics are not well-posed; on the other hand, it is possible to obtain a well-posed system by endowing the mixed strategy space with a different Hessian-Riemannian (HR) metric structure, and we characterize those HR geometries that do so. In the single-agent version of the dynamics (corresponding to constrained optimization over simplex-like objects), we show that regular maximum points of smooth functions attract all nearby solution orbits with low initial speed. More generally, we establish an inertial variant of the so-called "folk theorem" of evolutionary game theory, and we show that strict equilibria are attracting in asymmetric (multi-population) games, provided of course that the dynamics are well-posed. A similar asymptotic stability result is obtained for evolutionarily stable strategies in symmetric (single-population) games.
    Comment: 30 pages, 4 figures; significantly revised paper structure and added new material on Euclidean embeddings and evolutionarily stable strategies
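    For orientation, the Euclidean prototype and a schematic of its game-theoretic analogue are written out below; the friction coefficient η and the covariant derivative D/dt under the chosen Hessian-Riemannian metric are generic notation assumed for illustration, not the paper's exact normalization.

```latex
% Heavy ball with friction (Euclidean prototype), maximizing f:
\[
  \ddot{x}(t) + \eta\,\dot{x}(t) = \nabla f\bigl(x(t)\bigr),
  \qquad \eta > 0.
\]
% Schematic inertial game analogue: covariant acceleration under a
% Hessian-Riemannian metric, driven by player i's unilateral payoff
% gradient (generic notation):
\[
  \frac{D\dot{x}_i}{dt} + \eta\,\dot{x}_i = \operatorname{grad}_{x_i} u_i(x).
\]
% The Shahshahani metric mentioned above is the Hessian metric
% $g_x(z, w) = \sum_{\alpha} z_\alpha w_\alpha / x_\alpha$ on the simplex.
```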

    Riemannian game dynamics

    We study a class of evolutionary game dynamics defined by balancing a gain determined by the game's payoffs against a cost of motion that captures the difficulty with which the population moves between states. Costs of motion are represented by a Riemannian metric, i.e., a state-dependent inner product on the set of population states. The replicator dynamics and the (Euclidean) projection dynamics are the archetypal examples of the class we study. Like these representative dynamics, all Riemannian game dynamics satisfy certain basic desiderata, including positive correlation and global convergence in potential games. Moreover, when the underlying Riemannian metric satisfies a Hessian integrability condition, the resulting dynamics preserve many further properties of the replicator and projection dynamics. We examine the close connections between Hessian game dynamics and reinforcement learning in normal form games, extending and elucidating a well-known link between the replicator dynamics and exponential reinforcement learning.
    Comment: 47 pages, 12 figures; added figures and further simplified the derivation of the dynamics
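    Schematically, the gain-versus-cost balance described above can be written as a maximization over feasible directions of motion; the notation below (payoff field v, metric g, tangent space T_x) is assumed for illustration rather than copied from the paper.

```latex
% Direction of motion balances payoff gain against a Riemannian cost:
\[
  \dot{x} \;=\; \operatorname*{arg\,max}_{z \in T_x \mathcal{X}}
  \Bigl\{ \langle v(x), z \rangle - \tfrac{1}{2}\, g_x(z, z) \Bigr\}.
\]
% Under the Shahshahani metric $g_x(z, z) = \sum_\alpha z_\alpha^2 / x_\alpha$
% this recovers the replicator dynamics,
\[
  \dot{x}_\alpha = x_\alpha \Bigl( v_\alpha(x)
    - \sum\nolimits_\beta x_\beta v_\beta(x) \Bigr),
\]
% while the Euclidean metric yields the projection dynamics.
```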

    Was It Something I Ate? Implementation of the FDA Seafood HACCP Program

    We use FDA’s seafood inspection records to examine: (i) how FDA has targeted its inspections under HACCP regulation; (ii) the effects of FDA inspections on compliance with both HACCP and plant sanitation standards; and (iii) the relationship between HACCP regulations and pre-existing sanitation standards. We use a theoretical model of enforcement to derive hypotheses about FDA’s targeting of inspections and firms’ patterns of compliance, and we test those hypotheses using econometric models of inspection and compliance. Contrary to the predictions of the theoretical model and to FDA’s own stated policies, FDA does not seem to have targeted inspections based on product risk or past compliance performance. Firms’ compliance strategies were broadly in accord with the predictions of the theoretical model: the threat of inspection increased the likelihood of compliance, although the deterrent effect was statistically significant for sanitation standards but not for HACCP, and firms tend to persist in their compliance status, especially with respect to sanitation standards. Contrary to FDA’s presupposition, however, HACCP compliance does not improve compliance with sanitation standards, suggesting that the two are not complementary.
    Keywords: HACCP, Food safety, Seafood, Enforcement, Regulatory compliance, Regulation