5 research outputs found

The asymptotic value in finite stochastic games

Author: Oliu-Barton Miquel
Publication date: 19/12/2012

We provide a direct, elementary proof for the existence of $\lim_{\lambda\to 0} v_{\lambda}$, where $v_{\lambda}$ is the value of a $\lambda$-discounted finite two-person zero-sum stochastic game.
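For orientation, the value $v_{\lambda}$ in the abstract above is usually characterized as the unique fixed point of the Shapley equation. The display below is a sketch in one common normalization; the symbols $K$ (state set), $I$, $J$ (action sets), $g$ (stage payoff) and $q$ (transition kernel) are notation assumed here, not taken from the listing.

```latex
% Shapley equation for the \lambda-discounted game (assumed notation):
% weight \lambda on the current stage payoff, 1-\lambda on the continuation value.
v_{\lambda}(k)
  \;=\; \operatorname{val}_{x \in \Delta(I),\, y \in \Delta(J)}
  \Bigl[\, \lambda\, g(k, x, y)
        + (1-\lambda) \sum_{k' \in K} q(k' \mid k, x, y)\, v_{\lambda}(k') \Bigr],
  \qquad k \in K .
```

Under this normalization the asymptotic value is the limit of $v_{\lambda}$ as $\lambda \to 0$, that is, as the players become infinitely patient.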

arXiv.org e-Print Archive

General limit value in zero-sum stochastic games

Author: Ziliotto Bruno
Publication date: 11/11/2015

Bewley and Kohlberg (1976) and Mertens and Neyman (1981) have proved, respectively, the existence of the asymptotic value and the uniform value in zero-sum stochastic games with finite state space and finite action sets. In their work, the total payoff in a stochastic game is defined either as a Cesàro mean or an Abel mean of the stage payoffs. This paper presents two findings: first, we generalize the result of Bewley and Kohlberg to a more general class of payoff evaluations and we prove with a counterexample that this result is tight. We also investigate the particular case of absorbing games. Second, for the uniform approach of Mertens and Neyman, we provide another counterexample to demonstrate that there is no natural way to generalize the result of Mertens and Neyman to a wider class of payoff evaluations.
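The two payoff evaluations named in this abstract can be illustrated on a concrete payoff stream. The following is a minimal sketch, not code from the paper: the function names `cesaro_mean` and `abel_mean`, the parameter `lam`, and the normalization (weight `lam` on the first stage) are assumptions made for illustration, and the Abel mean is truncated to the stages supplied.

```python
from typing import Sequence


def cesaro_mean(payoffs: Sequence[float]) -> float:
    """Cesàro mean: plain average of the first n stage payoffs."""
    if not payoffs:
        raise ValueError("need at least one stage payoff")
    return sum(payoffs) / len(payoffs)


def abel_mean(payoffs: Sequence[float], lam: float) -> float:
    """Abel mean: lam * sum over stages m >= 1 of (1 - lam)**(m - 1) * g_m.

    The infinite sum is truncated to the stages given in `payoffs`.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    # enumerate starts at 0, which matches the exponent m - 1 for stage m.
    return lam * sum((1.0 - lam) ** m * g for m, g in enumerate(payoffs))


if __name__ == "__main__":
    # Hypothetical payoff stream alternating 1, 0, 1, 0, ...
    stream = [1.0, 0.0] * 500
    print("Cesàro mean:", cesaro_mean(stream))
    for lam in (0.5, 0.1, 0.01):
        print(f"Abel mean (lam={lam}):", abel_mean(stream, lam))
```

For the alternating stream the untruncated Abel mean equals 1/(2 - lam), so it approaches the Cesàro limit 1/2 as `lam` tends to 0. This is the simplest instance in which the two evaluations agree asymptotically.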

arXiv.org e-Print Archive

Best-response Dynamics in Zero-sum Stochastic Games

Author: Leslie David; Perkins Steven; Xu Zibo
Publication venue: Elsevier BV
Publication date: 01/09/2020

We define and analyse three learning dynamics for two-player zero-sum discounted-payoff stochastic games. A continuous-time best-response dynamic in mixed strategies is proved to converge to the set of Nash equilibrium stationary strategies. Extending this, we introduce a fictitious-play-like process in a continuous-time embedding of a stochastic zero-sum game, which is again shown to converge to the set of Nash equilibrium strategies. Finally, we present a modified δ-converging best-response dynamic, in which the discount rate converges to 1, and the learned value converges to the asymptotic value of the zero-sum stochastic game. The critical feature of all the dynamic processes is a separation of adaptation rates: beliefs about the value of states adapt more slowly than the strategies adapt, and in the case of the δ-converging dynamic the discount rate adapts more slowly than everything else.

Lancaster E-Prints

Advances in Zero-Sum Dynamic Games

Author: Agarwal; As Soulaimani; Assoulamani; Aumann; Bardi; Bewley; Bich; Blackwell; Bolte; Brézis; Buckdahn; Cardaliaguet; Coulomb; Crandall; De Meyer; Dynkin; Evans; Everett; Forges; Friedman; Gale; Gensbittel; Grün; Guo; Heuer; Hörner; Kohlberg; Kohn; Krasovskii; Krausz; Laraki; Lehrer; Lepeltier; Levy; Maitra; Martin; Mertens; Milman; Monderer; Neveu; Neyman; Oliu-Barton; Perchet; Quincampoix; Radzik; Renault; Rosenberg; Scarf; Shapley; Shmaya; Solan; Sorin; Souganidis; Souquieré; Spinat; van den Dries; Vieille; Vigeral; Zachrisson; Zamir
Publication venue: Elsevier BV
Publication date: 01/01/2015

The survey presents recent results in the theory of two-person zero-sum repeated games and their connections with differential and continuous-time games. The emphasis is on the following points:
(1) A general model makes it possible to deal simultaneously with stochastic and informational aspects.
(2) All evaluations of the stage payoffs can be covered in the same framework (and not only the usual Cesàro and Abel means).
(3) The model in discrete time can be seen and analyzed as a discretization of a continuous-time game. Moreover, tools and ideas from repeated games are very fruitful for continuous-time games and vice versa.
(4) Numerous important conjectures have been answered (some in the negative).
(5) New tools and original models have been proposed.
As a consequence, the field (discrete versus continuous time, stochastic versus incomplete information models) has a much more unified structure, and research is extremely active.

University of Liverpool Repository

Base de publications de l'université Paris-Dauphine

Crossref

Hal-Diderot

HAL-Polytechnique