37,725 research outputs found

Definable Zero-Sum Stochastic Games

Author: Bolte Jérôme
Gaubert Stéphane
Vigeral Guillaume
Publication venue: 'Institute for Operations Research and the Management Sciences (INFORMS)'
Publication date: 01/01/2015

Definable zero-sum stochastic games involve a finite number of states and action sets, reward and transition functions that are definable in an o-minimal structure. Prominent examples of such games are finite, semi-algebraic or globally subanalytic stochastic games. We prove that the Shapley operator of any definable stochastic game with separable transition and reward functions is definable in the same structure. Definability in the same structure does not hold systematically: we provide a counterexample of a stochastic game with semi-algebraic data yielding a non semi-algebraic but globally subanalytic Shapley operator. Our definability results on Shapley operators are used to prove that any separable definable game has a uniform value; in the case of polynomially bounded structures we also provide convergence rates. Using an approximation procedure, we actually establish that general zero-sum games with separable definable transition functions have a uniform value. These results highlight the key role played by the tame structure of transition functions. As particular cases of our main results, we obtain that stochastic games with polynomial transitions, definable games with finite actions on one side, definable games with perfect information or switching controls have a uniform value. Applications to nonlinear maps arising in risk sensitive control and Perron-Frobenius theory are also given.

Base de publications de l'université Paris-Dauphine

INRIA a CCSD electronic archive server

Toulouse Capitole Publications

Toulouse 1 Capitole Publications

HAL-Polytechnique
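
For readers unfamiliar with the term, the Shapley operator named in the abstract above can be sketched as follows. The notation is an assumption chosen for illustration (states $\omega_1,\dots,\omega_d$, action sets $X$ and $Y$, reward $g$, transition $\rho$) and is not taken from the listing itself; it follows the standard textbook form of the operator.

```latex
% Shapley operator \Psi : \mathbb{R}^d \to \mathbb{R}^d of a zero-sum stochastic game
% (illustrative notation; \Delta(\cdot) denotes the set of probability measures)
\[
  \Psi_k(f) \;=\; \max_{\mu \in \Delta(X)} \; \min_{\nu \in \Delta(Y)}
  \int_X \int_Y \Big[\, g(x, y, \omega_k)
    + \sum_{i=1}^{d} \rho(\omega_i \mid x, y, \omega_k)\, f_i \Big]
  \, d\nu(y)\, d\mu(x),
  \qquad k = 1, \dots, d .
\]
% The n-stage value and the \lambda-discounted value are recovered from \Psi by
\[
  v_n \;=\; \frac{1}{n}\, \Psi^{n}(0),
  \qquad
  v_\lambda \;=\; \lambda\, \Psi\!\Big( \tfrac{1-\lambda}{\lambda}\, v_\lambda \Big).
\]
```

In this reading, a uniform value concerns the behaviour of the game as the horizon grows, which is why regularity (definability) of $\Psi$ matters for the results summarised above.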

A class of nonzero-sum investment and reinsurance games subject to systematic risks

Author: Siu CC
Yam SCP
Yang H
Zhao H
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2017

© 2016 Informa UK Limited, trading as Taylor & Francis Group. Recently, there have been numerous insightful applications of zero-sum stochastic differential games in insurance, as discussed in Liu et al. [Liu, J., Yiu, C. K.-F. & Siu, T. K. (2014). Optimal investment of an insurer with regime-switching and risk constraint. Scandinavian Actuarial Journal 2014(7), 583–601]. While there could be some practical situations in which a nonzero-sum game approach is more appropriate, the development of such an approach within actuarial contexts remains rare in the existing literature. In this article, we study a class of nonzero-sum reinsurance-investment stochastic differential games between two competitive insurers subject to systematic risks described by a general compound Poisson risk model. Each insurer can purchase excess-of-loss reinsurance to mitigate both systematic and idiosyncratic jump risks of the inter-arrival claims, and can invest in one risk-free asset and one risky asset whose price dynamics follow the Heston stochastic volatility model [Heston, S. L. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency options. Review of Financial Studies 6, 327–343]. The main objective of each insurer is to maximize the expected utility of his terminal surplus relative to that of his competitor. The dynamic programming principle for this class of nonzero-sum game problems leads to a non-canonical fixed-point problem of coupled non-linear integral-typed equations. Despite the complex structure, we establish the existence and uniqueness of the Nash equilibrium reinsurance-investment strategies and the corresponding value functions of the insurers in a representative example of constant absolute risk aversion insurers under a mild, time-independent condition. Furthermore, the Nash equilibrium strategies and value functions admit closed forms. Numerical studies are also provided to illustrate the impact of the systematic risks on the Nash equilibrium strategies. Finally, we connect our results to those under the diffusion-approximated model by proving explicitly that the Nash equilibrium under the diffusion-approximated model is an (Formula presented.)-Nash equilibrium under the general Poisson risk model, thereby establishing that the analogous Nash equilibrium in Bensoussan et al. [Bensoussan, A., Siu, C. C., Yam, S. C. P. & Yang, H. (2014). A class of nonzero-sum stochastic differential investment and reinsurance games. Automatica 50(8), 2025–2037] serves as an interesting complementary case of the present framework.

OPUS - University of Technology Sydney

HKU Scholars Hub

Approximating the Termination Value of One-Counter MDPs and Stochastic Games

Author: G.R. Grimmett
J. Lambert
K. Etessami
L.B. White
M.L. Puterman
T. Brázdil
Publication date: 01/01/2011

One-counter MDPs (OC-MDPs) and one-counter simple stochastic games (OC-SSGs) are 1-player, and 2-player turn-based zero-sum, stochastic games played on the transition graph of classic one-counter automata (equivalently, pushdown automata with a 1-letter stack alphabet). A key objective for the analysis and verification of these games is the termination objective, where the players aim to maximize (minimize, respectively) the probability of hitting counter value 0, starting at a given control state and given counter value. Recently, we studied qualitative decision problems ("is the optimal termination value = 1?") for OC-MDPs (and OC-SSGs) and showed them to be decidable in P-time (in NP and coNP, respectively). However, quantitative decision and approximation problems ("is the optimal termination value ≥ p", or "approximate the termination value within epsilon") are far more challenging. This is so in part because optimal strategies may not exist, and because even when they do exist they can have a highly non-trivial structure. It thus remained open even whether any of these quantitative termination problems are computable. In this paper we show that all quantitative approximation problems for the termination value for OC-MDPs and OC-SSGs are computable. Specifically, given an OC-SSG, and given epsilon > 0, we can compute a value v that approximates the value of the OC-SSG termination game within additive error epsilon, and furthermore we can compute epsilon-optimal strategies for both players in the game. A key ingredient in our proofs is a subtle martingale, derived from solving certain LPs that we can associate with a maximizing OC-MDP. An application of Azuma's inequality on these martingales yields a computable bound for the "wealth" at which a "rich person's strategy" becomes epsilon-optimal for OC-MDPs. Comment: 35 pages, 1 figure, full version of a paper presented at ICALP 2011, invited for submission to Information and Computation.

arXiv.org e-Print Archive

Crossref

Edinburgh Research Explorer

Structure in the Value Function of Two-Player Zero-Sum Games of Incomplete Information

Author: Oliehoek Frans A.
Roijers Diederik M.
Wiggers Auke J.
Publication date: 01/01/2016

Zero-sum stochastic games provide a rich model for competitive decision making. However, under general forms of state uncertainty as considered in the Partially Observable Stochastic Game (POSG), such decision making problems are still not very well understood. This paper makes a contribution to the theory of zero-sum POSGs by characterizing structure in their value function. In particular, we introduce a new formulation of the value function for zs-POSGs as a function of the "plan-time sufficient statistics" (roughly speaking, the information distribution in the POSG), which has the potential to enable generalization over such information distributions. We further delineate this generalization capability by proving a structural result on the shape of the value function: it exhibits concavity and convexity with respect to appropriately chosen marginals of the statistic space. This result is a key precursor for developing solution methods that may be able to exploit such structure. Finally, we show how these results allow us to reduce a zs-POSG to a "centralized" model with shared observations, thereby transferring results for the latter, narrower class, to games with individual (private) observations.

arXiv.org e-Print Archive

University of Liverpool Repository

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Equilibria-based Probabilistic Model Checking for Concurrent Stochastic Games

Author: A Bianco
A Toumi
C Dehnert
C Lemke
D Fernando
D Lozovanu
E Kelmendi
H Hansson
J Gutierrez
J Kemeny
J Pacheco
J von Neumann
K Chatterjee
L de Alfaro
L de Moura
L Shapley
M Kwiatkowska
M Osborne
N Basset
N Nisan
P Čermák
R Alur
R Brenguier
S Haddad
T Chen
U Schwalbe
Publication date: 01/01/2019

Probabilistic model checking for stochastic games enables formal verification of systems that comprise competing or collaborating entities operating in a stochastic environment. Despite good progress in the area, existing approaches focus on zero-sum goals and cannot reason about scenarios where entities are endowed with different objectives. In this paper, we propose probabilistic model checking techniques for concurrent stochastic games based on Nash equilibria. We extend the temporal logic rPATL (probabilistic alternating-time temporal logic with rewards) to allow reasoning about players with distinct quantitative goals, which capture either the probability of an event occurring or a reward measure. We present algorithms to synthesise strategies that are subgame perfect social welfare optimal Nash equilibria, i.e., where there is no incentive for any player to unilaterally change their strategy in any state of the game, whilst the combined probabilities or rewards are maximised. We implement our techniques in the PRISM-games tool and apply them to several case studies, including network protocols and robot navigation, showing the benefits compared to existing approaches.

arXiv.org e-Print Archive

Crossref

University of Birmingham Research Portal

Oxford University Research Archive

Enlighten
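
To illustrate the notion of a social welfare optimal Nash equilibrium mentioned in the abstract above, here is a minimal sketch for a one-shot two-player matrix game. It is an assumption-laden simplification: it enumerates pure strategies only, it is not the algorithm implemented in PRISM-games, and the function names and payoff matrices are hypothetical.

```python
from itertools import product


def pure_nash_equilibria(payoff_a, payoff_b):
    """Return all pure-strategy Nash equilibria (row, col) of a bimatrix game.

    payoff_a[i][j] / payoff_b[i][j] are the payoffs of the row / column player
    when the row player picks i and the column player picks j.
    """
    rows, cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for i, j in product(range(rows), range(cols)):
        # Row player cannot gain by deviating to another row.
        row_best = all(payoff_a[i][j] >= payoff_a[k][j] for k in range(rows))
        # Column player cannot gain by deviating to another column.
        col_best = all(payoff_b[i][j] >= payoff_b[i][l] for l in range(cols))
        if row_best and col_best:
            equilibria.append((i, j))
    return equilibria


def social_welfare_optimal(payoff_a, payoff_b):
    """Pick the pure equilibrium maximising the sum of payoffs, or None."""
    equilibria = pure_nash_equilibria(payoff_a, payoff_b)
    if not equilibria:
        return None
    return max(equilibria, key=lambda e: payoff_a[e[0]][e[1]] + payoff_b[e[0]][e[1]])


# Hypothetical coordination game: both players prefer to match, (0, 0) pays more.
COORD_A = [[2, 0], [0, 1]]
COORD_B = [[2, 0], [0, 1]]

# Matching pennies (zero-sum): no pure-strategy equilibrium exists.
PENNIES_A = [[1, -1], [-1, 1]]
PENNIES_B = [[-1, 1], [1, -1]]
```

Calling `social_welfare_optimal(COORD_A, COORD_B)` selects the equilibrium with the larger combined payoff among the two pure equilibria; for matching pennies the function returns `None`, since only a mixed equilibrium exists there.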