Definable Zero-Sum Stochastic Games
Definable zero-sum stochastic games involve a finite number of states and action sets, reward and transition functions that are definable in an o-minimal structure. Prominent examples of such games are finite, semi-algebraic or globally subanalytic stochastic games. We prove that the Shapley operator of any definable stochastic game with separable transition and reward functions is definable in the same structure. Definability in the same structure does not hold systematically: we provide a counterexample of a stochastic game with semi-algebraic data yielding a non semi-algebraic but globally subanalytic Shapley operator. Our definability results on Shapley operators are used to prove that any separable definable game has a uniform value; in the case of polynomially bounded structures we also provide convergence rates. Using an approximation procedure, we actually establish that general zero-sum games with separable definable transition functions have a uniform value. These results highlight the key role played by the tame structure of transition functions. As particular cases of our main results, we obtain that stochastic games with polynomial transitions, definable games with finite actions on one side, definable games with perfect information or switching controls have a uniform value. Applications to nonlinear maps arising in risk sensitive control and Perron-Frobenius theory are also given.
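For orientation, the Shapley operator discussed in this abstract is commonly written in the following form. This is a sketch in standard notation, not the paper's own statement: the symbols S (finite state set), A and B (action sets), Δ(·) (probability distributions over a set), r (reward) and p (transition probability) are assumptions introduced here.

```latex
\Psi(f)(s) \;=\; \max_{x \in \Delta(A)} \, \min_{y \in \Delta(B)}
  \Big[\, r(s, x, y) \;+\; \sum_{s' \in S} p(s' \mid s, x, y)\, f(s') \Big],
  \qquad f \in \mathbb{R}^{S}.
```

Under this reading, the definability question is whether the map f ↦ Ψ(f) stays in the same o-minimal structure as r and p.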
A class of nonzero-sum investment and reinsurance games subject to systematic risks
© 2016 Informa UK Limited, trading as Taylor & Francis Group. Recently, there have been numerous insightful applications of zero-sum stochastic differential games in insurance, as discussed in Liu et al. [Liu, J., Yiu, C. K.-F. & Siu, T. K. (2014). Optimal investment of an insurer with regime-switching and risk constraint. Scandinavian Actuarial Journal 2014(7), 583–601]. While there could be some practical situations in which a nonzero-sum game approach is more appropriate, the development of such an approach within actuarial contexts remains rare in the existing literature. In this article, we study a class of nonzero-sum reinsurance-investment stochastic differential games between two competitive insurers subject to systematic risks described by a general compound Poisson risk model. Each insurer can purchase excess-of-loss reinsurance to mitigate both systematic and idiosyncratic jump risks of the inter-arrival claims, and can invest in one risk-free asset and one risky asset whose price dynamics follow the famous Heston stochastic volatility model [Heston, S. L. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency options. Review of Financial Studies 6, 327–343]. The main objective of each insurer is to maximize the expected utility of his terminal surplus relative to that of his competitor. The dynamic programming principle for this class of nonzero-sum game problems leads to a non-canonical fixed-point problem of coupled non-linear integral-type equations. Despite the complex structure, we establish the existence and uniqueness of the Nash equilibrium reinsurance-investment strategies and the corresponding value functions of the insurers in a representative example of constant absolute risk aversion insurers under a mild, time-independent condition. Furthermore, the Nash equilibrium strategies and value functions admit closed forms.
Numerical studies are also provided to illustrate the impact of the systematic risks on the Nash equilibrium strategies. Finally, we connect our results to those under the diffusion-approximated model by proving explicitly that the Nash equilibrium under the diffusion-approximated model is an ε-Nash equilibrium under the general Poisson risk model, thereby establishing that the analogous Nash equilibrium in Bensoussan et al. [Bensoussan, A., Siu, C. C., Yam, S. C. P. & Yang, H. (2014). A class of nonzero-sum stochastic differential investment and reinsurance games. Automatica 50(8), 2025–2037] serves as an interesting complementary case of the present framework.
Approximating the Termination Value of One-Counter MDPs and Stochastic Games
One-counter MDPs (OC-MDPs) and one-counter simple stochastic games (OC-SSGs)
are 1-player, and 2-player turn-based zero-sum, stochastic games played on the
transition graph of classic one-counter automata (equivalently, pushdown
automata with a 1-letter stack alphabet). A key objective for the analysis and
verification of these games is the termination objective, where the players aim
to maximize (minimize, respectively) the probability of hitting counter value
0, starting at a given control state and given counter value. Recently, we
studied qualitative decision problems ("is the optimal termination value = 1?")
for OC-MDPs (and OC-SSGs) and showed them to be decidable in P-time (in NP and
coNP, respectively). However, quantitative decision and approximation problems
("is the optimal termination value >= p", or "approximate the termination value
within epsilon") are far more challenging. This is so in part because optimal
strategies may not exist, and because even when they do exist they can have a
highly non-trivial structure. It thus remained open even whether any of these
quantitative termination problems are computable. In this paper we show that
all quantitative approximation problems for the termination value for OC-MDPs
and OC-SSGs are computable. Specifically, given an OC-SSG, and given epsilon >
0, we can compute a value v that approximates the value of the OC-SSG
termination game within additive error epsilon, and furthermore we can compute
epsilon-optimal strategies for both players in the game. A key ingredient in
our proofs is a subtle martingale, derived from solving certain LPs that we can
associate with a maximizing OC-MDP. An application of Azuma's inequality on
these martingales yields a computable bound for the "wealth" at which a "rich
person's strategy" becomes epsilon-optimal for OC-MDPs.
Comment: 35 pages, 1 figure, full version of a paper presented at ICALP 2011,
invited for submission to Information and Computation.
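To make the termination objective concrete, here is a minimal sketch of value iteration on a toy one-counter MDP. It is not the LP and martingale construction described in the abstract. The model is hypothetical: a single control state, where each action moves the counter up by 1 with a given probability and down by 1 otherwise. The names `termination_values`, `up_probs`, `cap` and `sweeps` are assumptions introduced for illustration. Because the counter is truncated at `cap`, the result is only a lower bound on the true termination value.

```python
def termination_values(up_probs, cap=60, sweeps=5000):
    """Approximate the maximal termination probabilities of a toy one-counter MDP.

    Hypothetical model: one control state; action a increments the counter
    with probability up_probs[a] and decrements it otherwise. Counter values
    >= cap are treated as never terminating, so the returned values are
    lower bounds on the true termination values.
    """
    v = [0.0] * (cap + 1)
    v[0] = 1.0  # counter value 0 means the run has terminated
    for _ in range(sweeps):
        new = v[:]
        for n in range(1, cap):
            # The maximizer picks the action with the best one-step lookahead.
            new[n] = max(p * v[n + 1] + (1 - p) * v[n - 1] for p in up_probs)
        v = new
    return v


# Two actions: drift upward with probability 0.6 or with probability 0.7.
values = termination_values([0.6, 0.7])
print(values[1], values[2])
```

In this toy instance the maximizer always prefers the action with the smaller upward probability, so the values can be checked against the gambler's-ruin formula (0.4/0.6)^n for counter value n.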
Structure in the Value Function of Two-Player Zero-Sum Games of Incomplete Information
Zero-sum stochastic games provide a rich model for competitive decision
making. However, under general forms of state uncertainty as considered in the
Partially Observable Stochastic Game (POSG), such decision making problems are
still not very well understood. This paper makes a contribution to the theory
of zero-sum POSGs by characterizing structure in their value function. In
particular, we introduce a new formulation of the value function for zs-POSGs
as a function of the "plan-time sufficient statistics" (roughly speaking the
information distribution in the POSG), which has the potential to enable
generalization over such information distributions. We further delineate this
generalization capability by proving a structural result on the shape of value
function: it exhibits concavity and convexity with respect to appropriately
chosen marginals of the statistic space. This result is a key precursor for
developing solution methods that may be able to exploit such structure.
Finally, we show how these results allow us to reduce a zs-POSG to a
"centralized" model with shared observations, thereby transferring results for
the latter, narrower class, to games with individual (private) observations.
Equilibria-based Probabilistic Model Checking for Concurrent Stochastic Games
Probabilistic model checking for stochastic games enables formal verification
of systems that comprise competing or collaborating entities operating in a
stochastic environment. Despite good progress in the area, existing approaches
focus on zero-sum goals and cannot reason about scenarios where entities are
endowed with different objectives. In this paper, we propose probabilistic
model checking techniques for concurrent stochastic games based on Nash
equilibria. We extend the temporal logic rPATL (probabilistic alternating-time
temporal logic with rewards) to allow reasoning about players with distinct
quantitative goals, which capture either the probability of an event occurring
or a reward measure. We present algorithms to synthesise strategies that are
subgame perfect social welfare optimal Nash equilibria, i.e., where there is no
incentive for any player to unilaterally change their strategy in any state of
the game, whilst the combined probabilities or rewards are maximised. We
implement our techniques in the PRISM-games tool and apply them to several case
studies, including network protocols and robot navigation, showing the benefits
compared to existing approaches.
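The notion of a social welfare optimal Nash equilibrium can be illustrated on a one-shot, two-player matrix game restricted to pure strategies. This is only a sketch of the equilibrium concept, not the algorithm implemented in PRISM-games. The function names and the payoff matrices (a stag-hunt style game) are assumptions introduced here.

```python
def pure_nash_equilibria(payoff1, payoff2):
    """Return all pure-strategy Nash equilibria (row, col) of a bimatrix game."""
    rows, cols = len(payoff1), len(payoff1[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            # Player 1 cannot gain by switching rows while the column stays j.
            row_ok = all(payoff1[i][j] >= payoff1[k][j] for k in range(rows))
            # Player 2 cannot gain by switching columns while the row stays i.
            col_ok = all(payoff2[i][j] >= payoff2[i][l] for l in range(cols))
            if row_ok and col_ok:
                equilibria.append((i, j))
    return equilibria


def social_welfare_optimal(payoff1, payoff2):
    """Pick a pure Nash equilibrium maximizing the sum of both payoffs, if any."""
    equilibria = pure_nash_equilibria(payoff1, payoff2)
    if not equilibria:
        return None  # no pure equilibrium; mixed strategies are out of scope here
    return max(equilibria, key=lambda e: payoff1[e[0]][e[1]] + payoff2[e[0]][e[1]])


# Hypothetical stag-hunt payoffs: action 0 = cooperate, action 1 = go alone.
p1 = [[4, 0], [3, 2]]
p2 = [[4, 3], [0, 2]]
equilibria = pure_nash_equilibria(p1, p2)
best = social_welfare_optimal(p1, p2)
print(equilibria, best)
```

Both (0, 0) and (1, 1) are equilibria here; the social welfare criterion selects the one with the larger combined payoff.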