An Accretive Operator Approach to Ergodic Problems for Zero-Sum Games
Mean payoff stochastic games can be studied by means of a nonlinear spectral
problem involving the Shapley operator: the ergodic equation. A solution
consists of a scalar, called the ergodic constant, and a vector, called the bias.
The existence of such a pair entails that the mean payoff per time unit is
equal to the ergodic constant for any initial state, and the bias gives
stationary strategies. By exploiting two fundamental properties of Shapley
operators, monotonicity and additive homogeneity, we give a necessary and
sufficient condition for the solvability of the ergodic equation for all the
Shapley operators obtained by perturbation of the transition payments of a
given stochastic game with finite state space. If the latter condition is
satisfied, we establish that the bias is unique (up to an additive constant)
for a generic perturbation of the transition payments. To show these results,
we use the theory of accretive operators, and prove in particular a
surjectivity condition.
Comment: 4 pages, 1 figure, to appear in Proc. 22nd International Symposium on
Mathematical Theory of Networks and Systems (MTNS 2016)
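For orientation, the ergodic equation described in this abstract can be written out as follows. This is a sketch in standard notation that the listing itself does not fix: the symbols $T$ (Shapley operator on $\mathbb{R}^n$), $\lambda$ (ergodic constant), $u$ (bias) and $e$ (all-ones vector) are assumptions introduced here.

```latex
% Ergodic equation: find a scalar \lambda and a vector u such that
T(u) = \lambda e + u, \qquad \lambda \in \mathbb{R},\ u \in \mathbb{R}^n .

% The two properties of Shapley operators named in the abstract
x \le y \implies T(x) \le T(y)
  \quad \text{(monotonicity)}, \qquad
T(x + \alpha e) = T(x) + \alpha e \ \ (\alpha \in \mathbb{R})
  \quad \text{(additive homogeneity)} .

% Consequence of solvability: the mean payoff is the same for every initial state
\lim_{k \to \infty} \frac{T^k(x)}{k} = \lambda e
  \qquad \text{for every } x \in \mathbb{R}^n .
```

The last line follows because a monotone, additively homogeneous map is nonexpansive in the sup norm, so $T^k(x)$ stays within a bounded distance of $T^k(u) = k\lambda e + u$.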
Hypergraph conditions for the solvability of the ergodic equation for zero-sum games
The ergodic equation is a basic tool in the study of mean-payoff stochastic
games. Its solvability entails that the mean payoff is independent of the
initial state. Moreover, optimal stationary strategies are readily obtained
from its solution. In this paper, we give a general sufficient condition for
the solvability of the ergodic equation, for a game with finite state space but
arbitrary action spaces. This condition involves a pair of directed hypergraphs
depending only on the ``growth at infinity'' of the Shapley operator of the
game. This refines a recent result of the authors which only applied to games
with bounded payments, as well as earlier nonlinear fixed point results for
order-preserving maps, involving graph conditions.
Comment: 6 pages, 1 figure, to appear in Proc. 54th IEEE Conference on
Decision and Control (CDC 2015)
Minimax representation of nonexpansive functions and application to zero-sum recursive games
We show that a real-valued function on a topological vector space is
positively homogeneous of degree one and nonexpansive with respect to a weak
Minkowski norm if and only if it can be written as a minimax of linear forms
that are nonexpansive with respect to the same norm. We derive a representation
of monotone, additively and positively homogeneous functions on $L^\infty$
spaces and on $\mathbb{R}^n$, which extends results of Kolokoltsov, Rubinov,
Singer, and others. We apply this representation to nonconvex risk measures and
to zero-sum games. In particular, we derive representation and polyhedral
approximation results for the class of Shapley operators arising from games
without instantaneous payments (Everett's recursive games).
Théorie de Perron-Frobenius non-linéaire et jeux stochastiques à somme nulle avec paiement moyen
Zero-sum stochastic games have a recursive structure encompassed in their dynamic programming operator, the so-called Shapley operator. The latter is a useful tool to study the asymptotic behavior of the average payoff per time unit. In particular, the mean payoff exists and is independent of the initial state as soon as the ergodic equation - a nonlinear eigenvalue equation involving the Shapley operator - has a solution. The solvability of the latter equation in finite dimension is a central question in nonlinear Perron-Frobenius theory, and the main focus of the present thesis. Several known classes of Shapley operators can be characterized by properties based entirely on the order structure or the metric structure of the space. We first extend this characterization to "payment-free" Shapley operators, that is, operators arising from games without stage payments. This is derived from a general minimax formula for functions that are homogeneous of degree one and nonexpansive with respect to a given weak Minkowski norm. Next, we address the problem of the solvability of the ergodic equation for all additive perturbations of the payment function. This problem extends the notion of ergodicity for finite Markov chains. When the payment function is bounded, this "ergodicity" property is characterized by the uniqueness, up to an additive constant, of the fixed point of a payment-free Shapley operator. We give a combinatorial solution in terms of hypergraphs to this problem, as well as to other related problems of fixed-point existence, and we infer complexity results. Then, we use the theory of accretive operators to generalize the hypergraph condition to all Shapley operators, including ones for which the payment function is not bounded. Finally, we consider the problem of uniqueness, up to an additive constant, of the nonlinear eigenvector. We first show that uniqueness holds for a generic additive perturbation of the payments.
Then, in the framework of perfect information and finite action spaces, we provide an additional geometric description of the perturbations for which uniqueness occurs. As an application, we obtain a perturbation scheme allowing one to solve degenerate instances of stochastic games by policy iteration.
Une approche opérateur accrétif pour les jeux stochastiques avec critère ergodique
A game theory approach to the existence and uniqueness of nonlinear Perron-Frobenius eigenvectors
We establish a generalized Perron-Frobenius theorem, based on a combinatorial criterion which entails the existence of an eigenvector for any nonlinear order-preserving and positively homogeneous map $f$ acting on the open orthant $\mathbb{R}_{>0}^n$. This criterion involves dominions, i.e., sets of states that can be made invariant by one player in a two-person game that only depends on the behavior of $f$ "at infinity". In this way, we characterize the situation in which for all $\alpha, \beta > 0$, the "slice space" $\mathcal{S}_\alpha^\beta(f) := \{ x \in \mathbb{R}_{>0}^n \mid \alpha x \leq f(x) \leq \beta x \}$ is bounded in Hilbert's projective metric, or, equivalently, for all uniform perturbations $g$ of $f$, all the orbits of $g$ are bounded in Hilbert's projective metric. This solves a problem raised by Gaubert and Gunawardena (Trans. AMS, 2004). We also show that the uniqueness of an eigenvector is characterized by a dominion condition, involving a different game depending now on the local behavior of $f$ near an eigenvector. We show that the dominion conditions can be verified by directed hypergraph methods. We finally illustrate these results by considering specific classes of nonlinear maps, including Shapley operators, generalized means and nonnegative tensors.
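As a reminder of the objects this abstract refers to, the eigenproblem and Hilbert's projective metric can be stated as below. The notation ($f$ for the map, $u$ for the eigenvector, $\lambda$ for the eigenvalue, $d_H$ for the metric) is an assumption made here for illustration; the listing does not fix it.

```latex
% Nonlinear eigenproblem on the open orthant
f(u) = \lambda u, \qquad u \in \mathbb{R}_{>0}^n,\ \lambda > 0 .

% Hilbert's projective metric on the open orthant
d_H(x, y) = \log \max_{1 \le i, j \le n} \frac{x_i \, y_j}{y_i \, x_j},
  \qquad x, y \in \mathbb{R}_{>0}^n .

% d_H vanishes exactly on rays, and order-preserving, positively
% homogeneous maps are nonexpansive with respect to it
d_H(x, y) = 0 \iff x = c\, y \ \text{for some } c > 0, \qquad
d_H\bigl(f(x), f(y)\bigr) \le d_H(x, y) .
```

Since $d_H$ only sees directions, boundedness of an orbit in this metric means the iterates stay away from the boundary of the orthant up to scaling.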
Ergodicity Condition for Zero-Sum Games
Minisymposium "Dynamic Games and Operators".
For zero-sum repeated stochastic games, basic questions are whether the mean payoff per time unit is independent of the initial state, and whether this property is robust to perturbations of rewards. In the case of finite action spaces, we show that the answer to both questions is positive if and only if an ergodicity condition is satisfied. This condition can be expressed either through fixed points of the recession function of the Shapley operator or through reachability in directed hypergraphs.
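The first question raised in this abstract can be explored numerically on a toy instance. The following is a minimal sketch, not the authors' method: it assumes finite state and action spaces, and the names `shapley_operator`, `mean_payoff_estimate`, the payment array `r` and the transition array `P` are hypothetical, introduced only for illustration. It iterates the Shapley operator from zero and divides by the number of steps to estimate the mean payoff per time unit in each state.

```python
import numpy as np

def shapley_operator(r, P, x):
    """Return T(x) with T(x)_i = min_a max_b ( r[i,a,b] + sum_j P[i,a,b,j] * x[j] ).

    r has shape (n, A, B); P has shape (n, A, B, n) and every P[i, a, b]
    is a probability vector over next states (an assumption of this sketch).
    """
    q = r + P @ x                     # one-step payoff for each (state, a, b)
    return q.max(axis=2).min(axis=1)  # inner max over b, outer min over a

def mean_payoff_estimate(r, P, steps=2000):
    """Approximate the mean payoff vector by T^k(0) / k with k = steps."""
    x = np.zeros(r.shape[0])
    for _ in range(steps):
        x = shapley_operator(r, P, x)
    return x / steps

# Toy instance: 3 states, 2 actions per player, strictly positive transitions.
rng = np.random.default_rng(0)
n, A, B = 3, 2, 2
r = rng.uniform(-1.0, 1.0, size=(n, A, B))
P = rng.uniform(0.1, 1.0, size=(n, A, B, n))
P /= P.sum(axis=-1, keepdims=True)    # normalize each row to sum to 1

g = mean_payoff_estimate(r, P)
print("estimated mean payoff per state:", g)
print("spread across initial states:", np.ptp(g))
```

Because every transition probability in this toy instance is positive, the entries of `g` should nearly coincide, which illustrates a mean payoff that does not depend on the initial state. Re-running with a perturbed `r` gives a crude check of robustness to perturbations of rewards.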
Qualification Conditions in Semi-algebraic Programming
For an arbitrary finite family of semi-algebraic/definable functions, we consider the corresponding inequality constraint set and we study qualification conditions for perturbations of this set. In particular, we prove that all positive diagonal perturbations, save perhaps a finite number of them, ensure that any point within the feasible set satisfies the Mangasarian-Fromovitz constraint qualification. Using the Milnor-Thom theorem, we provide a bound for the number of singular perturbations when the constraints are polynomial functions. Examples show that the order of magnitude of our exponential bound is relevant. Our perturbation approach provides a simple protocol to build sequences of "regular" problems approximating an arbitrary semi-algebraic/definable problem. Applications to sequential quadratic programming methods and sum of squares relaxation are provided.