1,483 research outputs found

    Mean Field Equilibrium in Dynamic Games with Complementarities

    Full text link
    We study a class of stochastic dynamic games that exhibit strategic complementarities between players; formally, in the games we consider, the payoff of a player has increasing differences between her own state and the empirical distribution of the states of other players. Such games can be used to model a diverse set of applications, including network security models, recommender systems, and dynamic search in markets. Stochastic games are generally difficult to analyze, and these difficulties are only exacerbated when the number of players is large (as might be the case in the preceding examples). We consider an approximation methodology called mean field equilibrium to study these games. In such an equilibrium, each player reacts to only the long run average state of other players. We find necessary conditions for the existence of a mean field equilibrium in such games. Furthermore, as a simple consequence of this existence theorem, we obtain several natural monotonicity properties. We show that there exist a "largest" and a "smallest" equilibrium among all those where the equilibrium strategy used by a player is nondecreasing, and we also show that players converge to each of these equilibria via natural myopic learning dynamics; as we argue, these dynamics are more reasonable than the standard best response dynamics. We also provide sensitivity results, where we quantify how the equilibria of such games move in response to changes in parameters of the game (e.g., the introduction of incentives to players).Comment: 56 pages, 5 figure

    Tropically convex constraint satisfaction

    Full text link
    A semilinear relation S is max-closed if it is preserved by taking the componentwise maximum. The constraint satisfaction problem for max-closed semilinear constraints is at least as hard as determining the winner in Mean Payoff Games, a notorious problem of open computational complexity. Mean Payoff Games are known to be in the intersection of NP and co-NP, which is not known for max-closed semilinear constraints. Semilinear relations that are max-closed and additionally closed under translations have been called tropically convex in the literature. One of our main results is a new duality for open tropically convex relations, which puts the CSP for tropically convex semilinaer constraints in general into NP intersected co-NP. This extends the corresponding complexity result for scheduling under and-or precedence constraints, or equivalently the max-atoms problem. To this end, we present a characterization of max-closed semilinear relations in terms of syntactically restricted first-order logic, and another characterization in terms of a finite set of relations L that allow primitive positive definitions of all other relations in the class. We also present a subclass of max-closed constraints where the CSP is in P; this class generalizes the class of max-closed constraints over finite domains, and the feasibility problem for max-closed linear inequalities. Finally, we show that the class of max-closed semilinear constraints is maximal in the sense that as soon as a single relation that is not max-closed is added to L, the CSP becomes NP-hard.Comment: 29 pages, 2 figure

    Tropical polyhedra are equivalent to mean payoff games

    Full text link
    We show that several decision problems originating from max-plus or tropical convexity are equivalent to zero-sum two player game problems. In particular, we set up an equivalence between the external representation of tropical convex sets and zero-sum stochastic games, in which tropical polyhedra correspond to deterministic games with finite action spaces. Then, we show that the winning initial positions can be determined from the associated tropical polyhedron. We obtain as a corollary a game theoretical proof of the fact that the tropical rank of a matrix, defined as the maximal size of a submatrix for which the optimal assignment problem has a unique solution, coincides with the maximal number of rows (or columns) of the matrix which are linearly independent in the tropical sense. Our proofs rely on techniques from non-linear Perron-Frobenius theory.Comment: 28 pages, 5 figures; v2: updated references, added background materials and illustrations; v3: minor improvements, references update

    Policy iteration for perfect information stochastic mean payoff games with bounded first return times is strongly polynomial

    Full text link
    Recent results of Ye and Hansen, Miltersen and Zwick show that policy iteration for one or two player (perfect information) zero-sum stochastic games, restricted to instances with a fixed discount rate, is strongly polynomial. We show that policy iteration for mean-payoff zero-sum stochastic games is also strongly polynomial when restricted to instances with bounded first mean return time to a given state. The proof is based on methods of nonlinear Perron-Frobenius theory, allowing us to reduce the mean-payoff problem to a discounted problem with state dependent discount rate. Our analysis also shows that policy iteration remains strongly polynomial for discounted problems in which the discount rate can be state dependent (and even negative) at certain states, provided that the spectral radii of the nonnegative matrices associated to all strategies are bounded from above by a fixed constant strictly less than 1.Comment: 17 page

    Solving generic nonarchimedean semidefinite programs using stochastic game algorithms

    Full text link
    A general issue in computational optimization is to develop combinatorial algorithms for semidefinite programming. We address this issue when the base field is nonarchimedean. We provide a solution for a class of semidefinite feasibility problems given by generic matrices. Our approach is based on tropical geometry. It relies on tropical spectrahedra, which are defined as the images by the valuation of nonarchimedean spectrahedra. We establish a correspondence between generic tropical spectrahedra and zero-sum stochastic games with perfect information. The latter have been well studied in algorithmic game theory. This allows us to solve nonarchimedean semidefinite feasibility problems using algorithms for stochastic games. These algorithms are of a combinatorial nature and work for large instances.Comment: v1: 25 pages, 4 figures; v2: 27 pages, 4 figures, minor revisions + benchmarks added; v3: 30 pages, 6 figures, generalization to non-Metzler sign patterns + some results have been replaced by references to the companion work arXiv:1610.0674

    A Unified View of Large-scale Zero-sum Equilibrium Computation

    Full text link
    The task of computing approximate Nash equilibria in large zero-sum extensive-form games has received a tremendous amount of attention due mainly to the Annual Computer Poker Competition. Immediately after its inception, two competing and seemingly different approaches emerged---one an application of no-regret online learning, the other a sophisticated gradient method applied to a convex-concave saddle-point formulation. Since then, both approaches have grown in relative isolation with advancements on one side not effecting the other. In this paper, we rectify this by dissecting and, in a sense, unify the two views.Comment: AAAI Workshop on Computer Poker and Imperfect Informatio

    Distributed stochastic optimization via matrix exponential learning

    Get PDF
    In this paper, we investigate a distributed learning scheme for a broad class of stochastic optimization problems and games that arise in signal processing and wireless communications. The proposed algorithm relies on the method of matrix exponential learning (MXL) and only requires locally computable gradient observations that are possibly imperfect and/or obsolete. To analyze it, we introduce the notion of a stable Nash equilibrium and we show that the algorithm is globally convergent to such equilibria - or locally convergent when an equilibrium is only locally stable. We also derive an explicit linear bound for the algorithm's convergence speed, which remains valid under measurement errors and uncertainty of arbitrarily high variance. To validate our theoretical analysis, we test the algorithm in realistic multi-carrier/multiple-antenna wireless scenarios where several users seek to maximize their energy efficiency. Our results show that learning allows users to attain a net increase between 100% and 500% in energy efficiency, even under very high uncertainty.Comment: 31 pages, 3 figure
    • …
    corecore