
    Theoretical and Practical Advances on Smoothing for Extensive-Form Games

    Sparse iterative methods, in particular first-order methods, are known to be among the most effective in solving large-scale two-player zero-sum extensive-form games. The convergence rates of these methods depend heavily on the properties of the distance-generating function that they are based on. We investigate the acceleration of first-order methods for solving extensive-form games through better design of the dilated entropy function, a class of distance-generating functions related to the domains associated with the extensive-form games. By introducing a new weighting scheme for the dilated entropy function, we develop the first distance-generating function for the strategy spaces of sequential games that has no dependence on the branching factor of the player. This result improves the convergence rate of several first-order methods by a factor of $\Omega(b^d d)$, where $b$ is the branching factor of the player, and $d$ is the depth of the game tree. Thus far, counterfactual regret minimization methods have been faster in practice, and more popular, than first-order methods despite their theoretically inferior convergence rates. Using our new weighting scheme and practical tuning we show that, for the first time, the excessive gap technique can be made faster than the fastest counterfactual regret minimization algorithm, CFR+, in practice.

    Model and Reinforcement Learning for Markov Games with Risk Preferences

    We motivate and propose a new model for non-cooperative Markov games which considers the interactions of risk-aware players. This model characterizes the time-consistent dynamic "risk" from both stochastic state transitions (inherent to the game) and randomized mixed strategies (due to all other players). An appropriate risk-aware equilibrium concept is proposed and the existence of such equilibria is demonstrated in stationary strategies by an application of Kakutani's fixed point theorem. We further propose a simulation-based Q-learning type algorithm for risk-aware equilibrium computation. This algorithm works with a special form of minimax risk measures which can naturally be written as saddle-point stochastic optimization problems, and covers many widely investigated risk measures. Finally, the almost sure convergence of this simulation-based algorithm to an equilibrium is demonstrated under some mild conditions. Our numerical experiments on a two-player queuing game validate the properties of our model and algorithm, and demonstrate their worth and applicability in real-life competitive decision-making. Comment: 38 pages, 6 tables, 5 figures

    Cooperative investment games or population games

    The model of a cooperative fuzzy game is interpreted as both a population game and a cooperative investment game. Three types of core-like solutions induced by these interpretations are introduced and investigated. The interpretation of a game as a population game allows us to define sub-games. We show that, unlike the well-known Shapley-Shubik theorem on market games (Shapley-Shubik), there might be a population game such that each of its sub-games has a non-empty core and, nevertheless, it is not a market game. It turns out that, in order to be a market game, a population game needs to be also homogeneous. We also discuss some special classes of population games such as convex games, exact games, homogeneous games and additive games. Keywords: investment game, population game, fuzzy game, core-like solution, market game

    A partial differential equation for the strictly quasiconvex envelope

    In a series of papers Barron, Goebel, and Jensen studied partial differential equations (PDEs) for quasiconvex (QC) functions \cite{barron2012functions, barron2012quasiconvex,barron2013quasiconvex,barron2013uniqueness}. To overcome the lack of uniqueness for the QC PDE, they introduced a regularization: a PDE for $\epsilon$-robust QC functions, which is well-posed. Building on this work, we introduce a stronger regularization which is amenable to numerical approximation. We build convergent finite difference approximations, comparing the QC envelope and the two regularizations. Solutions of this PDE are strictly convex, and smoother than the robust-QC functions. Comment: 20 pages, 6 figures, 1 table

    An Accretive Operator Approach to Ergodic Problems for Zero-Sum Games

    Mean payoff stochastic games can be studied by means of a nonlinear spectral problem involving the Shapley operator: the ergodic equation. A solution consists of a scalar, called the ergodic constant, and a vector, called the bias. The existence of such a pair entails that the mean payoff per time unit is equal to the ergodic constant for any initial state, and the bias gives stationary strategies. By exploiting two fundamental properties of Shapley operators, monotonicity and additive homogeneity, we give a necessary and sufficient condition for the solvability of the ergodic equation for all the Shapley operators obtained by perturbation of the transition payments of a given stochastic game with finite state space. If the latter condition is satisfied, we establish that the bias is unique (up to an additive constant) for a generic perturbation of the transition payments. To show these results, we use the theory of accretive operators, and prove in particular some surjectivity condition. Comment: 4 pages, 1 figure, to appear in Proc. 22nd International Symposium on Mathematical Theory of Networks and Systems (MTNS 2016)
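
    The two properties named in this abstract can be illustrated on a toy example. The sketch below is not taken from the paper: it assumes a hypothetical two-state, turn-based game (the minimizer moves after seeing the maximizer's action), with made-up payments `r` and transition probabilities `P`. For such a game the Shapley operator is T(u)_i = max_a min_b [ r(i,a,b) + sum_j P(j|i,a,b) u_j ], and the ergodic equation reads T(u) = lambda * e + u, with lambda the ergodic constant, u the bias and e the all-ones vector.

```python
# Hypothetical toy data (not from the paper): 2 states, 2 actions per player.
# r[i][a][b] is the transition payment; P[i][a][b] is the next-state distribution.
r = [
    [[1.0, 0.0], [2.0, -1.0]],
    [[0.0, 3.0], [1.0, 1.0]],
]
P = [
    [[[0.5, 0.5], [1.0, 0.0]], [[0.0, 1.0], [0.5, 0.5]]],
    [[[1.0, 0.0], [0.5, 0.5]], [[0.5, 0.5], [0.0, 1.0]]],
]


def shapley(u):
    """Apply T(u)_i = max_a min_b [ r(i,a,b) + sum_j P(j|i,a,b) * u_j ]."""
    n = len(u)
    out = []
    for i in range(n):
        value = max(
            min(
                r[i][a][b] + sum(P[i][a][b][j] * u[j] for j in range(n))
                for b in range(2)
            )
            for a in range(2)
        )
        out.append(value)
    return out


def is_monotone_on(u, v, tol=1e-12):
    """For u <= v componentwise, check that T(u) <= T(v) componentwise."""
    return all(x <= y + tol for x, y in zip(shapley(u), shapley(v)))


def is_additively_homogeneous_on(u, alpha, tol=1e-9):
    """Check that T(u + alpha * e) equals T(u) + alpha * e."""
    shifted = shapley([x + alpha for x in u])
    return all(abs(s - (t + alpha)) <= tol for s, t in zip(shifted, shapley(u)))


def mean_payoff_estimate(steps):
    """Return T^steps(0) / steps, a crude per-state estimate of the mean payoff."""
    u = [0.0, 0.0]
    for _ in range(steps):
        u = shapley(u)
    return [x / steps for x in u]


if __name__ == "__main__":
    print(is_monotone_on([0.0, 1.0], [0.5, 2.0]))
    print(is_additively_homogeneous_on([0.0, 1.0], 3.0))
    print(mean_payoff_estimate(1000))
```

    When the ergodic equation is solvable, the entries of `mean_payoff_estimate` approach the same value, the ergodic constant, for every initial state. The iteration here is only a numerical illustration and makes no use of the accretive-operator arguments of the paper.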