Theoretical and Practical Advances on Smoothing for Extensive-Form Games
Sparse iterative methods, in particular first-order methods, are known to be
among the most effective in solving large-scale two-player zero-sum
extensive-form games. The convergence rates of these methods depend heavily on
the properties of the distance-generating function that they are based on. We
investigate the acceleration of first-order methods for solving extensive-form
games through better design of the dilated entropy function---a class of
distance-generating functions related to the domains associated with the
extensive-form games. By introducing a new weighting scheme for the dilated
entropy function, we develop the first distance-generating function for the
strategy spaces of sequential games that has no dependence on the branching
factor of the player. This result improves the convergence rate of several
first-order methods by a factor of $\Omega(b^d d)$, where $b$ is the branching
factor of the player, and $d$ is the depth of the game tree.
Thus far, counterfactual regret minimization methods have been faster in
practice, and more popular, than first-order methods despite their
theoretically inferior convergence rates. Using our new weighting scheme and
practical tuning, we show that, for the first time, the excessive gap technique
can be made faster than the fastest counterfactual regret minimization
algorithm, CFR+, in practice.
Model and Reinforcement Learning for Markov Games with Risk Preferences
We motivate and propose a new model for non-cooperative Markov games that
considers the interactions of risk-aware players. This model characterizes the
time-consistent dynamic "risk" from both stochastic state transitions (inherent
to the game) and randomized mixed strategies (due to all other players). An
appropriate risk-aware equilibrium concept is proposed and the existence of
such equilibria is demonstrated in stationary strategies by an application of
Kakutani's fixed point theorem. We further propose a simulation-based
Q-learning type algorithm for risk-aware equilibrium computation. This
algorithm works with a special form of minimax risk measures which can
naturally be written as saddle-point stochastic optimization problems, and
covers many widely investigated risk measures. Finally, the almost sure
convergence of this simulation-based algorithm to an equilibrium is
demonstrated under some mild conditions. Our numerical experiments on a
two-player queuing game validate the properties of our model and algorithm, and
demonstrate their worth and applicability in real-life competitive
decision-making.
Comment: 38 pages, 6 tables, 5 figures
Cooperative investment games or population games
The model of a cooperative fuzzy game is interpreted as both a population game and a cooperative investment game. Three types of core-like solutions induced by these interpretations are introduced and investigated. The interpretation of a game as a population game allows us to define sub-games. We show that, in contrast to the well-known Shapley-Shubik theorem on market games (Shapley-Shubik), there might be a population game such that each of its sub-games has a non-empty core and that, nevertheless, is not a market game. It turns out that, in order to be a market game, a population game also needs to be homogeneous. We also discuss some special classes of population games, such as convex games, exact games, homogeneous games and additive games.
Keywords: investment game, population game, fuzzy game, core-like solution, market game
A partial differential equation for the strictly quasiconvex envelope
In a series of papers, Barron, Goebel, and Jensen studied partial differential
equations (PDEs) for quasiconvex (QC) functions \cite{barron2012functions,
barron2012quasiconvex,barron2013quasiconvex,barron2013uniqueness}. To overcome
the lack of uniqueness for the QC PDE, they introduced a regularization: a PDE
for $\epsilon$-robust QC functions, which is well-posed. Building on this work, we
introduce a stronger regularization which is amenable to numerical
approximation. We build convergent finite difference approximations, comparing
the QC envelope and the two regularizations. Solutions of this PDE are strictly
convex, and smoother than the robust-QC functions.
Comment: 20 pages, 6 figures, 1 table
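For orientation, the unregularized QC envelope mentioned in the abstract can be illustrated on a one-dimensional grid. The sketch below is not the paper's PDE scheme and covers neither regularization; it relies only on the standard fact that a function on an interval is quasiconvex when all of its sublevel sets are intervals, so the largest quasiconvex function below sampled values g is max(min of g to the left, min of g to the right). The function names `quasiconvex_envelope_1d` and `is_quasiconvex_1d` are hypothetical and chosen for illustration.

```python
from itertools import accumulate


def quasiconvex_envelope_1d(values):
    """Largest quasiconvex sequence lying below `values` on a 1D grid.

    Assumption: `values` samples a function on an ordered grid; this is the
    plain QC envelope, not the regularized PDE solution from the paper.
    """
    values = list(values)
    if not values:
        return []
    # Running minimum from the left: min(values[0..i]).
    left_min = list(accumulate(values, min))
    # Running minimum from the right: min(values[i..n-1]).
    right_min = list(accumulate(reversed(values), min))[::-1]
    # A grid point lies in the hull of {g <= a} iff some point on each side
    # (or the point itself) has value <= a, which gives the max of the two minima.
    return [max(lo, hi) for lo, hi in zip(left_min, right_min)]


def is_quasiconvex_1d(values):
    """Brute-force check that u[j] <= max(u[i], u[k]) for all i < j < k."""
    n = len(values)
    return all(
        values[j] <= max(values[i], values[k])
        for i in range(n)
        for j in range(i + 1, n)
        for k in range(j + 1, n)
    )


if __name__ == "__main__":
    g = [3, 1, 4, 0, 2, 5]
    u = quasiconvex_envelope_1d(g)
    print(u)                     # [3, 1, 1, 0, 2, 5]
    print(is_quasiconvex_1d(u))  # True
```

The interior bump at the value 4 is flattened to 1, because any quasiconvex function below g cannot exceed the larger of its neighbours' values there.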
An Accretive Operator Approach to Ergodic Problems for Zero-Sum Games
Mean payoff stochastic games can be studied by means of a nonlinear spectral
problem involving the Shapley operator: the ergodic equation. A solution
consists of a scalar, called the ergodic constant, and a vector, called the bias.
The existence of such a pair entails that the mean payoff per time unit is
equal to the ergodic constant for any initial state, and the bias gives
stationary strategies. By exploiting two fundamental properties of Shapley
operators, monotonicity and additive homogeneity, we give a necessary and
sufficient condition for the solvability of the ergodic equation for all the
Shapley operators obtained by perturbation of the transition payments of a
given stochastic game with finite state space. If the latter condition is
satisfied, we establish that the bias is unique (up to an additive constant)
for a generic perturbation of the transition payments. To show these results,
we use the theory of accretive operators and, in particular, prove a
surjectivity condition.
Comment: 4 pages, 1 figure, to appear in Proc. 22nd International Symposium on
Mathematical Theory of Networks and Systems (MTNS 2016)
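For readers unfamiliar with the terminology, the ergodic equation and the two operator properties named in the abstract are commonly written as follows. The notation is an assumption made here, not taken from the listing: $T$ denotes the Shapley operator on $\mathbb{R}^n$ (with $n$ states), $\lambda$ the ergodic constant, $u$ the bias, and $\mathbf{1}$ the all-ones vector.

```latex
% Ergodic equation: find a scalar \lambda and a vector u such that
T(u) = \lambda \mathbf{1} + u, \qquad \lambda \in \mathbb{R}, \; u \in \mathbb{R}^n .

% Monotonicity (inequalities are componentwise):
u \le v \;\Longrightarrow\; T(u) \le T(v) .

% Additive homogeneity:
T(u + \alpha \mathbf{1}) = T(u) + \alpha \mathbf{1}, \qquad \alpha \in \mathbb{R} .
```

Additive homogeneity explains why the bias can be unique only up to an additive constant: if $(\lambda, u)$ solves the equation, so does $(\lambda, u + \alpha \mathbf{1})$ for every real $\alpha$.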