15 research outputs found
Smooth markets: A basic mechanism for organizing gradient-based learners
With the success of modern machine learning, it is becoming increasingly
important to understand and control how learning algorithms interact.
Unfortunately, negative results from game theory show there is little hope of
understanding or controlling general n-player games. We therefore introduce
smooth markets (SM-games), a class of n-player games with pairwise zero-sum
interactions. SM-games codify a common design pattern in machine learning that
includes (some) GANs, adversarial training, and other recent algorithms. We
show that SM-games are amenable to analysis and optimization using first-order
methods.
Comment: 18 pages, 3 figures
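The pairwise zero-sum design pattern can be illustrated with a minimal sketch (an illustrative toy, not the paper's formal SM-game construction): three gradient-based learners whose coupling losses cancel in matched pairs, trained with plain simultaneous gradient descent.

```python
# Toy sketch of pairwise zero-sum interactions (illustrative only; the
# names and the quadratic self terms are assumptions, not the paper's).
# Player i's loss is a smooth self term plus couplings
# phi_ij(x_i, x_j) = A[i][j] * x_i * x_j with A antisymmetric, so each
# pair's couplings cancel: phi_ij + phi_ji = 0.

A = [[0.0, 1.0, -1.0],
     [-1.0, 0.0, 1.0],
     [1.0, -1.0, 0.0]]  # antisymmetric: A[j][i] == -A[i][j]

def loss(i, x):
    """Player i's loss: (1/2) x_i^2 + sum_j A[i][j] * x_i * x_j."""
    return 0.5 * x[i] ** 2 + sum(A[i][j] * x[i] * x[j] for j in range(3))

def grad(i, x):
    """Partial derivative of loss(i, x) with respect to x_i."""
    return x[i] + sum(A[i][j] * x[j] for j in range(3))

def simultaneous_gd(x, lr=0.1, steps=200):
    """All players take simultaneous gradient steps on their own losses."""
    for _ in range(steps):
        x = [x[i] - lr * grad(i, x) for i in range(3)]
    return x

x = simultaneous_gd([1.0, 0.0, 0.0])
```

In this toy instance the antisymmetric couplings only rotate the joint dynamics, so first-order updates still drive the players to the equilibrium at the origin — a flavor of the "amenable to first-order methods" claim, though the paper's analysis is far more general.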
Bandit Online Learning of Nash Equilibria in Monotone Games
We address online bandit learning of Nash equilibria in multi-agent convex
games. We propose an algorithm in which each agent observes only the value of
her cost function at each jointly played action, with no information about the
functional form of her own cost or of the other agents' costs or strategies.
In contrast to past work, where convergent algorithms required strong
monotonicity, we prove that the algorithm converges to a Nash equilibrium
under a mere monotonicity assumption. The proposed algorithm extends the
applicability of bandit learning to several classes of games, including
zero-sum convex games with possibly unbounded action spaces, mixed extensions
of finite-action zero-sum games, and convex games with linear coupling
constraints.
Comment: arXiv admin note: text overlap with arXiv:1904.0188
Alternating proximal-gradient steps for (stochastic) nonconvex-concave minimax problems
Minimax problems of the form min_x max_y Φ(x, y) have attracted
increased interest largely due to advances in machine learning, in particular
generative adversarial networks. These are typically trained using variants of
stochastic gradient descent for the two players.
Although convex-concave problems are well understood with many efficient
solution methods to choose from, theoretical guarantees outside of this setting
are sometimes lacking even for the simplest algorithms.
In particular, this is the case for alternating gradient descent ascent,
where the two agents take turns updating their strategies.
To partially close this gap in the literature, we prove a novel global
convergence rate for the stochastic version of this method for finding a
critical point of Φ in a setting which is not convex-concave.
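The alternating update order can be sketched on the toy bilinear objective phi(x, y) = x*y. This objective is convex-concave, so it is not the paper's nonconvex-concave setting; the sketch only illustrates the mechanic the abstract refers to, where the ascent player responds to the descent player's freshly updated iterate.

```python
# Hedged sketch of alternating vs. simultaneous gradient descent ascent
# on phi(x, y) = x * y (grad_x = y, grad_y = x). Illustrative toy only.

def alternating_gda(x, y, lr=0.1, steps=100):
    """The two agents take turns: y's ascent step sees the new x."""
    for _ in range(steps):
        x = x - lr * y        # descent step for the minimizing player
        y = y + lr * x        # ascent step, using the updated x
    return x, y

def simultaneous_gda(x, y, lr=0.1, steps=100):
    """Both agents update from the same old iterate."""
    for _ in range(steps):
        x, y = x - lr * y, y + lr * x
    return x, y

alt = alternating_gda(1.0, 1.0)
sim = simultaneous_gda(1.0, 1.0)
```

On this particular objective the contrast is stark: simultaneous updates multiply the squared norm by (1 + lr^2) every step and spiral outward, while the alternating iterates stay on a bounded orbit — one reason alternating schemes need their own convergence analysis, as the abstract notes.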