Dynamics in near-potential games
We consider discrete-time learning dynamics in finite strategic form games, and show that games that are close to a potential game inherit many of the dynamical properties of potential games. We first study the evolution of the sequence of pure strategy profiles under better/best response dynamics. We show that this sequence converges to a (pure) approximate equilibrium set whose size is a function of the "distance" to a given nearby potential game. We then focus on logit response dynamics, and provide a characterization of the limiting outcome in terms of the distance of the game to a given potential game and the corresponding potential function. Finally, we turn attention to fictitious play, and establish that in near-potential games the sequence of empirical frequencies of player actions converges to a neighborhood of (mixed) equilibria, where the size of the neighborhood increases with the distance to the set of potential games.
Dynamics in Near-Potential Games
Except for special classes of games, there is no systematic framework for
analyzing the dynamical properties of multi-agent strategic interactions.
Potential games are one such special but restrictive class of games that allow
for tractable dynamic analysis. Intuitively, games that are "close" to a
potential game should share similar properties. In this paper, we formalize and
develop this idea by quantifying to what extent the dynamic features of
potential games extend to "near-potential" games. We study convergence of three
commonly studied classes of adaptive dynamics: discrete-time better/best
response, logit response, and discrete-time fictitious play dynamics. For
better/best response dynamics, we focus on the evolution of the sequence of
pure strategy profiles and show that this sequence converges to a (pure)
approximate equilibrium set, whose size is a function of the "distance" from a
close potential game. We then study logit response dynamics and provide a
characterization of the stationary distribution of this update rule in terms of
the distance of the game from a close potential game and the corresponding
potential function. We further show that the stochastically stable strategy
profiles are pure approximate equilibria. Finally, we turn attention to
fictitious play, and establish that the sequence of empirical frequencies of
player actions converges to a neighborhood of (mixed) equilibria of the game,
where the size of the neighborhood increases with distance of the game to a
potential game. Thus, our results suggest that games that are close to a
potential game inherit the dynamical properties of potential games. Since a
close potential game to a given game can be found by solving a convex
optimization problem, our approach also provides a systematic framework for
studying convergence behavior of adaptive learning dynamics in arbitrary finite
strategic form games.
Comment: 42 pages, 8 figures
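As an illustration of the better/best response dynamics studied above, here is a minimal Python sketch on a hypothetical 2x2 coordination game (an exact potential game, the zero-distance case); the payoff matrices and starting profile are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def best_response_dynamics(payoffs, start=(0, 0), max_steps=100):
    """Iterate best responses in a two-player finite game.

    payoffs[i][a, b] is player i's payoff when player 0 plays a and
    player 1 plays b. Returns the sequence of visited pure profiles.
    """
    profile = list(start)
    path = [tuple(profile)]
    for _ in range(max_steps):
        moved = False
        for i in (0, 1):
            a_other = profile[1 - i]
            # Player i's payoffs against the other player's current action.
            own = payoffs[0][:, a_other] if i == 0 else payoffs[1][a_other, :]
            best = int(np.argmax(own))
            if own[best] > own[profile[i]]:
                profile[i] = best
                path.append(tuple(profile))
                moved = True
        if not moved:  # no profitable deviation: a pure Nash equilibrium
            break
    return path

# Illustrative coordination game: both players prefer to match actions.
A = np.array([[2.0, 0.0], [0.0, 1.0]])  # player 0's payoffs
B = np.array([[2.0, 0.0], [0.0, 1.0]])  # player 1's payoffs
path = best_response_dynamics((A, B), start=(0, 1))
print(path[-1])  # prints (1, 1)
```

In a near-potential game the same iteration need not settle on a single profile, but per the result above it eventually stays inside an approximate equilibrium set whose size grows with the distance to the nearest potential game.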
Multi-Agent Credit Assignment in Stochastic Resource Management Games
Multi-Agent Systems (MAS) are a form of distributed intelligence, where multiple autonomous agents act in a common environment. Numerous complex, real-world systems have been successfully optimised using Multi-Agent Reinforcement Learning (MARL) in conjunction with the MAS framework. In MARL, agents learn by maximising a scalar reward signal from the environment, and thus the design of the reward function directly affects the policies learned. In this work, we address the issue of appropriate multi-agent credit assignment in stochastic resource management games. We propose two new Stochastic Games to serve as testbeds for MARL research into resource management problems: the Tragic Commons Domain and the Shepherd Problem Domain. Our empirical work evaluates the performance of two commonly used reward shaping techniques: Potential-Based Reward Shaping and difference rewards. Experimental results demonstrate that systems using appropriate reward shaping techniques for multi-agent credit assignment can achieve near-optimal performance in stochastic resource management games, outperforming systems learning with unshaped local or global evaluations. We also present the first empirical investigations into the effect of expressing the same heuristic knowledge in state- or action-based formats, thereby developing insights into the design of multi-agent potential functions that will inform future work.
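For context, Potential-Based Reward Shaping adds a term F(s, s') = γΦ(s') − Φ(s) to the environment reward, a form known (from Ng, Harada, and Russell's classic result) to leave optimal policies unchanged. A minimal sketch follows; the grid-distance potential Φ and the goal cell are illustrative assumptions, not the heuristic used in the paper.

```python
GAMMA = 0.99

def shaping_term(phi, s, s_next, gamma=GAMMA):
    """Potential-based shaping: F(s, s') = gamma * phi(s') - phi(s)."""
    return gamma * phi(s_next) - phi(s)

def shaped_reward(env_reward, phi, s, s_next, gamma=GAMMA):
    """Reward actually handed to the learner: environment reward + F."""
    return env_reward + shaping_term(phi, s, s_next, gamma)

# Illustrative potential: negative Manhattan distance to a goal cell,
# so states nearer the goal have higher potential.
GOAL = (3, 3)
def phi(state):
    x, y = state
    return -(abs(x - GOAL[0]) + abs(y - GOAL[1]))

# Moving one step closer to the goal earns a small positive bonus
# even when the environment reward is zero.
r = shaped_reward(0.0, phi, (0, 0), (1, 0))
print(round(r, 4))  # prints 1.05
```

Difference rewards, the other technique evaluated above, instead compare the global reward with a counterfactual in which the agent's contribution is removed; both address the credit-assignment problem the abstract describes.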
Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes
The potential of offline reinforcement learning (RL) is that high-capacity
models trained on large, heterogeneous datasets can lead to agents that
generalize broadly, analogously to similar advances in vision and NLP. However,
recent works argue that offline RL methods encounter unique challenges to
scaling up model capacity. Drawing on the lessons from these works, we
re-examine previous design choices and find that with appropriate choices:
ResNets, cross-entropy based distributional backups, and feature normalization,
offline Q-learning algorithms exhibit strong performance that scales with model
capacity. Using multi-task Atari as a testbed for scaling and generalization,
we train a single policy on 40 games with near-human performance using up to 80
million parameter networks, finding that model performance scales favorably
with capacity. In contrast to prior work, we extrapolate beyond dataset
performance even when trained entirely on a large (400M transitions) but highly
suboptimal dataset (51% human-level performance). Compared to
return-conditioned supervised approaches, offline Q-learning scales similarly
with model capacity and has better performance, especially when the dataset is
suboptimal. Finally, we show that offline Q-learning with a diverse dataset is
sufficient to learn powerful representations that facilitate rapid transfer to
novel games and fast online learning on new variations of a training game,
improving over existing state-of-the-art representation learning approaches.
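The "cross-entropy based distributional backups" above are in the style of categorical distributional RL (C51), where the Bellman-updated target distribution is projected back onto a fixed support of atoms and matched with a cross-entropy loss. A minimal sketch of that projection, with an illustrative support and batch rather than the paper's configuration:

```python
import numpy as np

def project_distribution(next_probs, rewards, dones, atoms, gamma=0.99):
    """Project the Bellman-updated categorical distribution onto the
    fixed support `atoms` (C51-style), producing the cross-entropy target.
    """
    v_min, v_max = atoms[0], atoms[-1]
    delta = atoms[1] - atoms[0]
    # Bellman update of every atom, clipped to the support range.
    tz = np.clip(rewards[:, None] + gamma * (1.0 - dones[:, None]) * atoms,
                 v_min, v_max)
    b = (tz - v_min) / delta                 # fractional atom index
    lower = np.floor(b).astype(int)
    upper = np.ceil(b).astype(int)
    target = np.zeros_like(next_probs)
    for i in range(next_probs.shape[0]):
        for j in range(len(atoms)):
            if lower[i, j] == upper[i, j]:   # lands exactly on an atom
                target[i, lower[i, j]] += next_probs[i, j]
            else:                            # split mass between neighbors
                target[i, lower[i, j]] += next_probs[i, j] * (upper[i, j] - b[i, j])
                target[i, upper[i, j]] += next_probs[i, j] * (b[i, j] - lower[i, j])
    return target

atoms = np.linspace(-10.0, 10.0, 51)
probs = np.full((1, 51), 1.0 / 51)           # uniform next-state distribution
target = project_distribution(probs, np.array([1.0]), np.array([0.0]), atoms)
print(target.sum())  # probability mass is conserved
```

Training then minimizes the cross-entropy between this projected target and the predicted distribution, the backup the abstract credits (together with ResNets and feature normalization) for scaling.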
Mathematical models of games of chance: Epistemological taxonomy and potential in problem-gambling research
Games of chance are developed in their physical consumer-ready form on the basis of mathematical models, which stand as the premises of their existence and represent their physical processes. There is a prevalence of statistical and probabilistic models in the interest of all parties involved in the study of gambling – researchers, game producers and operators, and players – while functional models are of interest more to math-inclined players than problem-gambling researchers. In this paper I present a structural analysis of the knowledge attached to mathematical models of games of chance and the act of modeling, arguing that such knowledge holds potential in the prevention and cognitive treatment of excessive gambling, and I propose further research in this direction.
Differentiable Game Mechanics
Deep learning is built on the foundational guarantee that gradient descent on
an objective function converges to local minima. Unfortunately, this guarantee
fails in settings, such as generative adversarial nets, that exhibit multiple
interacting losses. The behavior of gradient-based methods in games is not well
understood -- and is becoming increasingly important as adversarial and
multi-objective architectures proliferate. In this paper, we develop new tools
to understand and control the dynamics in n-player differentiable games.
The key result is to decompose the game Jacobian into two components. The
first, symmetric component, is related to potential games, which reduce to
gradient descent on an implicit function. The second, antisymmetric component,
relates to Hamiltonian games, a new class of games that obey a conservation law
akin to conservation laws in classical mechanical systems. The decomposition
motivates Symplectic Gradient Adjustment (SGA), a new algorithm for finding
stable fixed points in differentiable games. Basic experiments show SGA is
competitive with recently proposed algorithms for finding stable fixed points
in GANs -- while at the same time being applicable to, and having guarantees
in, much more general cases.
Comment: JMLR 2019, journal version of arXiv:1802.0564
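A minimal numpy sketch of the Symplectic Gradient Adjustment described above, applied to an illustrative two-player bilinear game where plain simultaneous gradient descent cycles around the fixed point; the learning rate and adjustment weight λ are assumptions, not values from the paper.

```python
import numpy as np

def sga_step(params, grad_fn, jac_fn, lr=0.01, lam=1.0):
    """One step of Symplectic Gradient Adjustment (SGA).

    grad_fn returns the simultaneous gradient xi (each player's gradient
    of its own loss); jac_fn returns the Jacobian of xi. The adjustment
    uses the antisymmetric (Hamiltonian) part A = (J - J^T) / 2.
    """
    xi = grad_fn(params)
    J = jac_fn(params)
    A = 0.5 * (J - J.T)               # rotational component of the game
    adjusted = xi + lam * (A.T @ xi)  # steer flow toward stable fixed points
    return params - lr * adjusted

# Illustrative bilinear game: losses f1 = x*y for player 1, f2 = -x*y
# for player 2. The Jacobian here is purely antisymmetric, so naive
# simultaneous gradient descent orbits the origin instead of converging.
grad = lambda p: np.array([p[1], -p[0]])            # (df1/dx, df2/dy)
jac = lambda p: np.array([[0.0, 1.0], [-1.0, 0.0]])

p = np.array([1.0, 1.0])
for _ in range(500):
    p = sga_step(p, grad, jac, lr=0.05, lam=1.0)
print(np.linalg.norm(p))  # contracts to (nearly) the fixed point at the origin
```

The decomposition mirrors the paper's key result: the symmetric part of the Jacobian behaves like a potential game (plain gradient descent suffices), while the antisymmetric part is the Hamiltonian component that SGA corrects for.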
On predictability of rare events leveraging social media: a machine learning perspective
Information extracted from social media streams has been leveraged to
forecast the outcome of a large number of real-world events, from political
elections to stock market fluctuations. An increasing amount of studies
demonstrates how the analysis of social media conversations provides cheap
access to the wisdom of the crowd. However, the extents and contexts in which
such forecasting power can be effectively leveraged remain unverified, at least
in any systematic way. It is also unclear how social-media-based predictions compare
to those based on alternative information sources. To address these issues,
here we develop a machine learning framework that leverages social media
streams to automatically identify and predict the outcomes of soccer matches.
We focus in particular on matches in which at least one of the possible
outcomes is deemed as highly unlikely by professional bookmakers. We argue that
sport events offer a systematic approach for testing the predictive power of
social media, and allow us to compare such power against the rigorous baselines
set by external sources. Despite such strict baselines, our framework yields
a marginal profit above 8% when used to inform simple betting strategies. The
system is based on real-time sentiment analysis and exploits data collected
immediately before the games, allowing for informed bets. We discuss the
rationale behind our approach, describe the learning framework, its prediction
performance and the return it provides as compared to a set of betting
strategies. To test our framework we use both historical Twitter data from the
2014 FIFA World Cup games, and real-time Twitter data collected by monitoring
the conversations about all soccer matches of four major European tournaments
(FA Premier League, Serie A, La Liga, and Bundesliga), and the 2014 UEFA
Champions League, during the period between Oct. 25th 2014 and Nov. 26th 2014.
Comment: 10 pages, 10 tables, 8 figures