Dynamics in near-potential games
We consider discrete-time learning dynamics in finite strategic form games, and show that games that are close to a potential game inherit many of the dynamical properties of potential games. We first study the evolution of the sequence of pure strategy profiles under better/best response dynamics. We show that this sequence converges to a (pure) approximate equilibrium set whose size is a function of the "distance" to a given nearby potential game. We then focus on logit response dynamics, and provide a characterization of the limiting outcome in terms of the distance of the game to a given potential game and the corresponding potential function. Finally, we turn attention to fictitious play, and establish that in near-potential games the sequence of empirical frequencies of player actions converges to a neighborhood of (mixed) equilibria, where the size of the neighborhood increases with the distance to the set of potential games.
Dynamics in Near-Potential Games
Except for special classes of games, there is no systematic framework for
analyzing the dynamical properties of multi-agent strategic interactions.
Potential games are one such special but restrictive class of games that allow
for tractable dynamic analysis. Intuitively, games that are "close" to a
potential game should share similar properties. In this paper, we formalize and
develop this idea by quantifying to what extent the dynamic features of
potential games extend to "near-potential" games. We study convergence of three
commonly studied classes of adaptive dynamics: discrete-time better/best
response, logit response, and discrete-time fictitious play dynamics. For
better/best response dynamics, we focus on the evolution of the sequence of
pure strategy profiles and show that this sequence converges to a (pure)
approximate equilibrium set, whose size is a function of the "distance" from a
close potential game. We then study logit response dynamics and provide a
characterization of the stationary distribution of this update rule in terms of
the distance of the game from a close potential game and the corresponding
potential function. We further show that the stochastically stable strategy
profiles are pure approximate equilibria. Finally, we turn attention to
fictitious play, and establish that the sequence of empirical frequencies of
player actions converges to a neighborhood of (mixed) equilibria of the game,
where the size of the neighborhood increases with distance of the game to a
potential game. Thus, our results suggest that games that are close to a
potential game inherit the dynamical properties of potential games. Since a
close potential game to a given game can be found by solving a convex
optimization problem, our approach also provides a systematic framework for
studying convergence behavior of adaptive learning dynamics in arbitrary finite
strategic form games.
Comment: 42 pages, 8 figures
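As an illustration of the better/best response dynamics studied above, here is a minimal Python sketch on a hypothetical 2x2 coordination game (an exact potential game, the zero-distance case); the payoff matrices and starting profile are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def best_response_dynamics(payoffs, start=(0, 0), max_steps=100):
    """Iterate best responses in a two-player finite game.

    payoffs[i][a, b] is player i's payoff when player 0 plays a and
    player 1 plays b. Returns the sequence of visited pure profiles.
    """
    profile = list(start)
    path = [tuple(profile)]
    for _ in range(max_steps):
        moved = False
        for i in (0, 1):
            a_other = profile[1 - i]
            # Player i's payoffs against the other player's current action.
            own = payoffs[0][:, a_other] if i == 0 else payoffs[1][a_other, :]
            best = int(np.argmax(own))
            if own[best] > own[profile[i]]:
                profile[i] = best
                path.append(tuple(profile))
                moved = True
        if not moved:  # no profitable deviation: a pure Nash equilibrium
            break
    return path

# Illustrative coordination game: both players prefer to match actions.
A = np.array([[2.0, 0.0], [0.0, 1.0]])  # player 0's payoffs
B = np.array([[2.0, 0.0], [0.0, 1.0]])  # player 1's payoffs
path = best_response_dynamics((A, B), start=(0, 1))
print(path[-1])  # prints (1, 1)
```

In a near-potential game the same iteration need not settle on a single profile, but per the result above it eventually stays inside an approximate equilibrium set whose size grows with the distance to the nearest potential game.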
Multi-Agent Credit Assignment in Stochastic Resource Management Games
Multi-Agent Systems (MAS) are a form of distributed intelligence, where multiple autonomous agents act in a common environment. Numerous complex, real-world systems have been successfully optimised using Multi-Agent Reinforcement Learning (MARL) in conjunction with the MAS framework. In MARL, agents learn by maximising a scalar reward signal from the environment, and thus the design of the reward function directly affects the policies learned. In this work, we address the issue of appropriate multi-agent credit assignment in stochastic resource management games. We propose two new Stochastic Games to serve as testbeds for MARL research into resource management problems: the Tragic Commons Domain and the Shepherd Problem Domain. Our empirical work evaluates the performance of two commonly used reward shaping techniques: Potential-Based Reward Shaping and difference rewards. Experimental results demonstrate that systems using appropriate reward shaping techniques for multi-agent credit assignment can achieve near-optimal performance in stochastic resource management games, outperforming systems learning with unshaped local or global evaluations. We also present the first empirical investigations into the effect of expressing the same heuristic knowledge in state- or action-based formats, thereby developing insights into the design of multi-agent potential functions that will inform future work.
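For context, Potential-Based Reward Shaping adds a term F(s, s') = γΦ(s') − Φ(s) to the environment reward, a form known (from Ng, Harada, and Russell's classic result) to leave optimal policies unchanged. A minimal sketch follows; the grid-distance potential Φ and the goal cell are illustrative assumptions, not the heuristic used in the paper.

```python
GAMMA = 0.99

def shaping_term(phi, s, s_next, gamma=GAMMA):
    """Potential-based shaping: F(s, s') = gamma * phi(s') - phi(s)."""
    return gamma * phi(s_next) - phi(s)

def shaped_reward(env_reward, phi, s, s_next, gamma=GAMMA):
    """Reward actually handed to the learner: environment reward + F."""
    return env_reward + shaping_term(phi, s, s_next, gamma)

# Illustrative potential: negative Manhattan distance to a goal cell,
# so states nearer the goal have higher potential.
GOAL = (3, 3)
def phi(state):
    x, y = state
    return -(abs(x - GOAL[0]) + abs(y - GOAL[1]))

# Moving one step closer to the goal earns a small positive bonus
# even when the environment reward is zero.
r = shaped_reward(0.0, phi, (0, 0), (1, 0))
print(round(r, 4))  # prints 1.05
```

Difference rewards, the other technique evaluated above, instead compare the global reward with a counterfactual in which the agent's contribution is removed; both address the credit-assignment problem the abstract describes.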
Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes
The potential of offline reinforcement learning (RL) is that high-capacity
models trained on large, heterogeneous datasets can lead to agents that
generalize broadly, analogously to similar advances in vision and NLP. However,
recent works argue that offline RL methods encounter unique challenges to
scaling up model capacity. Drawing on the lessons from these works, we
re-examine previous design choices and find that with appropriate choices:
ResNets, cross-entropy based distributional backups, and feature normalization,
offline Q-learning algorithms exhibit strong performance that scales with model
capacity. Using multi-task Atari as a testbed for scaling and generalization,
we train a single policy on 40 games with near-human performance using up to 80
million parameter networks, finding that model performance scales favorably
with capacity. In contrast to prior work, we extrapolate beyond dataset
performance even when trained entirely on a large (400M transitions) but highly
suboptimal dataset (51% human-level performance). Compared to
return-conditioned supervised approaches, offline Q-learning scales similarly
with model capacity and has better performance, especially when the dataset is
suboptimal. Finally, we show that offline Q-learning with a diverse dataset is
sufficient to learn powerful representations that facilitate rapid transfer to
novel games and fast online learning on new variations of a training game,
improving over existing state-of-the-art representation learning approaches.
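The "cross-entropy based distributional backups" above are in the style of categorical distributional RL (C51), where the Bellman-updated target distribution is projected back onto a fixed support of atoms and matched with a cross-entropy loss. A minimal sketch of that projection, with an illustrative support and batch rather than the paper's configuration:

```python
import numpy as np

def project_distribution(next_probs, rewards, dones, atoms, gamma=0.99):
    """Project the Bellman-updated categorical distribution onto the
    fixed support `atoms` (C51-style), producing the cross-entropy target.
    """
    v_min, v_max = atoms[0], atoms[-1]
    delta = atoms[1] - atoms[0]
    # Bellman update of every atom, clipped to the support range.
    tz = np.clip(rewards[:, None] + gamma * (1.0 - dones[:, None]) * atoms,
                 v_min, v_max)
    b = (tz - v_min) / delta                 # fractional atom index
    lower = np.floor(b).astype(int)
    upper = np.ceil(b).astype(int)
    target = np.zeros_like(next_probs)
    for i in range(next_probs.shape[0]):
        for j in range(len(atoms)):
            if lower[i, j] == upper[i, j]:   # lands exactly on an atom
                target[i, lower[i, j]] += next_probs[i, j]
            else:                            # split mass between neighbors
                target[i, lower[i, j]] += next_probs[i, j] * (upper[i, j] - b[i, j])
                target[i, upper[i, j]] += next_probs[i, j] * (b[i, j] - lower[i, j])
    return target

atoms = np.linspace(-10.0, 10.0, 51)
probs = np.full((1, 51), 1.0 / 51)           # uniform next-state distribution
target = project_distribution(probs, np.array([1.0]), np.array([0.0]), atoms)
print(target.sum())  # probability mass is conserved
```

Training then minimizes the cross-entropy between this projected target and the predicted distribution, the backup the abstract credits (together with ResNets and feature normalization) for scaling.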
Mathematical models of games of chance: Epistemological taxonomy and potential in problem-gambling research
Games of chance are developed in their physical consumer-ready form on the basis of mathematical models, which stand as the premises of their existence and represent their physical processes. There is a prevalence of statistical and probabilistic models in the interest of all parties involved in the study of gambling – researchers, game producers and operators, and players – while functional models are of interest more to math-inclined players than problem-gambling researchers. In this paper I present a structural analysis of the knowledge attached to mathematical models of games of chance and the act of modeling, arguing that such knowledge holds potential in the prevention and cognitive treatment of excessive gambling, and I propose further research in this direction.
Differentiable Game Mechanics
Deep learning is built on the foundational guarantee that gradient descent on
an objective function converges to local minima. Unfortunately, this guarantee
fails in settings, such as generative adversarial nets, that exhibit multiple
interacting losses. The behavior of gradient-based methods in games is not well
understood -- and is becoming increasingly important as adversarial and
multi-objective architectures proliferate. In this paper, we develop new tools
to understand and control the dynamics in n-player differentiable games.
The key result is to decompose the game Jacobian into two components. The
first, symmetric component, is related to potential games, which reduce to
gradient descent on an implicit function. The second, antisymmetric component,
relates to Hamiltonian games, a new class of games that obey a conservation law
akin to conservation laws in classical mechanical systems. The decomposition
motivates Symplectic Gradient Adjustment (SGA), a new algorithm for finding
stable fixed points in differentiable games. Basic experiments show SGA is
competitive with recently proposed algorithms for finding stable fixed points
in GANs -- while at the same time being applicable to, and having guarantees
in, much more general cases.
Comment: JMLR 2019, journal version of arXiv:1802.0564
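A minimal numpy sketch of the Symplectic Gradient Adjustment described above, applied to an illustrative two-player bilinear game where plain simultaneous gradient descent cycles around the fixed point; the learning rate and adjustment weight λ are assumptions, not values from the paper.

```python
import numpy as np

def sga_step(params, grad_fn, jac_fn, lr=0.01, lam=1.0):
    """One step of Symplectic Gradient Adjustment (SGA).

    grad_fn returns the simultaneous gradient xi (each player's gradient
    of its own loss); jac_fn returns the Jacobian of xi. The adjustment
    uses the antisymmetric (Hamiltonian) part A = (J - J^T) / 2.
    """
    xi = grad_fn(params)
    J = jac_fn(params)
    A = 0.5 * (J - J.T)               # rotational component of the game
    adjusted = xi + lam * (A.T @ xi)  # steer flow toward stable fixed points
    return params - lr * adjusted

# Illustrative bilinear game: losses f1 = x*y for player 1, f2 = -x*y
# for player 2. The Jacobian here is purely antisymmetric, so naive
# simultaneous gradient descent orbits the origin instead of converging.
grad = lambda p: np.array([p[1], -p[0]])            # (df1/dx, df2/dy)
jac = lambda p: np.array([[0.0, 1.0], [-1.0, 0.0]])

p = np.array([1.0, 1.0])
for _ in range(500):
    p = sga_step(p, grad, jac, lr=0.05, lam=1.0)
print(np.linalg.norm(p))  # contracts to (nearly) the fixed point at the origin
```

The decomposition mirrors the paper's key result: the symmetric part of the Jacobian behaves like a potential game (plain gradient descent suffices), while the antisymmetric part is the Hamiltonian component that SGA corrects for.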
On predictability of rare events leveraging social media: a machine learning perspective
Information extracted from social media streams has been leveraged to
forecast the outcome of a large number of real-world events, from political
elections to stock market fluctuations. An increasing amount of studies
demonstrates how the analysis of social media conversations provides cheap
access to the wisdom of the crowd. However, the extents and contexts in which
such forecasting power can be effectively leveraged remain unverified, at least
in any systematic way. It is also unclear how social-media-based predictions compare
to those based on alternative information sources. To address these issues,
here we develop a machine learning framework that leverages social media
streams to automatically identify and predict the outcomes of soccer matches.
We focus in particular on matches in which at least one of the possible
outcomes is deemed as highly unlikely by professional bookmakers. We argue that
sport events offer a systematic approach for testing the predictive power of
social media, and allow us to compare such power against the rigorous baselines
set by external sources. Despite such strict baselines, our framework yields
a marginal profit above 8% when used to inform simple betting strategies. The
system is based on real-time sentiment analysis and exploits data collected
immediately before the games, allowing for informed bets. We discuss the
rationale behind our approach, describe the learning framework, its prediction
performance and the return it provides as compared to a set of betting
strategies. To test our framework we use both historical Twitter data from the
2014 FIFA World Cup games, and real-time Twitter data collected by monitoring
the conversations about all soccer matches of four major European tournaments
(FA Premier League, Serie A, La Liga, and Bundesliga), and the 2014 UEFA
Champions League, during the period between Oct. 25th 2014 and Nov. 26th 2014.
Comment: 10 pages, 10 tables, 8 figures