A Nested Family of $k$-total Effective Rewards for Positional Games

Author: Boros Endre
Elbassioni Khaled
Gurvich Vladimir
Makino Kazuhisa
Publication date: 01/01/2015

We consider Gillette's two-person zero-sum stochastic games with perfect information. For each $k \in \mathbb{Z}_+$ we introduce an effective reward function, called $k$-total. For $k = 0$ and $1$ this function is known as mean payoff and total reward, respectively. We restrict our attention to the deterministic case. For all $k$, we prove the existence of a saddle point which can be realized by uniformly optimal pure stationary strategies. We also demonstrate that $k$-total reward games can be embedded into $(k+1)$-total reward games.

arXiv.org e-Print Archive

Repositorium für Naturwissenschaften und Technik

On Nash-Solvability of Finite Two-Person Tight Vector Game Forms

Author: Gurvich Vladimir
Naumova Mariya
Publication date: 21/04/2022

We consider finite two-person normal form games. The following four properties of their game forms are equivalent: (i) Nash-solvability, (ii) zero-sum-solvability, (iii) win-lose-solvability, and (iv) tightness. For (ii, iii, iv) this was shown by Edmonds and Fulkerson in 1970. Then, in 1975, (i) was added to this list and it was also shown that these results cannot be generalized to the $n$-person case with $n > 2$. In 1990, tightness was extended to vector game forms ($v$-forms) and it was shown that such $v$-tightness and zero-sum-solvability are still equivalent, yet do not imply Nash-solvability. These results are applicable to several classes of stochastic games with perfect information. Here we suggest one more extension of tightness, introducing $v^+$-tight vector game forms ($v^+$-forms). We show that such $v^+$-tightness and Nash-solvability are equivalent in the case of weakly rectangular game forms and positive cost functions. This result allows us to reduce the so-called bi-shortest path conjecture to $v^+$-tightness of $v^+$-forms. However, both (equivalent) statements remain open.

arXiv.org e-Print Archive

Average-energy games

Author: Bouyer Patricia
Larsen Kim G.
Laursen Simon
Markey Nicolas
Randour Mickael
Publication venue: 'Open Publishing Association'
Publication date: 01/01/2015

Two-player quantitative zero-sum games provide a natural framework to synthesize controllers with performance guarantees for reactive systems within an uncontrollable environment. Classical settings include mean-payoff games, where the objective is to optimize the long-run average gain per action, and energy games, where the system has to avoid running out of energy. We study average-energy games, where the goal is to optimize the long-run average of the accumulated energy. We show that this objective arises naturally in several applications, and that it yields interesting connections with previous concepts in the literature. We prove that deciding the winner in such games is in NP ∩ coNP and at least as hard as solving mean-payoff games, and we establish that memoryless strategies suffice to win. We also consider the case where the system has to minimize the average-energy while maintaining the accumulated energy within predefined bounds at all times: this corresponds to operating with a finite-capacity storage for energy. We give results for one-player and two-player games, and establish complexity bounds and memory requirements. Comment: In Proceedings GandALF 2015, arXiv:1509.0685

arXiv.org e-Print Archive

University of Liverpool Repository

Directory of Open Access Journals

VBN
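
The three objectives named in the abstract above (mean payoff, energy, average-energy) can be compared on a finite prefix of a play. The following is a minimal sketch, not the authors' algorithm: it assumes a play is given as a list of integer edge weights, and the function names and the example cycle are hypothetical. The true objectives are limits over infinite plays; on a play that repeats a cycle, evaluating whole repetitions of the cycle gives the same values.

```python
from itertools import accumulate


def energy_levels(weights):
    """Accumulated energy after each step (running sum of the weights)."""
    return list(accumulate(weights))


def mean_payoff_prefix(weights):
    """Average gain per action over the prefix."""
    return sum(weights) / len(weights)


def average_energy_prefix(weights):
    """Average of the accumulated energy levels over the prefix."""
    levels = energy_levels(weights)
    return sum(levels) / len(levels)


def stays_non_negative(weights, initial_credit=0):
    """Energy objective on the prefix: the level never drops below zero."""
    return all(initial_credit + level >= 0 for level in energy_levels(weights))


# Hypothetical play that repeats the weight cycle (+2, -1, -1).
cycle = [2, -1, -1]
prefix = cycle * 1000

print(mean_payoff_prefix(prefix))     # average gain per action
print(average_energy_prefix(prefix))  # average accumulated energy
print(stays_non_negative(prefix))     # energy constraint on the prefix
```

On this cycle the mean payoff is zero while the average energy is positive, which illustrates why the two objectives are distinct: the average-energy value depends on the order in which gains and losses occur, not only on their sum.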

Modeling Mutual Influence in Multi-Agent Reinforcement Learning

Author: Wen Ying
Publication venue: UCL (University College London)
Publication date: 28/08/2020

In multi-agent systems (MAS), agents rarely act in isolation but tend to achieve their goals through interactions with other agents. To be able to achieve their ultimate goals, individual agents should actively evaluate the impacts on themselves of other agents' behaviors before they decide which actions to take. The impacts are reciprocal, and it is of great interest to model the mutual influence of agents' impacts on one another when they are observing the environment or taking actions in the environment. In this thesis, assuming that the agents are aware of each other's existence and their potential impact on themselves, I develop novel multi-agent reinforcement learning (MARL) methods that can measure the mutual influence between agents to shape learning. The first part of this thesis outlines the framework of recursive reasoning in deep multi-agent reinforcement learning. I hypothesize that it is beneficial for each agent to consider how other agents react to their behavior. I start from Probabilistic Recursive Reasoning (PR2) using level-1 reasoning and adopt variational Bayes methods to approximate the opponents' conditional policies. Each agent shapes the individual Q-value by marginalizing the conditional policies in the joint Q-value and finding the best response to improve their policies. I further extend PR2 to Generalized Recursive Reasoning (GR2) with different hierarchical levels of rationality. GR2 enables agents to possess various levels of thinking ability, thereby allowing higher-level agents to best respond to less sophisticated learners. The first part of the thesis shows that reducing the joint Q-value to an individual Q-value via explicit recursive reasoning benefits learning. In the second part of the thesis, in reverse, I measure the mutual influence by approximating the joint Q-value based on the individual Q-values.
I establish Q-DPP, an extension of the Determinantal Point Process (DPP) with partition constraints, and apply it to multi-agent learning as a function approximator for the centralized value function. An attractive property of using Q-DPP is that when it reaches the optimum value, it can offer a natural factorization of the centralized value function, representing both quality (maximizing reward) and diversity (different behaviors). In the third part of the thesis, I depart from the action-level mutual influence and build a policy-space meta-game to analyze agents' relationship between adaptive policies. I present a Multi-Agent Trust Region Learning (MATRL) algorithm that augments single-agent trust region policy optimization with a weak stable fixed point approximated by the policy-space meta-game. The algorithm aims to find a game-theoretic mechanism to adjust the policy optimization steps that force the learning of all agents toward the stable point

UCL Discovery
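
The marginalization step mentioned in the abstract above (an individual Q-value obtained from the joint Q-value and the opponent's conditional policy) can be illustrated in a single-state, two-agent, tabular setting. This is a sketch under stated assumptions, not the thesis's implementation: the thesis approximates the conditional policies with variational Bayes and deep networks, whereas here the tables `joint_q` and `cond_policy`, their values, and the function names are all hypothetical.

```python
def individual_q(joint_q, cond_policy):
    """Marginalize the opponent's action out of the joint Q-value.

    joint_q[a][b]     : agent's joint Q-value for own action a, opponent action b
    cond_policy[a][b] : assumed probability that the opponent plays b given a
    Returns a list q with q[a] = sum_b cond_policy[a][b] * joint_q[a][b].
    """
    return [
        sum(p * q for p, q in zip(policy_row, q_row))
        for q_row, policy_row in zip(joint_q, cond_policy)
    ]


def best_response(q_values):
    """Index of the action with the highest individual Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Hypothetical 2x2 example: action 0 = cooperate, action 1 = defect.
joint_q = [
    [3.0, 0.0],  # own action 0 against opponent actions 0 and 1
    [5.0, 1.0],  # own action 1 against opponent actions 0 and 1
]
# Assumed opponent model: the opponent tends to mirror the agent's action.
cond_policy = [
    [0.9, 0.1],
    [0.2, 0.8],
]

q = individual_q(joint_q, cond_policy)
print(q, best_response(q))
```

With an opponent model that conditions on the agent's own action, cooperation gets the higher individual Q-value here, even though defection dominates when the opponent's action is held fixed.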

Evolutionary games on graphs

Author: György Szabó
Gábor Fáth
Publication venue: 'Elsevier BV'
Publication date: 24/09/2007

Game theory is one of the key paradigms behind many scientific disciplines from biology to behavioral sciences to economics. In its evolutionary form and especially when the interacting agents are linked in a specific social network the underlying solution concepts and methods are very similar to those applied in non-equilibrium statistical physics. This review gives a tutorial-type overview of the field for physicists. The first three sections introduce the necessary background in classical and evolutionary game theory from the basic definitions to the most important results. The fourth section surveys the topological complications implied by non-mean-field-type social network structures in general. The last three sections discuss in detail the dynamic behavior of three prominent classes of models: the Prisoner's Dilemma, the Rock-Scissors-Paper game, and Competing Associations. The major theme of the review is in what sense and how the graph structure of interactions can modify and enrich the picture of long term behavioral patterns emerging in evolutionary games. Comment: Review, final version, 133 pages, 65 figures

arXiv.org e-Print Archive

Crossref

Tools and Algorithms for the Construction and Analysis of Systems

Publication venue: 'Springer Science and Business Media LLC'

This open access two-volume set constitutes the proceedings of the 27th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2021, which was held during March 27 – April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The total of 41 full papers presented in the proceedings was carefully reviewed and selected from 141 submissions. The volume also contains 7 tool papers; 6 Tool Demo papers, 9 SV-Comp Competition Papers. The papers are organized in topical sections as follows: Part I: Game Theory; SMT Verification; Probabilities; Timed Systems; Neural Networks; Analysis of Network Communication. Part II: Verification Techniques (not SMT); Case Studies; Proof Generation/Validation; Tool Papers; Tool Demo Papers; SV-Comp Tool Competition Papers

OAPEN Library

From multiscale modeling to metamodeling of geomechanics problems

Author: Wang Kun
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2019

In numerical simulations of geomechanics problems, a grand challenge consists of overcoming the difficulties in making accurate and robust predictions by revealing the true mechanisms in particle interactions, fluid flow inside pore spaces, and hydromechanical coupling effect between the solid and fluid constituents, from microscale to mesoscale, and to macroscale. While simulation tools incorporating subscale physics can provide detailed insights and accurate material properties to macroscale simulations via computational homogenizations, these numerical simulations are often too computationally demanding to be directly used across multiple scales. Recent breakthroughs of Artificial Intelligence (AI) via machine learning have great potential to overcome these barriers, as evidenced by their great success in many applications such as image recognition, natural language processing, and strategy exploration in games. The AI can achieve super-human performance level in a large number of applications, and accomplish tasks that were thought to be not feasible due to the limitations of human and previous computer algorithms. Yet, machine learning approaches can also suffer from overfitting, lack of interpretability, and lack of reliability. Thus the application of machine learning to the generation of accurate and reliable surrogate constitutive models for geomaterials with multiscale and multiphysics is not trivial. For this purpose, we propose to establish an integrated modeling process for automatic designing, training, validating, and falsifying of constitutive models, or "metamodeling". This dissertation focuses on our efforts in laying down step-by-step the necessary theoretical and technical foundations for the multiscale metamodeling framework. The first step is to develop multiscale hydromechanical homogenization frameworks for both bulk granular materials and granular interfaces, with their behaviors homogenized from subscale microstructural simulations.
For efficient simulations of field-scale geomechanics problems across more than two scales, we develop a hybrid data-driven method designed to capture the multiscale hydro-mechanical coupling effect of porous media with pores of various different sizes. By using sub-scale simulations to generate database to train material models, an offline homogenization procedure is used to replace the up-scaling procedure to generate path-dependent cohesive laws for localized physical discontinuities at both grain and specimen scales. To enable AI in taking over the trial-and-error tasks in the constitutive modeling process, we introduce a novel “metamodeling” framework that employs both graph theory and deep reinforcement learning (DRL) to generate accurate, physics compatible and interpretable surrogate machine learning models. The process of writing constitutive models is simplified as a sequence of forming graph edges with the goal of maximizing the model score (a function of accuracy, robustness and forward prediction quality). By using neural networks to estimate policies and state values, the computer agent is able to efficiently self-improve the constitutive models generated through self-playing. To overcome the obstacle of limited information in geomechanics, we improve the efficiency in utilization of experimental data by a multi-agent cooperative metamodeling framework to provide guidance on database generation and constitutive modeling at the same time. The modeler agent in the framework focuses on evaluating all modeling options (from domain experts’ knowledge or machine learning) in a directed multigraph of elasto-plasticity theory, and finding the optimal path that links the source of the directed graph (e.g., strain history) to the target (e.g., stress). 
Meanwhile, the data agent focuses on collecting data from real or virtual experiments, interacts with the modeler agent sequentially and generates the database for model calibration to optimize the prediction accuracy. Finally, we design a non-cooperative meta-modeling framework that focuses on automatically developing strategies that simultaneously generate experimental data to calibrate model parameters and explore weakness of a known constitutive model until the strengths and weaknesses of the constitutive law on the application range can be identified through competition. These tasks are enabled by a zero-sum reward system of the metamodeling game and robust adversarial reinforcement learning techniques

Columbia University Academic Commons

Dynamic Composition of Functions for Modular Learning

Author: Rosenbaum Clemens GB
Publication venue: ScholarWorks@UMass Amherst
Publication date: 26/03/2020

Compositionality is useful to reduce the complexity of machine learning models and increase their generalization capabilities, because new problems can be linked to the composition of existing solutions. Recent work has shown that compositional approaches can offer substantial benefits over a wide variety of tasks, from multi-task learning over visual question-answering to natural language inference, among others. A key variant is functional compositionality, where a meta-learner composes different (trainable) functions into complex machine learning models. In this thesis, I generalize existing approaches to functional compositionality under the umbrella of the routing paradigm, where trainable arbitrary functions are 'stacked' to form complex machine learning models.

ScholarWorks@UMass Amherst