A Nested Family of $k$-total Effective Rewards for Positional Games
We consider Gillette's two-person zero-sum stochastic games with perfect
information. For each $k \in \mathbb{Z}_+$ we introduce an effective reward function,
called $k$-total. For $k = 0$ and $k = 1$ this function is known as mean
payoff and total reward, respectively. We restrict our attention to the
deterministic case. For all $k$, we prove the existence of a saddle point which
can be realized by uniformly optimal pure stationary strategies. We also
demonstrate that $k$-total reward games can be embedded into $(k+1)$-total
reward games.
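As a rough illustration of how such a family can nest, the sketch below evaluates a finite-horizon proxy for the effective reward of a reward sequence. It assumes, which the abstract does not state, that the $k$-total reward is the Cesàro average of the $k$-fold iterated partial sums of the rewards, so that $k = 0$ gives the mean payoff and $k = 1$ the Cesàro-averaged total reward. The function name `k_total_prefix` and the truncation to a finite prefix are hypothetical conveniences, not the paper's definitions.

```python
from fractions import Fraction
from itertools import accumulate


def k_total_prefix(rewards, k):
    """Cesaro average of the k-fold iterated partial sums of a finite prefix.

    Assumption (not from the abstract): k = 0 is the plain average of the
    rewards, and each increment of k replaces the sequence by its running sums.
    """
    seq = list(rewards)
    for _ in range(k):
        seq = list(accumulate(seq))  # running sums of the previous level
    return Fraction(sum(seq), len(seq))


# A play that alternates rewards +1 and -1 for 100 steps.
play = [1, -1] * 50
print(k_total_prefix(play, 0))  # 0
print(k_total_prefix(play, 1))  # 1/2
```

On this alternating play the level-0 value is 0 while the level-1 value is 1/2, so under the stated assumption the higher level distinguishes plays that the mean payoff treats as equal.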
On Nash-Solvability of Finite Two-Person Tight Vector Game Forms
We consider finite two-person normal form games. The following four
properties of their game forms are equivalent: (i) Nash-solvability, (ii)
zero-sum-solvability, (iii) win-lose-solvability, and (iv) tightness. For (ii,
iii, iv) this was shown by Edmonds and Fulkerson in 1970. Then, in 1975, (i)
was added to this list and it was also shown that these results cannot be
generalized for the $n$-person case with $n > 2$. In 1990, tightness was extended
to vector game forms ($v$-forms) and it was shown that such $v$-tightness and
zero-sum-solvability are still equivalent, yet do not imply Nash-solvability.
These results are applicable to several classes of stochastic games with
perfect information. Here we suggest one more extension of tightness
introducing $v^+$-tight vector game forms ($v^+$-forms). We show that such
$v^+$-tightness and Nash-solvability are equivalent in case of weakly
rectangular game forms and positive cost functions. This result allows us to
reduce the so-called bi-shortest path conjecture to $v^+$-tightness of
$v^+$-forms. However, both (equivalent) statements remain open.
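Nash-solvability of a game form asks that every assignment of payoffs to its outcomes yields a game with a Nash equilibrium in pure strategies. The sketch below covers only the inner step: it lists the pure Nash equilibria of one finite two-person normal form game given as two payoff matrices. The name `pure_nash_equilibria` and the convention that both players maximize payoffs (rather than minimize costs) are assumptions made for illustration; the code does not test tightness or solvability of a game form.

```python
def pure_nash_equilibria(payoff_1, payoff_2):
    """Return all (row, column) pairs that are pure Nash equilibria.

    payoff_1[i][j] and payoff_2[i][j] are the payoffs of player 1 (rows)
    and player 2 (columns) when row i and column j are played.
    Both players are assumed to maximize.
    """
    rows = range(len(payoff_1))
    cols = range(len(payoff_1[0]))
    equilibria = []
    for i in rows:
        for j in cols:
            # Player 1 cannot gain by switching rows against column j.
            row_best = all(payoff_1[i][j] >= payoff_1[r][j] for r in rows)
            # Player 2 cannot gain by switching columns against row i.
            col_best = all(payoff_2[i][j] >= payoff_2[i][c] for c in cols)
            if row_best and col_best:
                equilibria.append((i, j))
    return equilibria


# Prisoner's Dilemma: strategy 0 = cooperate, 1 = defect.
pd_1 = [[3, 0], [5, 1]]
pd_2 = [[3, 5], [0, 1]]
print(pure_nash_equilibria(pd_1, pd_2))  # [(1, 1)]
```

A zero-sum game such as Matching Pennies returns an empty list, which is the kind of payoff assignment that a Nash-solvable game form must rule out.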
Average-energy games
Two-player quantitative zero-sum games provide a natural framework to
synthesize controllers with performance guarantees for reactive systems within
an uncontrollable environment. Classical settings include mean-payoff games,
where the objective is to optimize the long-run average gain per action, and
energy games, where the system has to avoid running out of energy.
We study average-energy games, where the goal is to optimize the long-run
average of the accumulated energy. We show that this objective arises naturally
in several applications, and that it yields interesting connections with
previous concepts in the literature. We prove that deciding the winner in such
games is in NP ∩ coNP and at least as hard as solving mean-payoff games,
and we establish that memoryless strategies suffice to win. We also consider
the case where the system has to minimize the average-energy while maintaining
the accumulated energy within predefined bounds at all times: this corresponds
to operating with a finite-capacity storage for energy. We give results for
one-player and two-player games, and establish complexity bounds and memory
requirements.
Comment: In Proceedings GandALF 2015, arXiv:1509.0685
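The three objectives named in this abstract can be compared on a finite prefix of a play. The sketch below is an illustration only: the function names, the zero initial energy, and the restriction to a finite prefix (the games themselves are defined through long-run limits) are assumptions.

```python
from itertools import accumulate


def mean_payoff_prefix(weights):
    """Average gain per action over a finite prefix."""
    return sum(weights) / len(weights)


def energy_levels(weights, initial=0):
    """Accumulated energy after each action, starting from `initial`."""
    return [initial + total for total in accumulate(weights)]


def average_energy_prefix(weights, initial=0):
    """Average of the accumulated energy over a finite prefix."""
    levels = energy_levels(weights, initial)
    return sum(levels) / len(levels)


def within_bounds(weights, lower, upper, initial=0):
    """True if the accumulated energy stays in [lower, upper] at every step."""
    return all(lower <= level <= upper for level in energy_levels(weights, initial))


# A play repeating the cycle of weights +2, -1, -1 four times.
play = [2, -1, -1] * 4
print(mean_payoff_prefix(play))     # 0.0
print(average_energy_prefix(play))  # 1.0
print(within_bounds(play, 0, 2))    # True
```

The example cycle has mean payoff 0 but average energy 1, which shows that the average-energy objective separates plays that the mean-payoff objective cannot.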
Modeling Mutual Influence in Multi-Agent Reinforcement Learning
In multi-agent systems (MAS), agents rarely act in isolation but tend to achieve their goals through interactions with other agents. To achieve their ultimate goals, individual agents should actively evaluate how other agents' behaviors affect them before they decide which actions to take. The impacts are reciprocal, and it is of great interest to model the mutual influence that agents have on one another when they are observing the environment or taking actions in it. In this thesis, assuming that the agents are aware of each other's existence and their potential impact on themselves, I develop novel multi-agent reinforcement learning (MARL) methods that can measure the mutual influence between agents to shape learning. The first part of this thesis outlines the framework of recursive reasoning in deep multi-agent reinforcement learning. I hypothesize that it is beneficial for each agent to consider how other agents react to its behavior. I start from Probabilistic Recursive Reasoning (PR2) using level-1 reasoning and adopt variational Bayes methods to approximate the opponents' conditional policies. Each agent shapes its individual Q-value by marginalizing the conditional policies in the joint Q-value and finds the best response to improve its policy. I further extend PR2 to Generalized Recursive Reasoning (GR2) with different hierarchical levels of rationality. GR2 enables agents to possess various levels of thinking ability, thereby allowing higher-level agents to best respond to less sophisticated learners. The first part of the thesis shows that reducing the joint Q-value to an individual Q-value via explicit recursive reasoning benefits learning. In the second part of the thesis, conversely, I measure the mutual influence by approximating the joint Q-value based on the individual Q-values.
I establish Q-DPP, an extension of the Determinantal Point Process (DPP) with partition constraints, and apply it to multi-agent learning as a function approximator for the centralized value function. An attractive property of using Q-DPP is that when it reaches the optimum value, it can offer a natural factorization of the centralized value function, representing both quality (maximizing reward) and diversity (different behaviors). In the third part of the thesis, I depart from the action-level mutual influence and build a policy-space meta-game to analyze agents' relationship between adaptive policies. I present a Multi-Agent Trust Region Learning (MATRL) algorithm that augments single-agent trust region policy optimization with a weak stable fixed point approximated by the policy-space meta-game. The algorithm aims to find a game-theoretic mechanism to adjust the policy optimization steps that force the learning of all agents toward the stable point
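The marginalization step described for PR2 can be sketched in a tabular, single-state setting. This is not the thesis's implementation, which is deep and uses variational approximations; the table layout, the function names `individual_q` and `best_response`, and the example numbers are all hypothetical.

```python
def individual_q(joint_q, opponent_policy):
    """Marginalize the opponent out of a joint Q-table for one fixed state.

    joint_q[a][b]: joint Q-value when the agent plays a and the opponent plays b.
    opponent_policy[a][b]: estimated probability that the opponent plays b
    given that the agent plays a (the conditional policy).
    """
    return [
        sum(prob * value for prob, value in zip(opponent_policy[a], joint_q[a]))
        for a in range(len(joint_q))
    ]


def best_response(joint_q, opponent_policy):
    """Index of the agent's action with the highest marginalized Q-value."""
    values = individual_q(joint_q, opponent_policy)
    return max(range(len(values)), key=values.__getitem__)


joint_q = [[4.0, 0.0], [1.0, 2.0]]
opponent_policy = [[0.25, 0.75], [0.5, 0.5]]
print(individual_q(joint_q, opponent_policy))   # [1.0, 1.5]
print(best_response(joint_q, opponent_policy))  # 1
```

Action 0 has the highest single joint value (4.0), but once the opponent's likely reaction is taken into account action 1 becomes the better choice, which is the effect the recursive reasoning is meant to capture.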
Evolutionary games on graphs
Game theory is one of the key paradigms behind many scientific disciplines
from biology to behavioral sciences to economics. In its evolutionary form and
especially when the interacting agents are linked in a specific social network
the underlying solution concepts and methods are very similar to those applied
in non-equilibrium statistical physics. This review gives a tutorial-type
overview of the field for physicists. The first three sections introduce the
necessary background in classical and evolutionary game theory from the basic
definitions to the most important results. The fourth section surveys the
topological complications implied by non-mean-field-type social network
structures in general. The last three sections discuss in detail the dynamic
behavior of three prominent classes of models: the Prisoner's Dilemma, the
Rock-Scissors-Paper game, and Competing Associations. The major theme of the
review is in what sense and how the graph structure of interactions can modify
and enrich the picture of long term behavioral patterns emerging in
evolutionary games.
Comment: Review, final version, 133 pages, 65 figures
Tools and Algorithms for the Construction and Analysis of Systems
This open access two-volume set constitutes the proceedings of the 27th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2021, which was held during March 27 – April 1, 2021, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2021. The conference was planned to take place in Luxembourg and changed to an online format due to the COVID-19 pandemic. The total of 41 full papers presented in the proceedings was carefully reviewed and selected from 141 submissions. The volume also contains 7 tool papers; 6 Tool Demo papers, 9 SV-Comp Competition Papers. The papers are organized in topical sections as follows: Part I: Game Theory; SMT Verification; Probabilities; Timed Systems; Neural Networks; Analysis of Network Communication. Part II: Verification Techniques (not SMT); Case Studies; Proof Generation/Validation; Tool Papers; Tool Demo Papers; SV-Comp Tool Competition Papers
From multiscale modeling to metamodeling of geomechanics problems
In numerical simulations of geomechanics problems, a grand challenge consists of overcoming the difficulties in making accurate and robust predictions by revealing the true mechanisms in particle interactions, fluid flow inside pore spaces, and the hydromechanical coupling effect between the solid and fluid constituents, from microscale to mesoscale to macroscale. While simulation tools incorporating subscale physics can provide detailed insights and accurate material properties to macroscale simulations via computational homogenization, these numerical simulations are often too computationally demanding to be used directly across multiple scales. Recent breakthroughs in Artificial Intelligence (AI) via machine learning have great potential to overcome these barriers, as evidenced by their success in many applications such as image recognition, natural language processing, and strategy exploration in games. AI can achieve super-human performance in a large number of applications and accomplish tasks that were thought to be infeasible due to the limitations of humans and previous computer algorithms. Yet machine learning approaches can also suffer from overfitting, lack of interpretability, and lack of reliability. Thus the application of machine learning to the generation of accurate and reliable surrogate constitutive models for multiscale, multiphysics geomaterials is not trivial. For this purpose, we propose to establish an integrated modeling process for automatic designing, training, validating, and falsifying of constitutive models, or "metamodeling". This dissertation focuses on our efforts in laying down, step by step, the necessary theoretical and technical foundations for the multiscale metamodeling framework.
The first step is to develop multiscale hydromechanical homogenization frameworks for both bulk granular materials and granular interfaces, with their behaviors homogenized from subscale microstructural simulations. For efficient simulations of field-scale geomechanics problems across more than two scales, we develop a hybrid data-driven method designed to capture the multiscale hydro-mechanical coupling effect of porous media with pores of various different sizes. By using sub-scale simulations to generate database to train material models, an offline homogenization procedure is used to replace the up-scaling procedure to generate path-dependent cohesive laws for localized physical discontinuities at both grain and specimen scales.
To enable AI in taking over the trial-and-error tasks in the constitutive modeling process, we introduce a novel “metamodeling” framework that employs both graph theory and deep reinforcement learning (DRL) to generate accurate, physics compatible and interpretable surrogate machine learning models. The process of writing constitutive models is simplified as a sequence of forming graph edges with the goal of maximizing the model score (a function of accuracy, robustness and forward prediction quality). By using neural networks to estimate policies and state values, the computer agent is able to efficiently self-improve the constitutive models generated through self-playing.
To overcome the obstacle of limited information in geomechanics, we improve the efficiency in utilization of experimental data by a multi-agent cooperative metamodeling framework to provide guidance on database generation and constitutive modeling at the same time. The modeler agent in the framework focuses on evaluating all modeling options (from domain experts’ knowledge or machine learning) in a directed multigraph of elasto-plasticity theory, and finding the optimal path that links the source of the directed graph (e.g., strain history) to the target (e.g., stress). Meanwhile, the data agent focuses on collecting data from real or virtual experiments, interacts with the modeler agent sequentially and generates the database for model calibration to optimize the prediction accuracy. Finally, we design a non-cooperative meta-modeling framework that focuses on automatically developing strategies that simultaneously generate experimental data to calibrate model parameters and explore weakness of a known constitutive model until the strengths and weaknesses of the constitutive law on the application range can be identified through competition. These tasks are enabled by a zero-sum reward system of the metamodeling game and robust adversarial reinforcement learning techniques
Dynamic Composition of Functions for Modular Learning
Compositionality is useful to reduce the complexity of machine learning models and increase their generalization capabilities, because new problems can be linked to the composition of existing solutions. Recent work has shown that compositional approaches can offer substantial benefits over a wide variety of tasks, from multi-task learning and visual question answering to natural language inference, among others. A key variant is functional compositionality, where a meta-learner composes different (trainable) functions into complex machine learning models. In this thesis, I generalize existing approaches to functional compositionality under the umbrella of the routing paradigm, where trainable arbitrary functions are 'stacked' to form complex machine learning models.