Open-ended Learning in Symmetric Zero-sum Games
Zero-sum games such as chess and poker are, abstractly, functions that
evaluate pairs of agents, for example labeling them `winner' and `loser'. If
the game is approximately transitive, then self-play generates sequences of
agents of increasing strength. However, nontransitive games, such as
rock-paper-scissors, can exhibit strategic cycles, and there is no longer a
clear objective -- we want agents to increase in strength, but against whom is
unclear. In this paper, we introduce a geometric framework for formulating
agent objectives in zero-sum games, in order to construct adaptive sequences of
objectives that yield open-ended learning. The framework allows us to reason
about population performance in nontransitive games, and enables the
development of a new algorithm (rectified Nash response, PSRO_rN) that uses
game-theoretic niching to construct diverse populations of effective agents,
producing a stronger set of agents than existing algorithms. We apply PSRO_rN
to two highly nontransitive resource allocation games and find that PSRO_rN
consistently outperforms the existing alternatives. Comment: ICML 2019, final version
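The strategic cycle mentioned in this abstract can be made concrete with a minimal sketch. It is not taken from the paper: the payoff matrix below is the textbook rock-paper-scissors matrix, and the helper names (`beats`, `expected_payoff`, `is_antisymmetric`) are hypothetical. The sketch only shows why "stronger" is ill-defined here; it does not implement PSRO_rN.

```python
# Illustrative sketch (not the paper's code): rock-paper-scissors as an
# antisymmetric payoff matrix for a symmetric zero-sum game.
STRATEGIES = ["rock", "paper", "scissors"]

# PAYOFF[i][j] is the payoff to strategy i when it meets strategy j:
# +1 for a win, -1 for a loss, 0 for a tie.
PAYOFF = [
    [0, -1, 1],   # rock loses to paper, beats scissors
    [1, 0, -1],   # paper beats rock, loses to scissors
    [-1, 1, 0],   # scissors loses to rock, beats paper
]


def is_antisymmetric(matrix):
    """Symmetric zero-sum games satisfy matrix[i][j] == -matrix[j][i]."""
    n = len(matrix)
    return all(matrix[i][j] == -matrix[j][i] for i in range(n) for j in range(n))


def beats(i, j):
    """True if pure strategy i wins against pure strategy j."""
    return PAYOFF[i][j] > 0


def expected_payoff(row_mix, col_mix):
    """Expected payoff of mixed strategy row_mix against col_mix."""
    return sum(
        row_mix[i] * PAYOFF[i][j] * col_mix[j]
        for i in range(len(PAYOFF))
        for j in range(len(PAYOFF))
    )


if __name__ == "__main__":
    # The cycle: paper > rock, scissors > paper, rock > scissors.
    print(beats(1, 0), beats(2, 1), beats(0, 2))
    uniform = [1 / 3, 1 / 3, 1 / 3]
    for k, name in enumerate(STRATEGIES):
        pure = [1.0 if i == k else 0.0 for i in range(3)]
        print(name, "vs uniform:", expected_payoff(pure, uniform))
```

Every pure strategy beats one opponent and loses to another, so no ordering by strength exists, while the uniform mixture earns zero against each of them.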
A new algorithm for approaching Nash equilibrium and Kalai Smorodinsky solution
In the present paper, a new formulation of Nash games is proposed for solving general multi-objective optimization problems. The main idea of this approach is to split the optimization variables between two players, which allows us to determine their strategies numerically. The first player minimizes his cost function using the variables of the first table P; the second player uses the second table Q. The original contribution of this work concerns the construction of the two allocation tables that lead to a Nash equilibrium on the Pareto front. In addition, we search for P and Q that lead to a solution which is both a Nash equilibrium and a Kalai Smorodinsky solution. For this, we proposed and successfully tested two algorithms that calculate P, Q and their associated Nash equilibrium, using an extension of the Normal Boundary Intersection (NBI) approach.
Qualitative Characteristics and Quantitative Measures of Solution's Reliability in Discrete Optimization: Traditional Analytical Approaches, Innovative Computational Methods and Applicability
The purpose of this thesis is twofold. The first and major part is devoted to
sensitivity analysis of various discrete optimization problems while the second
part addresses methods applied for calculating measures of solution stability
and solving multicriteria discrete optimization problems.
Despite numerous approaches to stability analysis of discrete optimization
problems, two major directions can be singled out: quantitative and qualitative.
Qualitative sensitivity analysis is conducted for multicriteria discrete optimization
problems with minisum, minimax and minimin partial criteria. The main
results obtained here are necessary and sufficient conditions for different stability
types of optimal solutions (or a set of optimal solutions) of the considered
problems.
Within the quantitative direction, various measures of solution
stability are investigated. A formula for a quantitative characteristic called
the stability radius is obtained for the generalized equilibrium situation invariant
to changes of game parameters in the case of the Hölder metric. Quality of the
problem solution can also be described in terms of robustness analysis. In this
work the concepts of accuracy and robustness tolerances are presented for a
strategic game with a finite number of players where initial coefficients (costs)
of linear payoff functions are subject to perturbations.
Investigation of the stability radius also aims to devise methods for its calculation.
A new metaheuristic approach is derived for calculating the stability
radius of an optimal solution to the shortest path problem. The main advantage
of the developed method is that it is potentially applicable to
calculating stability radii of NP-hard problems.
The last chapter of the thesis focuses on deriving innovative methods based
on an interactive optimization approach for solving multicriteria combinatorial
optimization problems. The key idea of the proposed approach is to utilize
a parameterized achievement scalarizing function for solution calculation and
to direct the interactive procedure by changing the weighting coefficients of this function.
In order to illustrate the introduced ideas, a decision making process is
simulated for a three-objective median location problem.
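How changing weighting coefficients steers an interactive procedure can be sketched as follows. This is a minimal illustration, not the thesis's method: it uses the classic Chebyshev-type achievement scalarizing function with a small augmentation term rather than the parameterized variant studied in the thesis, and the candidate set, reference point, and function names are hypothetical.

```python
# Illustrative sketch: a classic (Wierzbicki-style) achievement scalarizing
# function for minimization. This is NOT the thesis's parameterized variant;
# all data and names below are made up for illustration.

def achievement(objectives, reference, weights, rho=1e-6):
    """max_i w_i * (f_i - r_i) plus a small augmentation term rho * sum_i w_i * (f_i - r_i)."""
    terms = [w * (f - r) for f, r, w in zip(objectives, reference, weights)]
    return max(terms) + rho * sum(terms)


def best_candidate(candidates, reference, weights):
    """Pick the candidate objective vector with the smallest achievement value."""
    return min(candidates, key=lambda f: achievement(f, reference, weights))


# Hypothetical objective vectors of three feasible solutions (all minimized).
CANDIDATES = [(1, 5, 5), (5, 1, 5), (3, 3, 3)]
REFERENCE = (0, 0, 0)  # hypothetical reference (aspiration) point

if __name__ == "__main__":
    # Equal weights favor the balanced solution.
    print(best_candidate(CANDIDATES, REFERENCE, (1, 1, 1)))
    # Raising the weight on the first objective moves the choice toward
    # the solution that is best in that objective.
    print(best_candidate(CANDIDATES, REFERENCE, (5, 1, 1)))
```

In an interactive setting, the decision maker would inspect the returned solution and adjust the weights before the next iteration.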
The concepts, models, and ideas collected and analyzed in this thesis create
good and relevant grounds for developing more complicated and integrated
models of postoptimal analysis and solving the most computationally challenging
problems related to it.
Deception in Game Theory: A Survey and Multiobjective Model
Game theory is the study of mathematical models of conflict. It provides tools for analyzing dynamic interactions between multiple agents and (in some cases) across multiple interactions. This thesis contains two scholarly articles. The first article is a survey of game-theoretic models of deception. The survey describes the ways researchers use game theory to measure the practicality of deception, model the mechanisms for performing deception, analyze the outcomes of deception, and respond to, or mitigate the effects of, deception. The survey highlights several gaps in the literature. One important gap concerns the benefit-cost-risk trade-off made during deception planning. To address this research gap, the second article introduces a novel approach for modeling these trade-offs. The approach uses a game-theoretic model of deception to define a new multiobjective optimization problem called the deception design problem (DDP). Solutions to the DDP provide courses of deceptive action that are efficient in terms of their benefit, cost, and risk to the deceiver. A case study based on the output of an air-to-air combat simulator demonstrates the DDP in a 7 x 7 normal form game. This approach is the first to evaluate benefit, cost, and risk in a single game-theoretic model of deception.
Opponent Learning Awareness and Modelling in Multi-Objective Normal Form Games
Many real-world multi-agent interactions consider multiple distinct criteria,
i.e. the payoffs are multi-objective in nature. However, the same
multi-objective payoff vector may lead to different utilities for each
participant. Therefore, it is essential for an agent to learn about the
behaviour of other agents in the system. In this work, we present the first
study of the effects of such opponent modelling on multi-objective multi-agent
interactions with non-linear utilities. Specifically, we consider two-player
multi-objective normal form games with non-linear utility functions under the
scalarised expected returns optimisation criterion. We contribute novel
actor-critic and policy gradient formulations to allow reinforcement learning
of mixed strategies in this setting, along with extensions that incorporate
opponent policy reconstruction and learning with opponent learning awareness
(i.e., learning while considering the impact of one's policy when anticipating
the opponent's learning step). Empirical results in five different MONFGs
demonstrate that opponent learning awareness and modelling can drastically
alter the learning dynamics in this setting. When equilibria are present,
opponent modelling can confer significant benefits on agents that implement it.
When there are no Nash equilibria, opponent learning awareness and modelling
allow agents to still converge to meaningful solutions that approximate
equilibria. Comment: Under review since 14 November 202
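The scalarised expected returns (SER) criterion named in this abstract can be illustrated with a minimal sketch. Under SER, the utility function is applied to the expected payoff vector, which for a non-linear utility generally differs from the expected utility of individual outcomes. The payoff table, the product utility, and the function names below are hypothetical and not taken from the paper.

```python
# Illustrative sketch: SER versus expected scalarised returns (ESR) in a
# two-player, two-objective normal form game. All data here are made up.

# PAYOFFS[i][j] is the payoff vector when the row player plays i and the
# column player plays j (both players receive the same vector here).
PAYOFFS = [
    [(4, 0), (2, 2)],
    [(2, 2), (0, 4)],
]


def product_utility(vector):
    """A hypothetical non-linear utility: the product of the two objectives."""
    return vector[0] * vector[1]


def expected_vector(payoffs, row_mix, col_mix):
    """Expected payoff vector under independent mixed strategies."""
    n_obj = len(payoffs[0][0])
    totals = [0.0] * n_obj
    for i, p in enumerate(row_mix):
        for j, q in enumerate(col_mix):
            for k in range(n_obj):
                totals[k] += p * q * payoffs[i][j][k]
    return totals


def ser(payoffs, row_mix, col_mix, utility):
    """Scalarised expected returns: utility of the expected vector."""
    return utility(expected_vector(payoffs, row_mix, col_mix))


def esr(payoffs, row_mix, col_mix, utility):
    """Expected scalarised returns: expectation of per-outcome utilities."""
    return sum(
        p * q * utility(payoffs[i][j])
        for i, p in enumerate(row_mix)
        for j, q in enumerate(col_mix)
    )


if __name__ == "__main__":
    mix = [0.5, 0.5]
    print("SER:", ser(PAYOFFS, mix, mix, product_utility))
    print("ESR:", esr(PAYOFFS, mix, mix, product_utility))
```

With uniform mixing the expected vector is (2, 2), so SER gives 4 while ESR gives 2; the gap is what makes the choice of optimisation criterion matter for non-linear utilities.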
Game theory approach to competitive economic dynamics
This thesis deals with both non-cooperative and cooperative games in order to apply the mathematical theory to competitive dynamics arising from economics, particularly quantity competition in oligopolies and pollution reduction models in International Environmental Agreements (IEAs).