1,226,303 research outputs found

Note on a Zero-Sum Problem

Author: Gao W.
Publication venue: Published by Elsevier Inc.
Publication date: 31/08/2001

Adaptive Consensus and Parameter Estimation of Multi-Agent Systems with An Uncertain Leader

Author: Meng Xiangyu
Wang Shimin
Publication date: 01/01/2020

In this note, the problem of simultaneous leader-following consensus and parameter estimation is studied for a class of multi-agent systems subject to an uncertain leader system. The leader system is described by a sum of sinusoids with unknown amplitudes, frequencies and phases. A distributed adaptive observer is established for each agent to estimate the unknown frequencies of the leader. It is shown that if the signal of the leader is sufficiently rich, the estimation errors of the unknown frequencies converge to zero asymptotically for all the agents. Based on the designed distributed adaptive observer, a distributed adaptive control law is synthesized for each agent to solve the leader-following consensus problem. Comment: 8 pag

arXiv.org e-Print Archive

Louisiana State University

Corruption-robust offline two-player zero-sum Markov games

Author: Mandal Debmalya
Nika Andi
Radanovic Goran
Singla Adish
Publication venue: PMLR
Publication date: 01/01/2024

We study data corruption robustness in offline two-player zero-sum Markov games. Given a dataset of realized trajectories of two players, an adversary is allowed to modify an ε-fraction of it. The learner’s goal is to identify an approximate Nash Equilibrium policy pair from the corrupted data. We consider this problem in linear Markov games under different degrees of data coverage and corruption. We start by providing an information-theoretic lower bound on the suboptimality gap of any learner. Next, we propose robust versions of the Pessimistic Minimax Value Iteration algorithm (Zhong et al., 2022), both under coverage on the corrupted data and under coverage only on the clean data, and show that they achieve (near)-optimal suboptimality gap bounds with respect to ε. We note that we are the first to provide such a characterization of the problem of learning approximate Nash Equilibrium policies in offline two-player zero-sum Markov games under data corruption

Warwick Research Archives Portal Repository

Controlling a Random Population is EXPTIME-hard

Author: Mascle Corto
Shirmohammadi Mahsa
Totzke Patrick
Publication date: 13/09/2019

Bertrand et al. [1] (LMCS 2019) describe two-player zero-sum games in which one player tries to achieve a reachability objective in n games (on the same finite arena) simultaneously by broadcasting actions, and where the opponent has full control of resolving non-deterministic choices. They show EXPTIME completeness for the question if such games can be won for every number n of games. We consider the almost-sure variant in which the opponent randomizes their actions, and where the player tries to achieve the reachability objective eventually with probability one. The lower bound construction in [1] does not directly carry over to this randomized setting. In this note we show EXPTIME hardness for the almost-sure problem by reduction from Countdown Games

arXiv.org e-Print Archive

University of Liverpool Repository

A note on large deviations for interacting particle dynamics for finding mixed equilibria in zero-sum games

Author: Nilsson Viktor
Nyquist Pierre
Publication date: 15/07/2022

Finding equilibria points in continuous minimax games has become a key problem within machine learning, in part due to its connection to the training of generative adversarial networks. Because of existence and robustness issues, recent developments have shifted from pure equilibria to focusing on mixed equilibria points. In this note we consider a method proposed by Domingo-Enrich et al. for finding mixed equilibria in two-layer zero-sum games. The method is based on entropic regularisation and the two competing strategies are represented by two sets of interacting particles. We show that the sequence of empirical measures of the particle system satisfies a large deviation principle as the number of particles grows to infinity, and how this implies convergence of the empirical measure and the associated Nikaidô-Isoda error, complementing existing law of large numbers results. Comment: Revised section

arXiv.org e-Print Archive