320 research outputs found

Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity

Author: Başar Tamer
Kakade Sham M.
Yang Lin F.
Zhang Kaiqing
Publication date: 14/07/2020

Model-based reinforcement learning (RL), which finds an optimal policy using an empirical model, has long been recognized as one of the cornerstones of RL. It is especially suitable for multi-agent RL (MARL), as it naturally decouples the learning and the planning phases, and avoids the non-stationarity problem when all agents are improving their policies simultaneously using samples. Though model-based MARL algorithms are intuitive, easy to implement, and widely used, their sample complexity has not been fully investigated. In this paper, our goal is to address the fundamental question about its sample complexity. We study arguably the most basic MARL setting: two-player discounted zero-sum Markov games, given only access to a generative model. We show that model-based MARL achieves a sample complexity of $\tilde O(|S||A||B|(1-\gamma)^{-3}\epsilon^{-2})$ for finding the Nash equilibrium (NE) value up to some $\epsilon$ error, and the $\epsilon$-NE policies with a smooth planning oracle, where $\gamma$ is the discount factor, and $S,A,B$ denote the state space and the action spaces for the two agents. We further show that such a sample bound is minimax-optimal (up to logarithmic factors) if the algorithm is reward-agnostic, where the algorithm queries state transition samples without reward knowledge, by establishing a matching lower bound. This is in contrast to the usual reward-aware setting, with a $\tilde\Omega(|S|(|A|+|B|)(1-\gamma)^{-3}\epsilon^{-2})$ lower bound, where this model-based approach is near-optimal with only a gap on the $|A|,|B|$ dependence. Our results not only demonstrate the sample-efficiency of this basic model-based approach in MARL, but also elaborate on the fundamental tradeoff between its power (easily handling the more challenging reward-agnostic case) and limitation (less adaptive and suboptimal in $|A|,|B|$), which arises particularly in the multi-agent context.

arXiv.org e-Print Archive

Effective action for Einstein-Maxwell theory at order RF**4

Author: Bastianelli F
Başar G Dunne G V
Benincasa P
Bern Z
Cachazo F Svrcek P
Christian Schubert
Dunne G V
Heisenberg W Euler H
José Manuel Dávila
Weisskopf V
Publication venue: IOP Publishing
Publication date: 11/12/2009

We use a recently derived integral representation of the one-loop effective action in Einstein-Maxwell theory for an explicit calculation of the part of the effective action containing the information on the low energy limit of the five-point amplitudes involving one graviton, four photons and either a scalar or spinor loop. All available identities are used to get the result into a relatively compact form. Comment: 13 pages, no figures

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Multi-Layer Cyber-Physical Security and Resilience for Smart Grid

Author: CH Hauser
DH Lorenz
Drew Fudenberg
E Santacana
F Kuipers
FF Wu
G Xue
GN Ericsson
J Casey
K Tomsovic
Mohammad Hossein Manshaei
MS Amin
P McDaniel
P Mieghem Van
Q Zhu
R Hou
S Greengard
S Rass
T Başar
Publication date: 29/09/2018

The smart grid is a large-scale complex system that integrates communication technologies with the physical layer operation of the energy systems. Security and resilience mechanisms by design are important for guaranteeing the operation of the system. This chapter provides a layered perspective of smart grid security and discusses game and decision theory as a tool to model the interactions among system components and the interaction between attackers and the system. We discuss game-theoretic applications and challenges in the design of cross-layer robust and resilient controllers, secure network routing protocols at the data communication and networking layers, and the challenges of information security at the management layer of the grid. The chapter also discusses future directions for using game-theoretic tools to address multi-layer security issues in the smart grid. Comment: 16 pages

arXiv.org e-Print Archive

Crossref

Autonomous Robust Skill Generation Using Reinforcement Learning with Plant Variation

Author: Asada M.
Başar T.
Bertsekas D. P.
Kei Senda
Miyazaki F.
Senda K.
Skaar S. B.
Sutton R. S.
Yurika Tani
Publication venue: SAGE Publishing
Publication date: 01/04/2014

This paper discusses an autonomous space robot for truss structure assembly using reinforcement learning. It is difficult for a space robot to complete contact tasks in a real environment, for example a peg-in-hole task, because of the error between the real environment and the controller model. To solve this problem, we propose an autonomous space robot that obtains proficient and robust skills by overcoming this error to complete a task. The proposed approach develops skills by reinforcement learning that considers plant variation, that is, modeling error. Numerical simulations and experiments show that the proposed method is useful in real environments.

Crossref

Directory of Open Access Journals

Kyoto University Research Information Repository

A Gauge-Gravity Relation in the One-loop Effective Action

Author: Adamchik V S
Avramidi I G
Badger S
Barnes E W
Bastianelli F
Bern Z
Dixon L J
Dunne G V
Elizalde E
Gerald V Dunne
Gökçe Başar
Itzykson C
Kuzenko S M
Simon B
Tseytlin A A
Vigneras M-F
Publication venue: IOP Publishing
Publication date: 07/12/2009

We identify an unusual new gauge-gravity relation: the one-loop effective action for a massive spinor in $2n$ dimensional AdS space is expressed in terms of precisely the same function [a certain multiple gamma function] as the one-loop effective action for a massive charged scalar in $4n$ dimensions in a maximally symmetric background electromagnetic field [one for which the eigenvalues of $F_{\mu\nu}$ are maximally degenerate, corresponding in 4 dimensions to a self-dual field, equivalently to a field of definite helicity], subject to the identification $F^2 \leftrightarrow \Lambda$, where $\Lambda$ is the gravitational curvature. Since these effective actions generate the low energy limit of all one-loop multi-leg graviton or gauge amplitudes, this implies a nontrivial gauge-gravity relation at the non-perturbative level and at the amplitude level. Comment: 6 pages

arXiv.org e-Print Archive

Crossref

Almost convergence and generalized weighted mean II

Author: A Sönmez
A Wilansky
AM Jarrah
B Altay
E Malkowsky
E Öztürk
F Başar
GG Lorentz
HI Miller
J Boos
JA Siddiqi
JP Duran
JP King
K Kayaduman
M Candan
M Candan
M Kirişçi
M Mursaleen
Murat Kirişci
N Şimsek
P-N Ng
Publication venue: Springer Science and Business Media LLC

Crossref

On the Deformation of a Hyperelastic Tube Due to Steady Viscous Flow Within

Author: A Lazopoulos
A. I. Lurie
AH Shapiro
AK Abramian
AV Porubov
CL Dym
DA Indeitsev
F Gay-Balmaz
H Kraus
H Moon
IS Liu
J Bonet
JB Grotberg
LA Mihai
M Heil
MA Hussain
MK Raj
N. I. Muskhelishvili
NJ Wagner
RB Bird
RJ Whittaker
RL Bisplinghoff
Ronald L. Panton
RW Ogden
S Chien
S Čanić
SB Elbaz
Vishal Anand
Wilhelm Flügge
Yavuz Başar
Yuan-Cheng Fung
Publication venue: Springer Science and Business Media LLC
Publication date: 11/01/2019

In this chapter, we analyze the steady-state microscale fluid-structure interaction (FSI) between a generalized Newtonian fluid and a hyperelastic tube. Physiological flows, especially in hemodynamics, serve as primary examples of such FSI phenomena. The small scale of the physical system renders the flow field, under the power-law rheological model, amenable to a closed-form solution using the lubrication approximation. On the other hand, negligible shear stresses on the walls of a long vessel allow the structure to be treated as a pressure vessel. The constitutive equation for the microtube is prescribed via the strain energy functional for an incompressible, isotropic Mooney-Rivlin material. We employ both the thin- and thick-walled formulations of the pressure vessel theory, and derive the static relation between the pressure load and the deformation of the structure. We harness the latter to determine the flow rate-pressure drop relationship for non-Newtonian flow in thin- and thick-walled soft hyperelastic microtubes. Through illustrative examples, we discuss how a hyperelastic tube supports the same pressure load as a linearly elastic tube with smaller deformation, thus requiring a higher pressure drop across itself to maintain a fixed flow rate. Comment: 19 pages, 3 figures, Springer book class; v2: minor revisions, final form of invited contribution to the Springer volume entitled "Dynamical Processes in Generalized Continua and Structures" (in honour of Academician D.I. Indeitsev), eds. H. Altenbach, A. Belyaev, V. A. Eremeyev, A. Krivtsov and A. V. Porubov

arXiv.org e-Print Archive

Crossref

Inhomogeneous Condensates in the Thermodynamics of the Chiral NJL_2 model

Author: D. F. Lawden
F. Gesztesy
Gerald V. Dunne
Gökçe Başar
J. Feinberg
J. L. Kneur
K. Langfeld
K. Rajagopal
L. A. Dickey
Michael Thies
P. de Forcrand
P. G. de Gennes
R. Peierls
V. Schön
Publication venue: American Physical Society (APS)
Publication date: 10/03/2009

We analyze the thermodynamical properties, at finite density and nonzero temperature, of the (1+1)-dimensional chiral Gross-Neveu model (the NJL_2 model), using the exact inhomogeneous (crystalline) condensate solutions to the gap equation. The continuous chiral symmetry of the model plays a crucial role, and the thermodynamics leads to a broken phase with a periodic spiral condensate, the "chiral spiral", as a thermodynamically preferred limit of the more general "twisted kink crystal" solution of the gap equation. This situation should be contrasted with the Gross-Neveu model, which has a discrete chiral symmetry, and for which the phase diagram has a crystalline phase with a periodic kink crystal. We use a combination of analytic, numerical and Ginzburg-Landau techniques to study various parts of the phase diagram. Comment: 28 pages, 13 figures

arXiv.org e-Print Archive

Crossref

Four-dimensional generalized difference matrix and some double sequence spaces

Author: B Altay
C Çakan
CR Adams
F Başar
F Móricz
HJ Hamilton
J Boos
M Mursaleen
M Yeşilkayagil
M Zeltser
Orhan Tuǧ
RC Cooke
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

In this study, I introduce the new double sequence spaces B(Mu), B(Cp), B(Cbp), B(Cr) and B(Lq) as the domains of the four-dimensional generalized difference matrix B(r,s,t,u) in the spaces Mu, Cp, Cbp, Cr and Lq, respectively. I show that the double sequence spaces B(Mu), B(Cbp) and B(Cr) are Banach spaces under certain conditions. I give some inclusion relations together with some topological properties. Moreover, I determine the α-dual of the spaces B(Mu) and B(Cbp), the β(ϑ)-duals of the spaces B(Mu), B(Cp), B(Cbp), B(Cr) and B(Lq), where ϑ∈{p,bp,r}, and the γ-dual of the spaces B(Mu), B(Cbp) and B(Lq). Finally, I characterize the classes of four-dimensional matrix mappings defined on the spaces B(Mu), B(Cp), B(Cbp), B(Cr) and B(Lq) of double sequences.

Tishk International University Repository

Crossref

Directory of Open Access Journals