Learning Dynamics and Reinforcement in Stochastic Games
The theory of Reinforcement Learning provides learning algorithms that are guaranteed
to converge to optimal behavior in single-agent learning environments. While these algorithms often do not scale well to large problems without modification, a vast amount of
recent research has combined them with function approximators with remarkable success
in a diverse range of large-scale and complex problems. Motivated by this success in
single-agent learning environments, the first half of this work aims to study convergent
learning algorithms in multi-agent environments. The theory of multi-agent learning is
itself a rich subject; classically, however, it has been confined to learning in iterated games
with no state dynamics. In contrast, this work examines learning in stochastic
games, where agents play one another in a temporally extended game that has nontrivial
state dynamics. We do so by first defining two classes of stochastic games: Stochastic
Potential Games (SPGs) and Global Stochastic Potential Games (GSPGs). We show that
both games admit pure Nash equilibria, as well as further refinements of their equilibrium
sets. We discuss possible applications of these games in the context of congestion and
traffic routing scenarios. Finally, we define learning algorithms that
1. converge to pure Nash equilibria and
2. converge to further refinements of Nash equilibria.
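A toy illustration of the mechanism behind result 1, sketched for a static (one-shot) congestion game rather than the stochastic games studied in the thesis: because congestion games admit an exact (Rosenthal) potential, best-response dynamics must terminate at a pure Nash equilibrium. The two-road network and its costs below are hypothetical.

```python
# Hypothetical static congestion game: two players each pick one of two roads;
# a road's cost grows with its load. cost[road][load] gives the per-player cost.
cost = {1: [0, 2, 5], 2: [0, 1, 4]}

def potential(profile):
    # Rosenthal potential: for each road, sum cost(1) + ... + cost(load).
    phi = 0
    for road in (1, 2):
        load = sum(1 for c in profile if c == road)
        phi += sum(cost[road][k] for k in range(1, load + 1))
    return phi

def player_cost(profile, i):
    road = profile[i]
    load = sum(1 for c in profile if c == road)
    return cost[road][load]

def best_response_dynamics(profile):
    # Each improving deviation strictly decreases the potential, so this loop
    # must terminate -- and it can only stop at a pure Nash equilibrium.
    profile = list(profile)
    improved = True
    while improved:
        improved = False
        for i in range(len(profile)):
            for road in (1, 2):
                trial = profile[:]
                trial[i] = road
                if player_cost(trial, i) < player_cost(profile, i):
                    profile, improved = trial, True
    return tuple(profile)

eq = best_response_dynamics((1, 1))
# At eq, no player can lower its own cost by unilaterally switching roads.
```

Starting from both players on road 1, one player switches to the cheaper road and the dynamics stop: the potential drops with every deviation, which is exactly the structure SPGs and GSPGs generalise to the stochastic setting.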
In the final chapter we combine a simple type of multi-agent learning - individual
Q-learning - with neural networks in order to solve a large-scale vehicle routing and
assignment problem. Individual Q-learning is a heuristic learning algorithm that, even in
small multi-agent problems, does not provide convergence guarantees. Nonetheless, we
observe good performance of this algorithm in this setting.
PhD thesis, Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/155158/1/johnholl_1.pd
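A minimal sketch of the individual (independent) Q-learning heuristic described above, under the usual reading of that term: each agent keeps a Q-table over its own actions only and treats the other agents as part of the environment, so no convergence guarantee applies. The two-agent coordination payoff is hypothetical, not the thesis's vehicle-routing problem.

```python
import random

random.seed(0)
N_AGENTS, N_ACTIONS = 2, 2
ALPHA, EPS = 0.1, 0.1  # learning rate and exploration probability

# Hypothetical stateless coordination game: reward 1 iff all agents match.
def reward(actions):
    return 1.0 if len(set(actions)) == 1 else 0.0

Q = [[0.0] * N_ACTIONS for _ in range(N_AGENTS)]

for _ in range(5000):
    acts = []
    for i in range(N_AGENTS):
        if random.random() < EPS:                    # explore
            acts.append(random.randrange(N_ACTIONS))
        else:                                        # exploit own Q-values only
            acts.append(max(range(N_ACTIONS), key=Q[i].__getitem__))
    r = reward(acts)
    for i, a in enumerate(acts):                     # each agent updates alone,
        Q[i][a] += ALPHA * (r - Q[i][a])             # ignoring the others

greedy = [max(range(N_ACTIONS), key=Q[i].__getitem__) for i in range(N_AGENTS)]
# In this run the agents typically lock onto a common action, illustrating the
# "good performance without guarantees" behaviour the abstract reports.
```

Each update is ordinary single-agent Q-learning; the multi-agent coupling enters only through the shared reward, which is what makes the method a heuristic rather than a provably convergent algorithm.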
Boltzmann meets Nash: Energy-efficient routing in optical networks under uncertainty
Motivated by the massive deployment of power-hungry data centers for service
provisioning, we examine the problem of routing in optical networks with the
aim of minimizing traffic-driven power consumption. To tackle this issue,
routing must take into account energy efficiency as well as capacity
considerations; moreover, in rapidly-varying network environments, this must be
accomplished in a real-time, distributed manner that remains robust in the
presence of random disturbances and noise. In view of this, we derive a pricing
scheme whose Nash equilibria coincide with the network's socially optimum
states, and we propose a distributed learning method based on the Boltzmann
distribution of statistical mechanics. Using tools from stochastic calculus, we
show that the resulting Boltzmann routing scheme exhibits remarkable
convergence properties under uncertainty: specifically, the long-term average
of the network's power consumption converges within ε of its minimum value in
time which is at most O(1/ε²),
irrespective of the fluctuations' magnitude; additionally, if the network
admits a strict, non-mixing optimum state, the algorithm converges to it -
again, no matter the noise level. Our analysis is supplemented by extensive
numerical simulations which show that Boltzmann routing can lead to a
significant decrease in power consumption over basic, shortest-path routing
schemes in realistic network conditions.
Comment: 24 pages, 4 figures
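The Boltzmann distribution at the heart of the proposed routing scheme can be sketched as a logit choice rule: a route is selected with probability proportional to exp(-cost / T), where T is a temperature parameter. The route costs and temperatures below are hypothetical, purely to show the rule's behaviour.

```python
import math

def boltzmann_weights(costs, T):
    # Selection probabilities proportional to exp(-cost / T).
    # Subtract the minimum cost first for numerical stability.
    m = min(costs)
    w = [math.exp(-(c - m) / T) for c in costs]
    s = sum(w)
    return [x / s for x in w]

costs = [3.0, 5.0, 9.0]                   # hypothetical per-route power costs
hot = boltzmann_weights(costs, T=10.0)    # high T: near-uniform exploration
cold = boltzmann_weights(costs, T=0.1)    # low T: mass on the cheapest route
# As T -> 0 the rule approaches deterministic least-cost (shortest-path)
# routing; at high T it keeps exploring, which is what lends the scheme its
# robustness to random disturbances and noise.
```

This trade-off between exploration and exploitation, controlled by a single temperature, is the statistical-mechanics ingredient the abstract refers to.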
Multi-Layer Cyber-Physical Security and Resilience for Smart Grid
The smart grid is a large-scale complex system that integrates communication
technologies with the physical-layer operation of energy systems. Security
and resilience mechanisms by design are important to guarantee reliable
operation of the system. This chapter provides a layered perspective of the
smart grid security and discusses game and decision theory as a tool to model
the interactions among system components and the interaction between attackers
and the system. We discuss game-theoretic applications and challenges in the
design of cross-layer robust and resilient controllers, secure network routing
protocols at the data communication and networking layers, and the challenges
of information security at the management layer of the grid. The chapter
concludes with future directions for using game-theoretic tools to address
multi-layer security issues in the smart grid.
Comment: 16 pages
Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey
Wireless sensor networks (WSNs) consist of autonomous and resource-limited
devices. The devices cooperate to monitor one or more physical phenomena within
an area of interest. WSNs operate as stochastic systems because of randomness
in the monitored environments. For long service time and low maintenance cost,
WSNs require adaptive and robust methods to address data exchange, topology
formulation, resource and power optimization, sensing coverage and object
detection, and security challenges. In these problems, sensor nodes must make
optimized decisions from a set of available strategies to achieve design
goals. This survey reviews numerous applications of the Markov decision process
(MDP) framework, a powerful decision-making tool to develop adaptive algorithms
and protocols for WSNs. Furthermore, various solution methods are discussed and
compared to serve as a guide for using MDPs in WSNs.
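A minimal sketch of the MDP machinery the survey covers, applied to a hypothetical sensor-node energy problem and solved by standard value iteration. All states, rewards, and transition outcomes below are invented for illustration; real WSN models would be stochastic and much larger.

```python
GAMMA = 0.9
STATES = [0, 1, 2]               # battery level: empty, low, full
ACTIONS = ["sleep", "transmit"]

def step(s, a):
    # Returns a list of (probability, next_state, reward) outcomes.
    if a == "sleep":
        return [(1.0, min(s + 1, 2), 0.0)]   # recharge one battery level
    if s == 0:
        return [(1.0, 0, -1.0)]              # no energy: transmission fails
    return [(1.0, s - 1, 1.0)]               # successful, energy-costly send

# Value iteration: repeatedly apply the Bellman optimality update.
V = {s: 0.0 for s in STATES}
for _ in range(200):
    V = {s: max(sum(p * (r + GAMMA * V[s2]) for p, s2, r in step(s, a))
                for a in ACTIONS)
         for s in STATES}

# Extract the greedy policy from the converged values.
policy = {s: max(ACTIONS,
                 key=lambda a: sum(p * (r + GAMMA * V[s2])
                                   for p, s2, r in step(s, a)))
          for s in STATES}
# The optimal policy here is intuitive: an empty node sleeps to recharge,
# while a charged node transmits.
```

The same Bellman-update template underlies the other MDP solution methods the survey compares (policy iteration, Q-learning, and their approximate variants), which differ mainly in how the update is estimated and scheduled.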