Learning Dynamics and Reinforcement in Stochastic Games
The theory of Reinforcement Learning provides learning algorithms that are guaranteed
to converge to optimal behavior in single-agent learning environments. While these algorithms often do not scale well to large problems without modification, a vast amount of
recent research has combined them with function approximators with remarkable success
in a diverse range of large-scale and complex problems. Motivated by this success in
single-agent learning environments, the first half of this work aims to study convergent
learning algorithms in multi-agent environments. The theory of multi-agent learning is
itself a rich subject; classically, however, it has been confined to learning in iterated games
with no state dynamics. In contrast, this work examines learning in stochastic
games, where agents play one another in a temporally extended game that has nontrivial
state dynamics. We do so by first defining two classes of stochastic games: Stochastic
Potential Games (SPGs) and Global Stochastic Potential Games (GSPGs). We show that
both games admit pure Nash equilibria, as well as further refinements of their equilibrium
sets. We discuss possible applications of these games in the context of congestion and
traffic routing scenarios. Finally, we define learning algorithms that
1. converge to pure Nash equilibria and
2. converge to further refinements of Nash equilibria.
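A toy illustration of the mechanism behind result 1, sketched for a static (one-shot) congestion game rather than the stochastic games studied in the thesis: because congestion games admit an exact (Rosenthal) potential, best-response dynamics must terminate at a pure Nash equilibrium. The two-road network and its costs below are hypothetical.

```python
# Hypothetical static congestion game: two players each pick one of two roads;
# a road's cost grows with its load. cost[road][load] gives the per-player cost.
cost = {1: [0, 2, 5], 2: [0, 1, 4]}

def potential(profile):
    # Rosenthal potential: for each road, sum cost(1) + ... + cost(load).
    phi = 0
    for road in (1, 2):
        load = sum(1 for c in profile if c == road)
        phi += sum(cost[road][k] for k in range(1, load + 1))
    return phi

def player_cost(profile, i):
    road = profile[i]
    load = sum(1 for c in profile if c == road)
    return cost[road][load]

def best_response_dynamics(profile):
    # Each improving deviation strictly decreases the potential, so this loop
    # must terminate -- and it can only stop at a pure Nash equilibrium.
    profile = list(profile)
    improved = True
    while improved:
        improved = False
        for i in range(len(profile)):
            for road in (1, 2):
                trial = profile[:]
                trial[i] = road
                if player_cost(trial, i) < player_cost(profile, i):
                    profile, improved = trial, True
    return tuple(profile)

eq = best_response_dynamics((1, 1))
# At eq, no player can lower its own cost by unilaterally switching roads.
```

Starting from both players on road 1, one player switches to the cheaper road and the dynamics stop: the potential drops with every deviation, which is exactly the structure SPGs and GSPGs generalise to the stochastic setting.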
In the final chapter we combine a simple type of multi-agent learning - individual
Q-learning - with neural networks in order to solve a large-scale vehicle routing and
assignment problem. Individual Q-learning is a heuristic learning algorithm that, even in
small multi-agent problems, does not provide convergence guarantees. Nonetheless, we
observe good performance of this algorithm in this setting.
PhD thesis, Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/155158/1/johnholl_1.pd
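A minimal sketch of the individual (independent) Q-learning heuristic described above, under the usual reading of that term: each agent keeps a Q-table over its own actions only and treats the other agents as part of the environment, so no convergence guarantee applies. The two-agent coordination payoff is hypothetical, not the thesis's vehicle-routing problem.

```python
import random

random.seed(0)
N_AGENTS, N_ACTIONS = 2, 2
ALPHA, EPS = 0.1, 0.1  # learning rate and exploration probability

# Hypothetical stateless coordination game: reward 1 iff all agents match.
def reward(actions):
    return 1.0 if len(set(actions)) == 1 else 0.0

Q = [[0.0] * N_ACTIONS for _ in range(N_AGENTS)]

for _ in range(5000):
    acts = []
    for i in range(N_AGENTS):
        if random.random() < EPS:                    # explore
            acts.append(random.randrange(N_ACTIONS))
        else:                                        # exploit own Q-values only
            acts.append(max(range(N_ACTIONS), key=Q[i].__getitem__))
    r = reward(acts)
    for i, a in enumerate(acts):                     # each agent updates alone,
        Q[i][a] += ALPHA * (r - Q[i][a])             # ignoring the others

greedy = [max(range(N_ACTIONS), key=Q[i].__getitem__) for i in range(N_AGENTS)]
# In this run the agents typically lock onto a common action, illustrating the
# "good performance without guarantees" behaviour the abstract reports.
```

Each update is ordinary single-agent Q-learning; the multi-agent coupling enters only through the shared reward, which is what makes the method a heuristic rather than a provably convergent algorithm.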
Boltzmann meets Nash: Energy-efficient routing in optical networks under uncertainty
Motivated by the massive deployment of power-hungry data centers for service
provisioning, we examine the problem of routing in optical networks with the
aim of minimizing traffic-driven power consumption. To tackle this issue,
routing must take into account energy efficiency as well as capacity
considerations; moreover, in rapidly-varying network environments, this must be
accomplished in a real-time, distributed manner that remains robust in the
presence of random disturbances and noise. In view of this, we derive a pricing
scheme whose Nash equilibria coincide with the network's socially optimum
states, and we propose a distributed learning method based on the Boltzmann
distribution of statistical mechanics. Using tools from stochastic calculus, we
show that the resulting Boltzmann routing scheme exhibits remarkable
convergence properties under uncertainty: specifically, the long-term average
of the network's power consumption converges within ε of its minimum value in
time which is at most O(1/ε²),
irrespective of the fluctuations' magnitude; additionally, if the network
admits a strict, non-mixing optimum state, the algorithm converges to it -
again, no matter the noise level. Our analysis is supplemented by extensive
numerical simulations which show that Boltzmann routing can lead to a
significant decrease in power consumption over basic, shortest-path routing
schemes in realistic network conditions.
Comment: 24 pages, 4 figures
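The Boltzmann distribution at the heart of the proposed routing scheme can be sketched as a logit choice rule: a route is selected with probability proportional to exp(-cost / T), where T is a temperature parameter. The route costs and temperatures below are hypothetical, purely to show the rule's behaviour.

```python
import math

def boltzmann_weights(costs, T):
    # Selection probabilities proportional to exp(-cost / T).
    # Subtract the minimum cost first for numerical stability.
    m = min(costs)
    w = [math.exp(-(c - m) / T) for c in costs]
    s = sum(w)
    return [x / s for x in w]

costs = [3.0, 5.0, 9.0]                   # hypothetical per-route power costs
hot = boltzmann_weights(costs, T=10.0)    # high T: near-uniform exploration
cold = boltzmann_weights(costs, T=0.1)    # low T: mass on the cheapest route
# As T -> 0 the rule approaches deterministic least-cost (shortest-path)
# routing; at high T it keeps exploring, which is what lends the scheme its
# robustness to random disturbances and noise.
```

This trade-off between exploration and exploitation, controlled by a single temperature, is the statistical-mechanics ingredient the abstract refers to.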
Multi-Layer Cyber-Physical Security and Resilience for Smart Grid
The smart grid is a large-scale complex system that integrates communication
technologies with the physical-layer operation of energy systems. Security
and resilience mechanisms by design are important to guarantee reliable
operation of the system. This chapter provides a layered perspective of the
smart grid security and discusses game and decision theory as a tool to model
the interactions among system components and the interaction between attackers
and the system. We discuss game-theoretic applications and challenges in the
design of cross-layer robust and resilient controllers, secure network routing
protocols at the data communication and networking layers, and the challenges
of information security at the management layer of the grid. The chapter
concludes with future directions for using game-theoretic tools to address
multi-layer security issues in the smart grid.
Comment: 16 pages
Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey
Wireless sensor networks (WSNs) consist of autonomous and resource-limited
devices. The devices cooperate to monitor one or more physical phenomena within
an area of interest. WSNs operate as stochastic systems because of randomness
in the monitored environments. For long service time and low maintenance cost,
WSNs require adaptive and robust methods to address data exchange, topology
formulation, resource and power optimization, sensing coverage and object
detection, and security challenges. In these problems, sensor nodes must make
optimized decisions from a set of available strategies to achieve design
goals. This survey reviews numerous applications of the Markov decision process
(MDP) framework, a powerful decision-making tool to develop adaptive algorithms
and protocols for WSNs. Furthermore, various solution methods are discussed and
compared to serve as a guide for using MDPs in WSNs.
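A minimal sketch of the MDP machinery the survey covers, applied to a hypothetical sensor-node energy problem and solved by standard value iteration. All states, rewards, and transition outcomes below are invented for illustration; real WSN models would be stochastic and much larger.

```python
GAMMA = 0.9
STATES = [0, 1, 2]               # battery level: empty, low, full
ACTIONS = ["sleep", "transmit"]

def step(s, a):
    # Returns a list of (probability, next_state, reward) outcomes.
    if a == "sleep":
        return [(1.0, min(s + 1, 2), 0.0)]   # recharge one battery level
    if s == 0:
        return [(1.0, 0, -1.0)]              # no energy: transmission fails
    return [(1.0, s - 1, 1.0)]               # successful, energy-costly send

# Value iteration: repeatedly apply the Bellman optimality update.
V = {s: 0.0 for s in STATES}
for _ in range(200):
    V = {s: max(sum(p * (r + GAMMA * V[s2]) for p, s2, r in step(s, a))
                for a in ACTIONS)
         for s in STATES}

# Extract the greedy policy from the converged values.
policy = {s: max(ACTIONS,
                 key=lambda a: sum(p * (r + GAMMA * V[s2])
                                   for p, s2, r in step(s, a)))
          for s in STATES}
# The optimal policy here is intuitive: an empty node sleeps to recharge,
# while a charged node transmits.
```

The same Bellman-update template underlies the other MDP solution methods the survey compares (policy iteration, Q-learning, and their approximate variants), which differ mainly in how the update is estimated and scheduled.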