4 research outputs found

    Distributed Recharging Rate Control for Energy Demand Management of Electric Vehicles

    No full text

    Distributed dynamic reinforcement of efficient outcomes in multiagent coordination and network formation

    We analyze reinforcement learning under so-called “dynamic reinforcement”. In reinforcement learning, each agent repeatedly interacts with an unknown environment (i.e., other agents), receives a reward, and updates the probabilities of its next action based on its own previous actions and received rewards. Unlike standard reinforcement learning, dynamic reinforcement uses a combination of long-term rewards and recent rewards to construct myopically forward-looking action selection probabilities. We analyze the long-term stability of the learning dynamics for general games with pure strategy Nash equilibria and specialize the results to coordination games and distributed network formation. In this class of problems, more than one stable equilibrium (i.e., coordination configuration) may exist. We demonstrate equilibrium selection under dynamic reinforcement. In particular, we show how a single agent is able to destabilize an equilibrium in favor of another by appropriately adjusting its dynamic reinforcement parameters. We contrast these conclusions with prior game-theoretic results, according to which the risk-dominant equilibrium is the only robust equilibrium when agents' decisions are subject to small randomized perturbations. The analysis throughout is based on the ODE method for stochastic approximations, where a special form of perturbation in the learning dynamics allows for analyzing its behavior at the boundary points of the state space.
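    The abstract does not give the update rule itself, but the idea of combining a slowly updated long-term reward aggregate with a fast-moving recent-reward estimate can be sketched as follows. The class name, parameter names (lambda_long, lambda_recent, tau), and the softmax combination rule are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

# Illustrative sketch of a "dynamic reinforcement" learner: two reward
# aggregates with different step sizes are combined into action selection
# probabilities. All names and the exact combination rule are assumptions.

rng = np.random.default_rng(0)

class DynamicReinforcementAgent:
    def __init__(self, n_actions, lambda_long=0.01, lambda_recent=0.5, tau=0.1):
        self.n_actions = n_actions
        self.long_term = np.zeros(n_actions)   # slowly updated long-term reward aggregate
        self.recent = np.zeros(n_actions)      # fast-moving recent-reward estimate
        self.lambda_long = lambda_long          # small step size (long-term component)
        self.lambda_recent = lambda_recent      # large step size (recent component)
        self.tau = tau                          # softmax temperature

    def action_probabilities(self):
        # Combine the long-term and recent estimates into myopically
        # forward-looking selection probabilities via a softmax.
        score = self.long_term + self.recent
        z = np.exp((score - score.max()) / self.tau)
        return z / z.sum()

    def act(self):
        return rng.choice(self.n_actions, p=self.action_probabilities())

    def update(self, action, reward):
        # Stochastic-approximation style updates with two time scales;
        # adjusting the two step sizes changes which equilibria are stable.
        self.long_term[action] += self.lambda_long * (reward - self.long_term[action])
        self.recent[action] += self.lambda_recent * (reward - self.recent[action])

# Usage: one interaction round.
agent = DynamicReinforcementAgent(n_actions=2)
a = agent.act()
agent.update(a, reward=1.0)
```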

    Learning Near-Pareto-Optimal Conventions in Polynomial Time

    No full text
    We study how to learn to play a Pareto-optimal strict Nash equilibrium when there exist multiple equilibria and agents may have different preferences among the equilibria. We focus on repeated coordination games of non-identical interest where agents do not know the game structure up front and receive noisy payoffs. We design efficient near-optimal algorithms for both the perfect monitoring and the imperfect monitoring setting (where the agents only observe their own payoffs and the joint actions).
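    The paper's algorithms are not reproduced in this listing; as a rough illustration of the setting only, the sketch below sets up a 2x2 coordination game of non-identical interest with noisy payoffs, where each agent keeps empirical payoff estimates per joint action (possible when joint actions are observed). The epsilon-greedy rule and all names are assumptions for illustration, not the paper's method.

```python
import numpy as np

# Sketch of the setting: a repeated 2x2 coordination game with two strict
# Nash equilibria, (0,0) and (1,1), noisy payoffs, and agents who prefer
# different equilibria (non-identical interest).

rng = np.random.default_rng(1)

payoff = {
    0: np.array([[4.0, 0.0], [0.0, 3.0]]),  # agent 0's mean payoffs (prefers (0,0))
    1: np.array([[3.0, 0.0], [0.0, 4.0]]),  # agent 1's mean payoffs (prefers (1,1))
}

estimates = {i: np.zeros((2, 2)) for i in (0, 1)}  # empirical payoff estimates
counts = {i: np.ones((2, 2)) for i in (0, 1)}
eps = 0.1  # exploration rate (an illustrative choice)

for t in range(5000):
    actions = []
    for i in (0, 1):
        if rng.random() < eps:
            actions.append(rng.integers(2))  # explore
        else:
            # Play own coordinate of the empirically best joint action.
            best = np.unravel_index(estimates[i].argmax(), (2, 2))
            actions.append(best[i])
    a = tuple(actions)
    for i in (0, 1):
        r = payoff[i][a] + rng.normal(scale=0.5)  # noisy payoff observation
        counts[i][a] += 1
        estimates[i][a] += (r - estimates[i][a]) / counts[i][a]  # running mean

print("agent 0's empirically best joint action:",
      np.unravel_index(estimates[0].argmax(), (2, 2)))
```

    Note that this naive rule can miscoordinate, since each agent gravitates toward its own preferred equilibrium; that tension is exactly what the paper's convention-learning algorithms are designed to resolve.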