Search CORE

10,257 research outputs found

Mean-Field-Type Games in Engineering

Author: Djehiche Boualem
Tcheukam Alain
Tembine Hamidou
Publication venue
Publication date: 01/11/2017
Field of study

A mean-field-type game is a game in which the instantaneous payoffs and/or the state dynamics functions involve not only the state and the action profile but also the joint distributions of state-action pairs. This article presents some engineering applications of mean-field-type games including road traffic networks, multi-level building evacuation, millimeter wave wireless communications, distributed power networks, virus spread over networks, virtual machine resource management in cloud networks, synchronization of oscillators, energy-efficient buildings, online meeting and mobile crowdsensing.Comment: 84 pages, 24 figures, 183 references. to appear in AIMS 201

arXiv.org e-Print Archive

Directory of Open Access Journals

Reinforcement Learning: A Survey

Author: Kaelbling L. P.
Littman M. L.
Moore A. W.
Publication venue
Publication date: 01/01/1996
Field of study

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word ``reinforcement.'' The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.Comment: See http://www.jair.org/ for any accompanying file

arXiv.org e-Print Archive

CiteSeerX

A stochastic approximation algorithm for stochastic semidefinite programming

Author: Gaujal Bruno
Mertikopoulos Panayotis
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 07/07/2015
Field of study

Motivated by applications to multi-antenna wireless networks, we propose a distributed and asynchronous algorithm for stochastic semidefinite programming. This algorithm is a stochastic approximation of a continous- time matrix exponential scheme regularized by the addition of an entropy-like term to the problem's objective function. We show that the resulting algorithm converges almost surely to an

\varepsilon

-approximation of the optimal solution requiring only an unbiased estimate of the gradient of the problem's stochastic objective. When applied to throughput maximization in wireless multiple-input and multiple-output (MIMO) systems, the proposed algorithm retains its convergence properties under a wide array of mobility impediments such as user update asynchronicities, random delays and/or ergodically changing channels. Our theoretical analysis is complemented by extensive numerical simulations which illustrate the robustness and scalability of the proposed method in realistic network conditions.Comment: 25 pages, 4 figure

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Slow Learners are Fast

Author: Langford John
Smola Alexander
Zinkevich Martin
Publication venue
Publication date: 01/01/2009
Field of study

Online learning algorithms have impressive convergence properties when it comes to risk minimization and convex games on very large problems. However, they are inherently sequential in their design which prevents them from taking advantage of modern multi-core architectures. In this paper we prove that online learning with delayed updates converges well, thereby facilitating parallel online learning.Comment: Extended version of conference paper - NIPS 200

arXiv.org e-Print Archive

CiteSeerX