Multi-objective Reinforcement Learning

Author: Ruiz-Montiel Manuela
Publication date: 25/09/2013

In this talk we present PQ-learning, a new Reinforcement Learning (RL) algorithm that determines the rational behaviours of an agent in multi-objective domains. This work is partially funded by grant TIN2009-14179 (Spanish Government, Plan Nacional de I+D+i) and by Universidad de Málaga, Campus de Excelencia Internacional Andalucía Tech. Manuela Ruiz-Montiel is funded by the Spanish Ministry of Education through the National F.P.U. Progra

Repositorio Institucional Universidad de Málaga

Using Collective Intelligence to Route Internet Traffic

Authors: Frank Jeremy; Tumer Kagan; Wolpert David H.
Publication date: 10/05/1999

A COllective INtelligence (COIN) is a set of interacting reinforcement learning (RL) algorithms designed in an automated fashion so that their collective behavior optimizes a global utility function. We summarize the theory of COINs, then present experiments using that theory to design COINs to control internet traffic routing. These experiments indicate that COINs outperform all previously investigated RL-based, shortest path routing algorithms.

arXiv.org e-Print Archive

CiteSeerX

NASA Technical Reports Server

Difference of Convex Functions Programming Applied to Control with Expert Data

Authors: Geist Matthieu; Pietquin Olivier; Piot Bilal
Publication date: 05/09/2016

This paper reports applications of Difference of Convex functions (DC) programming to Learning from Demonstrations (LfD) and Reinforcement Learning (RL) with expert data. This is made possible because the norm of the Optimal Bellman Residual (OBR), which is at the heart of many RL and LfD algorithms, is DC. Improvement in performance is demonstrated on two specific algorithms, namely Reward-regularized Classification for Apprenticeship Learning (RCAL) and Reinforcement Learning with Expert Demonstrations (RLED), through experiments on generic Markov Decision Processes (MDPs) called Garnets.

arXiv.org e-Print Archive

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

HAL-Rennes 1

Ensemble Kalman Filter (EnKF) for Reinforcement Learning (RL)

Authors: Joshi Anant; Mehta Prashant G.; Taghvaei Amirhossein
Publication date: 02/07/2021

This paper is concerned with the problem of representing and learning the optimal control law for the linear quadratic Gaussian (LQG) optimal control problem. In recent years, there has been growing interest in revisiting this classical problem, in part due to the successes of reinforcement learning (RL). The main goal of this body of research (and also of our paper) is to approximate the optimal control law without explicitly solving the Riccati equation. For this purpose, a novel simulation-based algorithm, namely an ensemble Kalman filter (EnKF), is introduced in this paper. The algorithm is used to obtain formulae for optimal control, expressed entirely in terms of the EnKF particles. For the general partially observed LQG problem, the proposed EnKF is combined with a standard EnKF (for the estimation problem) to obtain the optimal control input based on the use of the separation principle. A nonlinear extension of the algorithm is also discussed, which clarifies the duality roots of the proposed EnKF. The theoretical results and algorithms are illustrated with numerical experiments.

arXiv.org e-Print Archive