Distributed reinforcement learning for a traffic engineering application

Abstract

In this paper, we report on novel reinforcement learning techniques applied to a real-world application. The problem domain, a traffic engineering application, is formulated as a distributed reinforcement learning problem, where the returns of many agents simultaneously update a single shared policy. Learning occurs off-line in a traffic simulator, which allows us to retrieve and exploit good transient policies even in the presence of instabilities in the learning. We introduce two new algorithms developed for this situation, one that is value-function based, and one that employs a direct policy evaluation approach. While the latter is theoretically better motivated in several ways than the former, we find both perform comparably well in this domain and for the formulation we use.
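The "many agents, single shared policy" formulation can be illustrated with a minimal sketch. This is not the paper's algorithm: the toy simulator, reward, state space, and hyperparameters below are all assumptions for illustration. It shows the value-function-based flavor of the idea, where each agent's rollout in a simulated environment writes one-step Q-learning updates into a single shared table, and learning happens entirely off-line against the simulator.

```python
import random

N_STATES, N_ACTIONS = 5, 2
ALPHA, GAMMA = 0.2, 0.9

# One shared Q-table: every agent's returns update this single policy.
shared_q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def step(state, action):
    # Hypothetical simulator (stand-in for the traffic simulator):
    # action 1 moves toward the goal state, which pays reward 1.
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

def agent_episode(rng, eps):
    # One agent's rollout; its experience updates the shared table in place.
    state = 0
    for _ in range(20):
        if rng.random() < eps:
            action = rng.randrange(N_ACTIONS)
        else:
            action = max(range(N_ACTIONS), key=lambda a: shared_q[state][a])
        nxt, reward = step(state, action)
        # One-step Q-learning update applied to the shared values.
        target = reward + GAMMA * max(shared_q[nxt])
        shared_q[state][action] += ALPHA * (target - shared_q[state][action])
        state = nxt

rng = random.Random(0)
for episode in range(500):                  # many agents, run off-line
    eps = max(0.1, 1.0 - episode / 100)     # decaying exploration
    agent_episode(rng, eps)

# Greedy action per state under the learned shared policy.
greedy = [max(range(N_ACTIONS), key=lambda a: shared_q[s][a]) for s in range(N_STATES)]
print(greedy)
```

Because every agent writes into the same table, updates from one agent immediately shape the behavior of the others; running the whole loop in a simulator is what makes it possible to snapshot and keep good transient policies even if learning later destabilizes.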

Last time updated on 28/10/2017

This paper was published in CiteSeerX.
