4,520 research outputs found

    An MDP decomposition approach for traffic control at isolated signalized intersections

    This article presents a novel approach for the dynamic control of a signalized intersection. At the intersection, there are a number of arrival flows of cars, each having a single queue (lane). The set of all flows is partitioned into disjoint combinations of nonconflicting flows that will receive green together. The dynamic control of the traffic lights is based on the numbers of cars waiting in the queues. The problem of when to switch (and which combination to serve next) is modeled as a Markovian decision process in discrete time. For large intersections (i.e., intersections with a large number of flows), the number of states becomes tremendously large, prohibiting straightforward optimization using value iteration or policy iteration. Starting from an optimal (or nearly optimal) fixed-cycle strategy, a one-step policy improvement is proposed that is easy to compute and is shown to give a close to optimal strategy for the dynamic proble
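The one-step policy improvement idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the arrival probability, the absence of switching times, and the `base_value` function (a sum of squared queue lengths standing in for the value of a fixed-cycle strategy) are all assumptions made for illustration.

```python
from itertools import product

ARRIVAL_PROB = 0.3  # assumed per-slot arrival probability, identical for every queue


def transitions(state, action):
    """Yield (next_state, probability): serve one car from queue `action`, then add random arrivals."""
    served = list(state)
    if served[action] > 0:
        served[action] -= 1
    # Each queue independently receives 0 or 1 new car in the slot.
    for arrivals in product((0, 1), repeat=len(served)):
        prob = 1.0
        for a in arrivals:
            prob *= ARRIVAL_PROB if a else 1.0 - ARRIVAL_PROB
        yield tuple(q + a for q, a in zip(served, arrivals)), prob


def base_value(state):
    """Hypothetical stand-in for the base (fixed-cycle) policy's value: sum of squared queue lengths."""
    return sum(q * q for q in state)


def one_step_improvement(state):
    """Choose the flow that minimises immediate cost plus expected base-policy value of the next state."""
    def q_value(action):
        immediate = sum(state)  # cars waiting during this slot
        expected = sum(p * base_value(s2) for s2, p in transitions(state, action))
        return immediate + expected

    return min(range(len(state)), key=q_value)
```

With queue lengths `(5, 1)` the sketch serves flow 0, and with `(0, 3)` it serves flow 1, because serving the longer queue lowers the expected value of the next state under the assumed `base_value`.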

    Adaptive traffic signal control using approximate dynamic programming

    This paper presents a study on an adaptive traffic signal controller for real-time operation. The controller aims for three operational objectives: dynamic allocation of green time, automatic adjustment of control parameters, and fast revision of signal plans. The control algorithm is built on approximate dynamic programming (ADP). This approach substantially reduces the computational burden by approximating the value function of the dynamic program and using reinforcement learning to update that approximation. We investigate temporal-difference learning and perturbation learning as specific learning techniques for the ADP approach. We find in computer simulation that the ADP controllers achieve a substantial reduction in vehicle delays in comparison with optimised fixed-time plans. Our results show that substantial benefits can be gained by increasing the frequency at which the signal plans are revised, which can be achieved conveniently using the ADP approach.
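As a rough illustration of how temporal-difference learning can update a value-function approximation, here is a minimal TD(0) sketch. The linear form of the approximation, the feature vectors (for example, queue lengths), and the parameter names `alpha` and `gamma` are assumptions for illustration and are not taken from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def td0_update(weights, features, next_features, cost, alpha=0.05, gamma=0.95):
    """Return new weights after one TD(0) step for an assumed linear model V(s) = w . phi(s).

    `cost` is the one-step cost (e.g. vehicle delay) observed on the transition.
    """
    td_error = cost + gamma * dot(weights, next_features) - dot(weights, features)
    # Move each weight along the gradient of V(s), which for a linear model is phi(s).
    return [w + alpha * td_error * f for w, f in zip(weights, features)]
```

Starting from zero weights, a single transition with features `[1.0, 2.0]`, cost `1.0`, and `alpha=0.1` moves the weights to approximately `[0.1, 0.2]`.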

    Abstractions of stochastic hybrid systems

    Many control systems have large or infinite state spaces that cannot be easily abstracted. One method to analyse and verify these systems is reachability analysis, which is frequently used for air traffic control and power plants. Because of incomplete information about the environment or unpredicted changes, the stochastic approach is a viable alternative. In this paper, different ways of introducing reachability under uncertainty are presented. A new concept of stochastic bisimulation is introduced and its connection with reachability analysis is established. The work is mainly motivated by safety-critical situations in air traffic control (like collision detection and avoidance), and the formal tools are based on stochastic analysis.

    Traffic Light Control Using Deep Policy-Gradient and Value-Function Based Reinforcement Learning

    Recent advances in combining deep neural network architectures with reinforcement learning techniques have shown promising results in solving complex control problems with high-dimensional state and action spaces. Inspired by these successes, in this paper we build two kinds of reinforcement learning agents, a deep policy-gradient agent and a value-function based agent, which can predict the best possible traffic signal for a traffic intersection. At each time step, these adaptive traffic light control agents receive a snapshot of the current state of a graphical traffic simulator and produce control signals. The policy-gradient based agent maps its observation directly to the control signal, whereas the value-function based agent first estimates values for all legal control signals and then selects the control action with the highest value. Our methods show promising results in a traffic network simulated in the SUMO traffic simulator, without suffering from instability issues during the training process.
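The difference between the two action-selection schemes described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the inputs `q_values` and `logits` stand in for the outputs of the deep networks, which are not modelled here.

```python
import math
import random


def select_value_based(q_values, legal_actions):
    """Value-function agent: pick the legal action with the highest estimated value."""
    return max(legal_actions, key=lambda a: q_values[a])


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_policy_based(logits, rng):
    """Policy-gradient agent: sample an action from the policy's output distribution."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

For example, `select_value_based([0.2, 1.5, 0.9], legal_actions=[0, 2])` ignores the illegal action 1 and returns 2, while `select_policy_based(logits, random.Random(0))` draws an action index at random in proportion to the softmax probabilities.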