Deep Q-learning: a robust control approach
This work aims at constructing a bridge between robust control theory and reinforcement learning. Although reinforcement learning has shown admirable results in complex control tasks, the agent’s learning behaviour is opaque. Meanwhile, system theory has several tools for analyzing and controlling dynamical systems. This paper places deep Q-learning into a control-oriented perspective to study its learning dynamics with well-established techniques from robust control. An uncertain linear time-invariant model is formulated by means of the neural tangent kernel to describe learning. This novel approach allows giving conditions for stability (convergence) of the learning and enables the analysis of the agent’s behaviour in the frequency domain. The control-oriented approach makes it possible to formulate robust controllers that inject dynamical rewards as control input in the loss function to achieve better convergence properties. Three output-feedback controllers are synthesized: gain scheduling H2, dynamical Hinf, and fixed-structure Hinf controllers. Compared to traditional deep Q-learning techniques, which involve several heuristics, setting up the learning agent with a control-oriented tuning methodology is more transparent and has well-established literature. The proposed approach uses neither a target network nor a randomized replay memory. The role of the target network is taken over by the control input, which also exploits the temporal dependency of samples (as opposed to a randomized memory buffer). Numerical simulations in different OpenAI Gym environments suggest that the Hinf controlled learning can converge faster and receive higher scores (depending on the environment) compared to the benchmark Double deep Q-learning.
Public transport trajectory planning with probabilistic guarantees
The paper proposes an eco-cruise control strategy for urban public transport buses. The aim of the velocity control is to ensure timetable adherence while considering upstream queue lengths at traffic lights in a probabilistic way. The contribution of the paper is twofold. First, the shockwave profile model (SPM) is extended to capture the stochastic nature of traffic queue lengths. The model is adequate to describe frequent traffic state interruptions at signalized intersections. Based on the distribution function of stochastic traffic volume demand, the randomness in queue length, wave fronts, and vehicle numbers is derived. Then, an outlook is provided on its applicability as a full-scale urban traffic network model. Second, a shrinking horizon model predictive controller (MPC) is proposed for ensuring timetable reliability. The intention is to calculate optimal velocity commands based on the current position and desired arrival time of the bus while considering upcoming delays due to red signals and eventual queues. The above proposed stochastic traffic model is incorporated in a rolling horizon optimization via chance-constraining. In the optimization, probabilistic guarantees are formulated to minimize delay due to standstill in queues at signalized intersections. Optimization results are analyzed from two perspectives: (i) feasibility and (ii) closed-loop performance. The novel stochastic profile model is tested in a high fidelity traffic simulator context. Comparative simulation results show the viability and importance of stochastic bounds in urban trajectory design. The proposed algorithm yields smoother bus trajectories at an urban corridor, suggesting energy savings compared to benchmark control strategies.
Short-term traffic prediction using physics-aware neural networks
In this work, we propose an algorithm performing short-term predictions of the flow and speed of vehicles on a stretch of road, using past measurements of these quantities. This algorithm is based on a physics-aware recurrent neural network. A discretization of a macroscopic traffic flow model (using the so-called Traffic Reaction Model) is embedded in the architecture of the network and yields traffic state estimations and predictions for the flow and speed of vehicles, which are physically constrained by the macroscopic traffic flow model and based on estimated and predicted space-time dependent traffic parameters. These parameters are themselves obtained using a succession of LSTM recurrent neural networks. The algorithm is tested on raw flow measurements obtained from loop detectors.
A Modular, Adaptive, and Autonomous Transit System (MAATS): An In-motion Transfer Strategy and Performance Evaluation in Urban Grid Transit Networks
Dynamic traffic demand has been a longstanding challenge for conventional transit system design and operation. The recent development of autonomous vehicles (AVs) makes it increasingly realistic to develop the next generation of transportation systems with the potential to improve operational performance and flexibility. In this study, we propose an innovative transit system with autonomous modular buses (AMBs) that is adaptive to dynamic traffic demands and not restricted to fixed routes and timetables. A unique transfer operation, termed “in-motion transfer”, is introduced in this paper to transfer passengers between coupled modular buses in motion. A two-stage model is developed to facilitate in-motion transfer operations by optimally designing passenger transfer plans and AMB trajectories at intersections. In the proposed AMB system, all passengers can travel along the shortest path smoothly without having to actually alight and transfer between different bus lines. Numerical experiments demonstrate that the proposed transit system results in shorter travel time and a significantly reduced average number of transfers. While enjoying the above-mentioned benefits, the modular, adaptive, and autonomous transit system (MAATS) does not impose substantially higher energy consumption in comparison to the conventional bus system.
Emergency vehicle lane pre-clearing: From microscopic cooperation to routing decision making
Emergency vehicles (EVs) play a crucial role in providing timely help for the general public in saving lives and avoiding property loss. However, very few efforts have been made for EV prioritization on normal road segments, such as the road section between intersections or highways between ramps. In this paper, we propose an EV lane pre-clearing strategy to prioritize EVs on such roads through cooperative driving with surrounding connected vehicles (CVs). The cooperative driving problem is formulated as a mixed-integer nonlinear programming (MINP) problem aiming at (i) guaranteeing the desired speed of EVs, and (ii) minimizing the disturbances on CVs. To tackle this NP-hard MINP problem, we formulate the model in a bi-level optimization manner to address these two objectives, respectively. In the lower-level problem, CVs in front of the emergency vehicle are divided into several blocks. For each block, we develop an EV sorting algorithm to design optimal merging trajectories for CVs. With the resultant sorting trajectories, a constrained optimization problem is solved at the upper level to determine the initiation time/distance to conduct the sorting trajectories. Case studies show that with the proposed algorithm, emergency vehicles are able to drive at a desired speed while minimizing disturbances on normal traffic flows. We further reveal a linear relationship between the optimal solution and road density, which could help to improve EV routing decision making when high-resolution data is not available.
Freeway Traffic Jam Mitigation via Connected Automated Vehicles
We consider the problem of altruistic control of connected automated vehicles (CAVs) on multi-lane highways to mitigate phantom traffic jams resulting from car-following dynamics of human-driven vehicles (HDVs). In most of the existing studies on CAVs in multi-lane settings, vehicle controller design philosophy is based on a selfish driving strategy that exclusively addresses the ego vehicle objectives. To improve overall traffic smoothness, we propose an altruistic control strategy for CAVs that aims to maximize the driving comfort and traffic efficiency of both the ego vehicle and surrounding HDVs. We formulate the problem of altruistic control under a model predictive control (MPC) framework to optimize acceleration and lane change sequences of CAVs. Simulation results demonstrate significant improvements in traffic flow via altruistic CAV actions over selfish strategies.
Hierarchical Control of Electric Bus Lines
In this paper, we propose a hierarchical control strategy for a line of electric buses with the double objective of minimizing energy consumption and providing regular service to the passengers. The state-space model for the buses is formulated in space rather than in time, which alleviates the need for integer decision variables to capture their behavior at bus stops. This enables us to first assemble a fully-centralized multi-objective line problem in the continuous nonlinear optimization framework. It is then reassembled into a hierarchical structure with two levels of control in order to improve scalability and reliability. This new supervisory structure consists of a centralized line level controller which handles the time headway regularity of the buses, and of decentralized bus level controllers which simultaneously manage the energy consumption of each individual bus. Our method demonstrates good battery energy savings and regularity performance when compared to a classical holding strategy.
Distributed eco-driving control of a platoon of electric vehicles through Riccati recursion
This paper presents a distributed optimization procedure for the cooperative eco-driving control problem of a platoon of electric vehicles subject to safety and travel time constraints. Individual optimal trajectories are generated for each platoon member to account for heterogeneous vehicles and for the road slope. By rearranging the problem variables, the Riccati recursion can be applied along the chain-like structure of the platoon and be used to solve the problem by repeatedly transmitting information up and down the platoon. Since each vehicle is only responsible for its own part of the computations, the proposed control strategy is privacy-preserving and could therefore be deployed by any group of vehicles to form a platoon spontaneously while driving. The energy efficiency of this control strategy is evaluated in numerical experiments for platoons of electric trucks with different masses and rated motor powers.
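For readers unfamiliar with the Riccati recursion named in the abstract above, the following is a minimal textbook sketch of the backward recursion for a scalar, finite-horizon linear-quadratic problem. It is not the paper's distributed platoon algorithm: the scalar dynamics `x[t+1] = a*x[t] + b*u[t]`, the cost weights `q`, `r`, `q_final`, and the function name `riccati_backward` are all assumptions chosen for illustration.

```python
def riccati_backward(a, b, q, r, q_final, horizon):
    """Backward Riccati recursion for a scalar finite-horizon LQ problem.

    Illustrative assumptions: dynamics x[t+1] = a*x[t] + b*u[t] and cost
    sum_t (q*x[t]**2 + r*u[t]**2) + q_final*x[N]**2.
    Returns the cost-to-go weight at time 0 and the feedback gains
    ordered from time 0 to time N-1 (control law u[t] = -gains[t]*x[t]).
    """
    p = q_final                              # terminal cost-to-go weight
    gains = []
    for _ in range(horizon):                 # sweep from t = N-1 down to 0
        k = (b * p * a) / (r + b * p * b)    # optimal gain for this stage
        p = q + a * p * a - a * p * b * k    # Riccati update
        gains.append(k)
    gains.reverse()                          # reorder so gains[0] is time 0
    return p, gains


# Example: unit integrator with unit weights over 50 steps.
p0, gains = riccati_backward(a=1.0, b=1.0, q=1.0, r=1.0, q_final=1.0, horizon=50)
```

For a long horizon the weight settles at the fixed point of the update, which for this particular example is the golden ratio, since `p = 1 + p/(1 + p)` implies `p**2 = p + 1`. In a chain-structured problem such as a platoon, the same backward-then-forward pattern is what allows information to be passed from one neighbour to the next, although the details in the paper may differ from this sketch.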
Altruistic Control of Connected Automated Vehicles in Mixed-Autonomy Multi-Lane Highway Traffic
We consider the problem of altruistic control of connected automated vehicles (CAVs) on mixed-autonomy multi-lane highways to mitigate moving traffic jams resulting from car-following dynamics of human-driven vehicles (HDVs). In most of the existing studies on CAVs in multi-lane settings, vehicle controller design philosophy is based on a selfish driving strategy that exclusively addresses the ego vehicle objectives. To improve overall traffic smoothness, we propose an altruistic control strategy for CAVs that aims to maximize the driving comfort and traffic efficiency of both the ego vehicle and surrounding HDVs. We formulate the problem of altruistic control under a model predictive control (MPC) framework to optimize acceleration and lane change sequences of CAVs. In order to efficiently solve the resulting non-convex mixed-integer nonlinear programming (MINLP) problem, we decompose it into three non-convex subproblems, each of which can be transformed into a convex quadratic program via penalty-based reformulation of the optimal velocity with relative velocity (OVRV) car-following model. Simulation results demonstrate significant improvements in traffic flow via altruistic CAV actions over selfish strategies on both single- and multi-lane roads.
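The OVRV car-following model mentioned in the abstract above is commonly written as an acceleration law that combines a spacing error with a relative-velocity term. The sketch below assumes that common form; the gains `k1` and `k2`, the jam distance `eta`, the time gap `tau`, and the simulation settings are illustrative values, not parameters taken from the paper.

```python
def ovrv_acceleration(gap, v, v_lead, k1=0.02, k2=0.13, eta=5.0, tau=1.5):
    """OVRV acceleration in its commonly used form (parameters are assumed).

    gap    : distance to the leading vehicle [m]
    v      : follower speed [m/s]
    v_lead : leader speed [m/s]
    """
    spacing_error = gap - eta - tau * v        # deviation from desired gap
    return k1 * spacing_error + k2 * (v_lead - v)


def simulate_follower(v_lead=20.0, gap0=60.0, v0=15.0, dt=0.1, steps=6000):
    """Forward-Euler simulation of one follower behind a constant-speed leader."""
    gap, v = gap0, v0
    for _ in range(steps):
        a = ovrv_acceleration(gap, v, v_lead)
        gap += (v_lead - v) * dt               # gap grows when the leader is faster
        v = max(0.0, v + a * dt)               # speed cannot become negative
    return gap, v


final_gap, final_v = simulate_follower()
```

With these assumed parameters the follower settles at the leader's speed and at the equilibrium gap `eta + tau * v_lead`. The model is linear in the follower's state, which fits the abstract's statement that the subproblems can be transformed into convex quadratic programs.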
Dynamic Stochastic Electric Vehicle Routing with Safe Reinforcement Learning
Dynamic routing of electric commercial vehicles can be a challenging problem since, besides uncertain energy consumption, there are also random customer requests. This paper introduces the Dynamic Stochastic Electric Vehicle Routing Problem (DS-EVRP). A Safe Reinforcement Learning method is proposed for solving the problem. The objective is to minimize expected energy consumption in a safe way, which means also minimizing the risk of battery depletion while en route by planning charging whenever necessary. The key idea is to learn offline about the stochastic customer requests and energy consumption using Monte Carlo simulations, to be able to plan the route predictively and safely online. The method is evaluated using simulations based on energy consumption data from a realistic traffic model for the city of Luxembourg and a high-fidelity vehicle model. The results indicate that it is possible to save energy while maintaining reliability by planning the routes and charging in an anticipative way. The proposed method has the potential to improve transport operations with electric commercial vehicles, capitalizing on their environmental benefit.
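To make the notion of battery depletion risk in the abstract above concrete, here is a minimal Monte Carlo sketch that estimates the probability of running out of energy on a fixed route. It is only an illustration of the sampling idea, not the paper's Safe Reinforcement Learning method: the Gaussian per-leg consumption model, the `rel_std` spread, the function name `depletion_risk`, and all numbers are hypothetical.

```python
import random


def depletion_risk(leg_means_kwh, battery_kwh, rel_std=0.2,
                   n_samples=10000, seed=0):
    """Estimate P(total energy use > battery capacity) by sampling.

    Assumption: the energy used on each leg is an independent Gaussian with
    the given mean and a standard deviation of rel_std * mean, clipped at 0.
    """
    rng = random.Random(seed)                  # seeded for reproducibility
    depleted = 0
    for _ in range(n_samples):
        used = sum(max(0.0, rng.gauss(m, rel_std * m)) for m in leg_means_kwh)
        if used > battery_kwh:
            depleted += 1
    return depleted / n_samples


# Hypothetical three-leg route with 10 kWh expected consumption per leg.
risk_tight = depletion_risk([10.0, 10.0, 10.0], battery_kwh=30.0)
risk_safe = depletion_risk([10.0, 10.0, 10.0], battery_kwh=45.0)
```

A planner could compare such an estimate against a risk threshold to decide whether a charging stop is needed before continuing, which is one simple way to read "planning charging whenever necessary".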