
    Longitudinal Dynamic versus Kinematic Models for Car-Following Control Using Deep Reinforcement Learning

    The majority of current studies on autonomous vehicle control via deep reinforcement learning (DRL) use point-mass kinematic models, neglecting vehicle dynamics, which include acceleration delay and acceleration command dynamics. The acceleration delay, caused by sensing and actuation latencies, delays the execution of control inputs. The acceleration command dynamics mean that the actual vehicle acceleration does not reach the commanded acceleration instantaneously. In this work, we investigate the feasibility of applying DRL controllers trained on vehicle kinematic models to more realistic driving control with vehicle dynamics. We consider a particular longitudinal car-following control problem, Adaptive Cruise Control (ACC), solved via DRL using a point-mass kinematic model. When such a controller is applied to car following with vehicle dynamics, we observe significantly degraded car-following performance. We therefore redesign the DRL framework to accommodate the acceleration delay and the acceleration command dynamics by adding the delayed control inputs and the actual vehicle acceleration, respectively, to the reinforcement learning environment state. The training results show that the redesigned DRL controller achieves near-optimal car-following performance with vehicle dynamics considered, when compared with dynamic programming solutions.

    Comment: Accepted to the 2019 IEEE Intelligent Transportation Systems Conference
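
    The redesign described above amounts to augmenting the observation with the still-pending (delayed) commands and the actual acceleration. Below is a minimal Python sketch of such an augmented environment; the class name, the first-order lag with time constant tau, the four-step delay, and all numeric values are illustrative assumptions, not the paper's implementation.

        import numpy as np
        from collections import deque

        class ACCEnvWithDynamics:
            """Illustrative car-following environment with actuation delay
            and first-order acceleration command dynamics (assumed values)."""

            def __init__(self, dt=0.1, delay_steps=4, tau=0.5):
                self.dt = dt                    # simulation step [s]
                self.delay_steps = delay_steps  # actuation delay in steps (assumed)
                self.tau = tau                  # command-dynamics time constant [s] (assumed)
                self.cmd_buffer = deque([0.0] * delay_steps, maxlen=delay_steps)
                self.gap, self.rel_speed, self.accel = 30.0, 0.0, 0.0

            def step(self, accel_cmd):
                delayed_cmd = self.cmd_buffer[0]       # command issued delay_steps ago
                self.cmd_buffer.append(accel_cmd)      # newest command enters the buffer
                # First-order lag: actual acceleration chases the delayed command.
                self.accel += (delayed_cmd - self.accel) / self.tau * self.dt
                self.gap -= self.rel_speed * self.dt   # rel_speed = ego speed - lead speed
                self.rel_speed += self.accel * self.dt # lead assumed at constant speed
                return self._state()

            def _state(self):
                # Augmented state: kinematic features PLUS the pending (delayed)
                # commands and the actual acceleration, as the abstract describes.
                return np.array([self.gap, self.rel_speed, self.accel,
                                 *self.cmd_buffer], dtype=np.float32)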

    Machine Learning Tools for Optimization of Fuel Consumption at Signalized Intersections in Connected/Automated Vehicles Environment

    Researchers continue to seek techniques for making the transportation sector more sustainable in terms of fuel consumption and greenhouse gas emissions. Among the most effective is Eco-driving at signalized intersections. Eco-driving is a complex control problem in which drivers approaching an intersection are guided, over a period of time, to optimize fuel consumption. Eco-driving control systems reduce fuel consumption by optimizing vehicle trajectories near signalized intersections based on Signal Phase and Timing (SPaT) information. Developing Eco-driving applications for semi-actuated signals, unlike pre-timed ones, is more challenging due to variations in cycle length resulting from fluctuations in traffic demand. Reinforcement learning (RL) is a machine learning paradigm that mimics human learning behavior: an agent attempts to solve a given control problem by interacting with the environment and developing an optimal policy. Unlike the methods implemented in previous studies for solving the Eco-driving problem, RL does not require prior knowledge of the environment being learned. Therefore, the aim of this study is twofold: (1) develop a novel brute-force Eco-driving algorithm (ECO-SEMI-Q) for Connected/Automated Vehicles (CAVs) passing through semi-actuated signalized intersections; and (2) develop a novel Deep Reinforcement Learning (DRL) Eco-driving algorithm for CAVs passing through fixed-time signalized intersections. The developed algorithms are tested at both the microscopic and macroscopic levels. At the microscopic level, results indicate that fuel consumption for vehicles controlled by the ECO-SEMI-Q and DRL models is 29.2% and 23% lower, respectively, than in the no-control case. At the macroscopic level, a sensitivity analysis of the impact of the Market Penetration Rate (MPR) shows that savings in fuel consumption increase with higher MPR. Furthermore, when the MPR is greater than 50%, the ECO-SEMI-Q algorithm provides appreciable savings in travel times. The sensitivity analysis indicates savings in network fuel consumption when the MPR of the DRL algorithm is higher than 35%; below 35%, the DRL algorithm has an adverse impact on fuel consumption due to aggressive lane-change and passing maneuvers. These reductions in fuel consumption demonstrate the ability of the algorithms to provide more environmentally sustainable signalized intersections.
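
    A brute-force Eco-driving search of the kind the ECO-SEMI-Q description suggests can be sketched as scanning candidate approach speeds against the SPaT green window. The function name, the quadratic fuel proxy, and the speed bounds below are illustrative assumptions, not the study's algorithm.

        import numpy as np

        def eco_approach_speed(dist_to_stopbar, t_green_start, t_green_end,
                               v_max=17.0, v_min=3.0, n_candidates=50):
            """Brute-force sketch: pick a constant approach speed that reaches
            the stop bar during the green window while minimizing an assumed
            quadratic fuel proxy. All parameters are illustrative."""
            best_v, best_cost = None, np.inf
            for v in np.linspace(v_min, v_max, n_candidates):
                eta = dist_to_stopbar / v       # arrival time at the stop bar
                if not (t_green_start <= eta <= t_green_end):
                    continue                    # would arrive on red: discard
                cost = v ** 2                   # crude stand-in for fuel use
                if cost < best_cost:
                    best_v, best_cost = v, cost
            return best_v                       # None if no feasible speed exists

        # Example: 200 m to the stop bar, green from t=20 s to t=45 s from now.
        print(eco_approach_speed(200.0, 20.0, 45.0))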

    Hamiltonian-Driven Adaptive Dynamic Programming with Efficient Experience Replay

    This article presents a novel efficient experience-replay-based adaptive dynamic programming (ADP) method for the optimal control problem of a class of nonlinear dynamical systems within the Hamiltonian-driven framework. The quasi-Hamiltonian is presented for the policy evaluation problem with an admissible policy. With the quasi-Hamiltonian, a novel composite critic learning mechanism is developed that combines instantaneous data with historical data. In addition, the pseudo-Hamiltonian is defined to deal with the performance optimization problem. Based on the pseudo-Hamiltonian, the conventional Hamilton–Jacobi–Bellman (HJB) equation can be represented in a filtered form that can be implemented online. Theoretical analysis covers the convergence of the adaptive critic design and the stability of the closed-loop systems, where parameter convergence can be achieved under a weakened excitation condition. Simulation studies verify the efficacy of the presented design scheme.
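
    For orientation, the quasi- and pseudo-Hamiltonians referenced here are variants of the standard Hamiltonian of nonlinear optimal control; the notation below is a common textbook form, assumed rather than taken from the article.

        For an affine system $\dot{x} = f(x) + g(x)u$ with running cost
        $r(x,u) = Q(x) + u^{\top} R u$ and value function $V(x)$,
        \[
          H\bigl(x, u, \nabla V\bigr)
            = \nabla V(x)^{\top}\bigl(f(x) + g(x)u\bigr) + r(x,u),
          \qquad
          \min_{u}\, H\bigl(x, u, \nabla V^{*}\bigr) = 0,
        \]
        where the second relation is the HJB equation. Along a trajectory,
        $H(x,u,\nabla V) = \dot{V}(x) + r(x,u)$, which is what allows
        Hamiltonian residuals evaluated on measured data (instantaneous or
        replayed) to serve as critic-learning targets.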

    Acceleration control strategy for Battery Electric Vehicle based on Deep Reinforcement Learning in V2V driving

    Autonomous driving (AD) is one of the most interesting technologies flourishing in the transportation sector. In particular, Cooperative Adaptive Cruise Control (CACC) systems ensure higher levels of both safety and comfort while also reducing energy consumption. In this framework, a real-time velocity planner for a Battery Electric Vehicle has been developed, based on the Deep Deterministic Policy Gradient (DDPG) Deep Reinforcement Learning algorithm, aiming at maximizing energy savings and improving comfort thanks to the exchange of distance, speed, and acceleration information via vehicle-to-vehicle (V2V) technology. The DDPG algorithm relies on a multi-objective reward function that adapts to different driving cycles. Simulation results show that the agent achieves good results on standard cycles, such as WLTP, UDDS, and AUDC, as well as on real-world driving cycles. Moreover, it displays great adaptability to driving cycles different from the one used in training.
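
    A multi-objective reward of this kind typically blends gap tracking, energy, and comfort terms. The function below is a hedged sketch: the weights, normalization, and term choices are illustrative assumptions, not the authors' tuned reward.

        def cacc_reward(gap_error, jerk, batt_power,
                        w_gap=1.0, w_energy=0.3, w_comfort=0.2):
            """Sketch of a multi-objective CACC reward (assumed weights):
            penalize spacing error, traction power draw, and jerk."""
            r_gap = -w_gap * abs(gap_error)              # track the desired gap
            r_energy = -w_energy * max(batt_power, 0.0)  # penalize battery power [kW]
            r_comfort = -w_comfort * (jerk ** 2)         # smoothness via squared jerk
            return r_gap + r_energy + r_comfort

        # Example: 1.5 m spacing error, 0.4 m/s^3 jerk, 12 kW traction power.
        print(cacc_reward(gap_error=1.5, jerk=0.4, batt_power=12.0))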

    A self-learning intersection control system for connected and automated vehicles

    This study proposes a Decentralized Sparse Coordination Learning System (DSCLS) based on Deep Reinforcement Learning (DRL) to control intersections in a Connected and Automated Vehicles (CAVs) environment. In this approach, roadway sections are divided into small areas, and vehicles try to reserve their desired areas ahead of time; depending on whether they share a desired area with other CAVs, vehicles are in either an independent or a coordinated state. Individual CAVs are accountable for decision-making at each step in both states. In the training process, CAVs learn to minimize the overall delay at the intersection. Because random actions taken during training propagate through chains of interactions, the trained model can deal with unprecedented volume conditions, the main challenge in intersection management. Applying the model to a single-lane intersection with no turning movements as a proof-of-concept test reveals noticeable improvements in traffic measures compared to three other intersection control systems.

    A Spring Mass Damper (SMD) model is developed to control the platooning behavior of CAVs. In the SMD model, each vehicle is modeled as a mass coupled to its preceding vehicle by a spring and a damper, whose spring constant and damping coefficient govern the interaction between vehicles. Limitations on communication range and on the number of vehicles in each platoon are applied, and the SMD model controls both intra-platoon and inter-platoon interactions. Simulation results for a regular highway section reveal that the proposed platooning algorithm increases the maximum throughput by 29% and 63% under 50% and 100% market penetration rates (MPRs) of CAVs, respectively. A merging section with different volume combinations on the main and merging sections and different MPRs of CAVs is also modeled to test how inter-platoon spacing accommodates merging vehicles. Noticeable travel time reductions are observed in both the mainline and merging lanes under all volume combinations at CAV MPRs of 80% and higher.

    For a more reliable assessment of the DSCLS, the model is applied to a more realistic intersection with three approach lanes in each direction and turning movements. The proposed algorithm decreases delay by 58%, 19%, and 13% in moderate, high, and extreme volume regimes, respectively, improving travel time accordingly. Comparison of safety measures reveals a 28% improvement in Post Encroachment Time (PET) in the extreme volume regime and minor improvements in the high and moderate volume regimes. Due to limited acceleration and deceleration rates, the proposed model does not outperform conventional control systems in environmental measures, including fuel consumption and CO2 emissions; however, under the same acceleration and deceleration limits, the DSCLS noticeably outperforms its pixel-reservation counterpart. Applying the model to a corridor of four intersections shows the same trends in traffic, safety, and environmental measures as the single-intersection experiment. Finally, an automated intersection control system for platooning CAVs is developed by combining the two proposed models, which remarkably improves traffic and safety measures, especially in extreme volume regimes, compared to the regular DSCLS model.
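
    The spring-and-damper coupling described above maps to a simple linear feedback law on gap and relative speed. The sketch below uses assumed constants and a fixed desired spacing, not the dissertation's calibrated values or its communication-range logic.

        def smd_accel(x_prec, v_prec, x_ego, v_ego,
                      desired_gap=20.0, k_spring=0.5, c_damper=0.8):
            """Spring-mass-damper car-following sketch (assumed constants):
            the 'spring' pulls the gap toward desired_gap, the 'damper'
            suppresses relative-speed oscillations."""
            gap_error = (x_prec - x_ego) - desired_gap   # spring deflection [m]
            rel_speed = v_prec - v_ego                   # damper velocity [m/s]
            return k_spring * gap_error + c_damper * rel_speed

        # Example: 25 m behind the leader, closing at 2 m/s.
        a = smd_accel(x_prec=100.0, v_prec=18.0, x_ego=75.0, v_ego=20.0)
        # gap_error = 5, rel_speed = -2  ->  a = 0.5*5 + 0.8*(-2) = 0.9 m/s^2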