Reinforcement Learning in Traffic Control for Connected Automated Vehicles
In recent years, more and more people have been concentrating in big cities to live and work. This trend already has negative impacts on transportation networks, including congestion and inefficiency. In parallel with this centralization, the number of autonomous vehicles on the roads continues to grow, without completely replacing human-driven vehicles. The upcoming mixed autonomy traffic situations will pose greater challenges for safety and transportation efficiency, and traditional traffic management solutions may not be able to handle them. Machine learning approaches have already proved efficient in various complex fields. In this dissertation, Deep Reinforcement Learning, a sub-field of Machine Learning, will be investigated for enabling a smooth coexistence of automated, connected, and conventional vehicles. In particular, various reinforcement learning models, with both single-agent and multi-agent approaches, will be trained and tested on controlling the traffic flow in a specific mixed autonomy traffic scenario, where a transition from autonomous to human driving mode is needed for the vehicles
Graph Reinforcement Learning Application to Co-operative Decision-Making in Mixed Autonomy Traffic: Framework, Survey, and Challenges
Proper functioning of connected and automated vehicles (CAVs) is crucial for
the safety and efficiency of future intelligent transport systems. Meanwhile,
transitioning to fully autonomous driving requires a long period of mixed
autonomy traffic, including both CAVs and human-driven vehicles. Thus,
collaborative decision-making for CAVs is essential to generate appropriate
driving behaviors to enhance the safety and efficiency of mixed autonomy
traffic. In recent years, deep reinforcement learning (DRL) has been widely
used in solving decision-making problems. However, the existing DRL-based
methods have been mainly focused on solving the decision-making of a single
CAV. Using the existing DRL-based methods in mixed autonomy traffic cannot
accurately represent the mutual effects of vehicles and model dynamic traffic
environments. To address these shortcomings, this article proposes a graph
reinforcement learning (GRL) approach for multi-agent decision-making of CAVs
in mixed autonomy traffic. First, a generic and modular GRL framework is
designed. Then, a systematic review of DRL and GRL methods is presented,
focusing on the problems addressed in recent research. Moreover, a comparative
study of different GRL methods is conducted based on the designed
framework to verify the effectiveness of GRL methods. Results show that the GRL
methods can better optimize the performance of multi-agent decision-making for
CAVs in mixed autonomy traffic compared to the DRL methods. Finally, challenges
and future research directions are summarized. This study can provide a
valuable research reference for solving the multi-agent decision-making
problems of CAVs in mixed autonomy traffic and can promote the implementation
of GRL-based methods into intelligent transportation systems. The source code
of our work can be found at https://github.com/Jacklinkk/Graph_CAVs.
Comment: 22 pages, 7 figures, 10 tables. Currently under review at IEEE
Transactions on Intelligent Transportation System
Traffic Light Control Using Deep Policy-Gradient and Value-Function Based Reinforcement Learning
Recent advances in combining deep neural network architectures with
reinforcement learning techniques have shown promising results in
solving complex control problems with high dimensional state and action spaces.
Inspired by these successes, in this paper, we build two kinds of reinforcement
learning algorithms: deep policy-gradient and value-function based agents which
can predict the best possible traffic signal for a traffic intersection. At
each time step, these adaptive traffic light control agents receive a snapshot
of the current state of a graphical traffic simulator and produce control
signals. The policy-gradient based agent maps its observation directly to the
control signal, whereas the value-function based agent first estimates values
for all legal control signals and then selects the control
action with the highest value. Our methods show promising results in a traffic
network simulated in the SUMO traffic simulator, without suffering from
instability issues during the training process
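The difference between the two action-selection schemes described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of a softmax over network outputs for the policy-gradient agent, and the example numbers are all assumptions made for illustration, and the neural networks that would produce the logits and values are omitted.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_action(logits, rng):
    """Policy-based choice (assumed form): sample a signal index directly
    from the policy's output distribution."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def value_based_action(q_values, legal_actions):
    """Value-based choice: estimate a value for every legal control signal,
    then pick the one with the highest value."""
    return max(legal_actions, key=lambda a: q_values[a])


# Hypothetical outputs for an intersection with three signal phases.
rng = random.Random(0)
sampled = policy_gradient_action([2.0, 0.5, -1.0], rng)
greedy = value_based_action([0.1, 0.9, 0.5], legal_actions=[0, 2])
```

Here `greedy` is phase 2, because phase 1 has the highest value but is not in the legal set; `sampled` depends on the random draw.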
An Agent-based Modelling Framework for Driving Policy Learning in Connected and Autonomous Vehicles
Due to the complexity of the natural world, a programmer cannot foresee all
possible situations a connected and autonomous vehicle (CAV) will face during
its operation, and hence CAVs will need to learn to make decisions
autonomously. By sensing its surroundings and exchanging information
with other vehicles and road infrastructure, a CAV will have access to large
amounts of useful data. While different control algorithms have been proposed
for CAVs, the benefits brought about by connectedness of autonomous vehicles to
other vehicles and to the infrastructure, and its implications for policy
learning, have not been investigated in the literature. This paper investigates a
data-driven driving policy learning framework through an agent-based modelling
approach. The contributions of the paper are two-fold. A dynamic programming
framework is proposed for in-vehicle policy learning with and without
connectivity to neighboring vehicles. The simulation results indicate that
while a CAV can learn to make autonomous decisions, vehicle-to-vehicle (V2V)
communication of information improves this capability. Furthermore, to overcome
the limitations of sensing in a CAV, the paper proposes a novel concept for
infrastructure-led policy learning and communication with autonomous vehicles.
In infrastructure-led policy learning, road-side infrastructure senses and
captures successful vehicle maneuvers and learns an optimal policy from those
temporal sequences, and when a vehicle approaches the road-side unit, the
policy is communicated to the CAV. A deep imitation learning methodology is
proposed to develop such an infrastructure-led policy learning framework
- …