Decision-making for Autonomous Vehicles on Highway: Deep Reinforcement Learning with Continuous Action Horizon
A decision-making strategy for autonomous vehicles describes a sequence of
driving maneuvers to achieve a certain navigational mission. This paper
utilizes the deep reinforcement learning (DRL) method to address the
continuous-horizon decision-making problem on the highway. First, the vehicle
kinematics and driving scenario on the freeway are introduced. The running
objective of the ego automated vehicle is to execute an efficient and smooth
policy without collision. Then, the particular algorithm named proximal policy
optimization (PPO)-enhanced DRL is illustrated. To overcome the challenges of
slow training and sample inefficiency, this algorithm achieves high learning
efficiency and excellent control performance. Finally, the PPO-DRL-based
decision-making strategy is evaluated from multiple perspectives, including
optimality, learning efficiency, and adaptability. Its potential for online
application is discussed by applying it to similar driving scenarios.
Comment: 9 pages, 10 figures
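The clipped surrogate objective at the core of PPO can be sketched in a few lines; this is a generic illustration of the standard PPO loss, not the paper's specific implementation:

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, epsilon=0.2):
    """Clipped surrogate objective from PPO.

    ratio:     pi_new(a|s) / pi_old(a|s), the policy probability ratio
    advantage: estimated advantage A(s, a)
    epsilon:   clipping parameter (0.2 is a commonly used default)
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - epsilon, 1.0 + epsilon) * advantage
    # PPO maximizes the minimum of the two terms; as a loss, we negate it.
    return -np.minimum(unclipped, clipped).mean()
```

Clipping the ratio bounds how far a single update can move the policy, which is what makes PPO sample-efficient and stable enough for continuous-horizon driving decisions.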
Graph Reinforcement Learning Application to Co-operative Decision-Making in Mixed Autonomy Traffic: Framework, Survey, and Challenges
Proper functioning of connected and automated vehicles (CAVs) is crucial for
the safety and efficiency of future intelligent transport systems. Meanwhile,
transitioning to fully autonomous driving requires a long period of mixed
autonomy traffic, including both CAVs and human-driven vehicles. Thus,
collaboration decision-making for CAVs is essential to generate appropriate
driving behaviors to enhance the safety and efficiency of mixed autonomy
traffic. In recent years, deep reinforcement learning (DRL) has been widely
used to solve decision-making problems. However, existing DRL-based methods
have mainly focused on the decision-making of a single CAV; applied to mixed
autonomy traffic, they cannot accurately represent the mutual effects of
vehicles or model dynamic traffic environments. To address these
shortcomings, this article proposes a graph
reinforcement learning (GRL) approach for multi-agent decision-making of CAVs
in mixed autonomy traffic. First, a generic and modular GRL framework is
designed. Then, a systematic review of DRL and GRL methods is presented,
focusing on the problems addressed in recent research. Moreover, a
comparative study of different GRL methods is conducted on the designed
framework to verify their effectiveness. Results show that GRL methods
outperform DRL methods in multi-agent decision-making for CAVs in mixed
autonomy traffic. Finally, challenges
and future research directions are summarized. This study can provide a
valuable research reference for solving the multi-agent decision-making
problems of CAVs in mixed autonomy traffic and can promote the implementation
of GRL-based methods into intelligent transportation systems. The source code
of our work can be found at https://github.com/Jacklinkk/Graph_CAVs.
Comment: 22 pages, 7 figures, 10 tables. Currently under review at IEEE
Transactions on Intelligent Transportation Systems
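The graph representation underlying such GRL frameworks can be sketched as one round of message passing over vehicle states; the distance threshold and feature layout below are illustrative assumptions, not the framework's actual design:

```python
import numpy as np

def vehicle_graph_step(positions, features, radius=50.0):
    """One round of mean-aggregation message passing over a vehicle graph.

    positions: (N, 2) array of vehicle x/y coordinates
    features:  (N, F) array of per-vehicle state features (speed, lane, ...)
    radius:    communication range; two vehicles are connected if closer
               than this (an illustrative choice, not from the paper)
    """
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    adj = (dist < radius) & ~np.eye(len(positions), dtype=bool)
    # Mean of neighbor features; vehicles with no neighbors receive zeros.
    deg = adj.sum(axis=1, keepdims=True)
    msg = adj.astype(float) @ features / np.maximum(deg, 1)
    # Concatenate own features with the aggregated neighbor message.
    return np.concatenate([features, msg], axis=-1)
```

Encoding inter-vehicle influence in the adjacency structure is what lets GRL methods model the mutual effects that single-agent DRL formulations miss.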
Curriculum Proximal Policy Optimization with Stage-Decaying Clipping for Self-Driving at Unsignalized Intersections
Unsignalized intersections are typically considered as one of the most
representative and challenging scenarios for self-driving vehicles. To tackle
autonomous driving problems in such scenarios, this paper proposes a curriculum
proximal policy optimization (CPPO) framework with stage-decaying clipping. By
adjusting the clipping parameter during different stages of training through
proximal policy optimization (PPO), the vehicle can first rapidly search for
an approximately optimal policy, or its neighborhood, with a large clipping
parameter, and then converge to the optimal policy with a small one. In
particular, the stage-based curriculum learning technique is incorporated
into the proposed framework to
improve the generalization performance and further accelerate the training
process. Moreover, the reward function is specially designed in view of
different curriculum settings. A series of comparative experiments are
conducted in intersection-crossing scenarios with bi-lane carriageways to
verify the effectiveness of the proposed CPPO method. The results show that the
proposed approach demonstrates better adaptiveness to different dynamic and
complex environments, as well as faster training than baseline methods.
Comment: 7 pages, 4 figures
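The stage-decaying clipping idea amounts to a piecewise schedule for PPO's epsilon; the stage boundaries and values below are illustrative assumptions, not the paper's tuned settings:

```python
def stage_decayed_clip(step, stages=((0, 0.3), (100_000, 0.2), (300_000, 0.1))):
    """Return the PPO clipping parameter for the current training step.

    stages: (start_step, epsilon) pairs in ascending order; these values
            are illustrative, not taken from the CPPO paper.
    A large epsilon early on permits coarse, fast policy search; smaller
    later values tighten the trust region for fine convergence.
    """
    eps = stages[0][1]
    for start, value in stages:
        if step >= start:
            eps = value
    return eps
```

Pairing this schedule with staged curricula means each training phase gets both an environment difficulty and a trust-region width suited to it.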
A Comparative Analysis of Deep Reinforcement Learning-enabled Freeway Decision-making for Automated Vehicles
Deep reinforcement learning (DRL) is becoming a prevalent and powerful
methodology for addressing artificial intelligence problems. Owing to its
tremendous potential in self-learning and self-improvement, DRL has been
broadly applied in many research fields. This article conducts a comprehensive
comparison of multiple DRL approaches on the freeway decision-making problem
for autonomous vehicles. These techniques include the common deep Q learning
(DQL), double DQL (DDQL), dueling DQL, and prioritized replay DQL. First, the
reinforcement learning (RL) framework is introduced. As an extension, the
implementations of the above-mentioned DRL methods are established
mathematically. Then, the freeway driving scenario for automated vehicles is
constructed, wherein the decision-making problem is transformed into a control
optimization problem. Finally, a series of simulation experiments is conducted
to evaluate the control performance of these DRL-enabled decision-making
strategies. A comparative analysis connects the autonomous driving results
with the learning characteristics of these DRL techniques.
Comment: 11 pages, 10 figures
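The key difference between DQL and double DQL lies in how the bootstrap target is formed; a minimal numpy sketch, with per-state Q-value vectors standing in for full networks:

```python
import numpy as np

def dql_target(q_target_next, reward, gamma=0.99):
    """Standard DQL target: the target network both selects and
    evaluates the next action, which tends to overestimate values."""
    return reward + gamma * q_target_next.max()

def double_dql_target(q_online_next, q_target_next, reward, gamma=0.99):
    """Double DQL: the online network selects the action, the target
    network evaluates it, reducing maximization bias."""
    a = int(np.argmax(q_online_next))
    return reward + gamma * q_target_next[a]
```

Dueling DQL and prioritized replay modify the network architecture and the sampling of experience, respectively, and can be combined freely with either target above.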
Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing
Within the context of autonomous driving a model-based reinforcement learning
algorithm is proposed for the design of neural network-parameterized
controllers. Classical model-based control methods, which include sampling- and
lattice-based algorithms and model predictive control, suffer from the
trade-off between model complexity and computational burden required for the
online solution of expensive optimization or search problems at every short
sampling time. To circumvent this trade-off, a two-step procedure is
motivated: first, a controller is learned during offline training based on an
arbitrarily complicated mathematical system model; then, the trained
controller is evaluated online as a fast feedforward map. The contribution of this paper is the
proposition of a simple gradient-free and model-based algorithm for deep
reinforcement learning using task separation with hill climbing (TSHC). In
particular, (i) simultaneous training on separate deterministic tasks with the
purpose of encoding many motion primitives in a neural network, and (ii) the
employment of maximally sparse rewards in combination with virtual velocity
constraints (VVCs) in setpoint proximity are advocated.
Comment: 10 pages, 6 figures, 1 table
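A gradient-free hill-climbing update on controller parameters, in the spirit of TSHC, can be sketched as follows; the perturbation scale, iteration count, and scoring function are illustrative assumptions, not the paper's method in full:

```python
import numpy as np

def hill_climb(score, theta, sigma=0.1, iterations=200, seed=0):
    """Gradient-free hill climbing: perturb parameters with Gaussian
    noise and keep the perturbation only if the score improves.

    score: function mapping a parameter vector (e.g. flattened neural
           network weights) to a scalar return to maximize
    """
    rng = np.random.default_rng(seed)
    best = score(theta)
    for _ in range(iterations):
        candidate = theta + sigma * rng.standard_normal(theta.shape)
        s = score(candidate)
        if s > best:  # hill climbing: accept only strict improvements
            theta, best = candidate, s
    return theta, best
```

In TSHC, one such search would be run against each deterministic training task, with the maximally sparse reward supplying the score; here a toy quadratic stands in for that reward.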
Motion Planning for Autonomous Driving: The State of the Art and Future Perspectives
Thanks to their augmented convenience, safety advantages, and potential
commercial value, intelligent vehicles (IVs) have attracted wide attention
throughout the world. Although a few autonomous driving unicorns assert that
IVs will be commercially deployable by 2025, their implementation is still
restricted to small-scale validation due to various issues, among which precise
computation of control commands or trajectories by planning methods remains a
prerequisite for IVs. This paper aims to review state-of-the-art planning
methods, including pipeline planning and end-to-end planning methods. In terms
of pipeline methods, a survey of algorithm selection is provided along with a
discussion of the expansion and optimization mechanisms, whereas in end-to-end
methods, the training approaches and verification scenarios of driving tasks
are points of concern. Experimental platforms are reviewed to facilitate
readers in selecting suitable training and validation methods. Finally, the
current challenges and future directions are discussed. The side-by-side
comparison presented in this survey not only helps to gain insights into the
strengths and limitations of the reviewed methods but also assists with
system-level design choices.
Comment: 20 pages, 14 figures and 5 tables