Curriculum Proximal Policy Optimization with Stage-Decaying Clipping for
  Self-Driving at Unsignalized Intersections

Liu, Ming; Ma, Jun; Peng, Zengqi; Wang, Yubin; Zheng, Lei; Zhou, Xiao


Authors: Ming Liu, Jun Ma, Zengqi Peng, Yubin Wang, Lei Zheng, Xiao Zhou
Publication date: 31 August 2023

Abstract

Unsignalized intersections are typically considered one of the most representative and challenging scenarios for self-driving vehicles. To tackle autonomous driving problems in such scenarios, this paper proposes a curriculum proximal policy optimization (CPPO) framework with stage-decaying clipping. By adjusting the clipping parameter across training stages in proximal policy optimization (PPO), the vehicle first searches rapidly for an approximately optimal policy or its neighborhood using a large clipping parameter, and then converges to the optimal policy using a small one. In particular, a stage-based curriculum learning technique is incorporated into the proposed framework to improve generalization performance and further accelerate training. Moreover, the reward function is specially designed for the different curriculum settings. A series of comparative experiments is conducted in intersection-crossing scenarios with bi-lane carriageways to verify the effectiveness of the proposed CPPO method. The results show that the proposed approach adapts better to different dynamic and complex environments and trains faster than the baseline methods.

Comment: 7 pages, 4 figures
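
The stage-decaying clipping idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the schedule or its values, so everything below is an assumption made for illustration: the function names `stage_clip_epsilon` and `ppo_clipped_surrogate`, the geometric decay, and the defaults `eps_start`, `decay`, and `eps_min` are hypothetical and are not taken from the paper. Only the clipped surrogate itself follows the standard PPO form.

```python
def stage_clip_epsilon(stage, eps_start=0.3, decay=0.5, eps_min=0.05):
    """Clipping parameter for a 0-indexed curriculum stage.

    Hypothetical geometric schedule (not the paper's): a large value in
    early stages permits bigger policy updates, and later stages shrink
    it toward eps_min for finer convergence.
    """
    return max(eps_min, eps_start * decay ** stage)


def ppo_clipped_surrogate(ratios, advantages, eps):
    """Mean of the standard PPO clipped surrogate objective.

    ratios:     probability ratios pi_new(a|s) / pi_old(a|s), one per sample
    advantages: advantage estimates, one per sample
    eps:        clipping parameter for the current stage
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped = min(max(r, 1.0 - eps), 1.0 + eps)
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(r * a, clipped * a)
    return total / len(ratios)


if __name__ == "__main__":
    # Toy batch, made up for illustration only.
    ratios = [1.5, 0.5, 1.05]
    advantages = [1.0, -1.0, 2.0]
    for stage in range(4):
        eps = stage_clip_epsilon(stage)
        objective = ppo_clipped_surrogate(ratios, advantages, eps)
        print(f"stage {stage}: eps={eps:.4f}, objective={objective:.4f}")
```

Under this assumed schedule, each later stage clips the probability ratio more tightly, so a single update can move the policy less far from the one that collected the data. How the stages are tied to curriculum difficulty, and how the reward changes per stage, is specified in the paper itself rather than in this sketch.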

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2308.16445

Last time updated on 09/09/2023