Safe Hybrid-Action Reinforcement Learning-Based Decision and Control for Discretionary Lane Change
Autonomous lane-change, a key feature of advanced driver-assistance systems,
can enhance traffic efficiency and reduce the incidence of accidents. However,
safe driving of autonomous vehicles remains challenging in complex
environments. How to perform safe and appropriate lane change is a popular
topic of research in the field of autonomous driving. Currently, few papers
consider the safety of reinforcement learning in autonomous lane-change
scenarios. We introduce safe hybrid-action reinforcement learning into
discretionary lane change for the first time and propose the Parameterized Soft
Actor-Critic with PID Lagrangian (PASAC-PIDLag) algorithm. Furthermore, we
conduct a comparative analysis against the Parameterized Soft Actor-Critic
(PASAC), the unconstrained counterpart of PASAC-PIDLag. Both algorithms are
employed to train the lane-change strategy of autonomous vehicles, which outputs
discrete lane-change decisions and longitudinal vehicle accelerations. Our simulation
results indicate that at a traffic density of 15 vehicles per kilometer (15
veh/km), the PASAC-PIDLag algorithm exhibits superior safety with a collision
rate of 0%, outperforming the PASAC algorithm, which has a collision rate of
1%. The outcomes of the generalization assessments reveal that at low traffic
density levels, both the PASAC-PIDLag and PASAC algorithms are proficient in
attaining a 0% collision rate. Under conditions of high traffic flow density,
the PASAC-PIDLag algorithm surpasses PASAC in terms of both safety and
optimality.
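The PID Lagrangian mechanism referenced above controls a Lagrange multiplier on the expected safety cost with a proportional-integral-derivative rule rather than plain gradient ascent, which damps the oscillations of the penalty weight during training. A minimal sketch of such an update, with illustrative gains and a hypothetical episodic-cost signal (not the paper's implementation):

```python
class PIDLagrangian:
    """Sketch of a PID-controlled Lagrange multiplier for a cost constraint.

    The multiplier scales the safety-cost penalty added to the RL objective;
    it grows while the episodic cost exceeds the limit and relaxes once the
    constraint is satisfied. Gains are illustrative, not tuned values.
    """

    def __init__(self, cost_limit, kp=0.1, ki=0.01, kd=0.05):
        self.cost_limit = cost_limit
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, episode_cost):
        error = episode_cost - self.cost_limit        # constraint violation
        self.integral = max(0.0, self.integral + error)  # anti-windup clamp
        derivative = error - self.prev_error
        self.prev_error = error
        # A Lagrange multiplier must stay non-negative.
        return max(0.0, self.kp * error
                        + self.ki * self.integral
                        + self.kd * derivative)
```

In a constrained actor-critic loop, the returned multiplier would weight the cost-critic term in the policy loss at each training iteration.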
Decision-making for Autonomous Vehicles on Highway: Deep Reinforcement Learning with Continuous Action Horizon
A decision-making strategy for autonomous vehicles describes a sequence of
driving maneuvers to achieve a certain navigational mission. This paper
utilizes the deep reinforcement learning (DRL) method to address the
continuous-horizon decision-making problem on the highway. First, the vehicle
kinematics and driving scenario on the freeway are introduced. The running
objective of the ego automated vehicle is to execute an efficient and smooth
policy without collision. Then, the applied algorithm, proximal policy
optimization (PPO)-enhanced DRL, is illustrated. To overcome slow training and
sample inefficiency, the algorithm is designed to achieve high learning
efficiency and strong control performance. Finally,
the PPO-DRL-based decision-making strategy is evaluated from multiple
perspectives, including optimality, learning efficiency, and adaptability.
Its potential for online application is discussed by applying it to similar
driving scenarios. Comment: 9 pages, 10 figures
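PPO's central idea, referenced in the abstract above, is to clip the probability ratio between the new and the data-collecting policy so that each update stays close to the sampled data. A minimal NumPy sketch of the clipped surrogate objective (a generic illustration, not the paper's code):

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Clipped surrogate objective from PPO (to be maximized).

    ratio: pi_new(a|s) / pi_old(a|s) for sampled actions
    advantage: estimated advantages for those actions
    eps: clipping range; 0.2 is the value commonly used in the PPO paper
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # The elementwise minimum makes the bound pessimistic, so a large policy
    # step gains nothing once the ratio leaves [1 - eps, 1 + eps].
    return np.mean(np.minimum(unclipped, clipped))
```

Because the objective is flat outside the clipping band, gradient steps that would push the policy too far from the sampling distribution receive no extra reward, which is what stabilizes training.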
Behavior planning for automated highway driving
This work deals with certain components of an automated driving
system for highways, focusing on lane change behavior planning. It
presents a variety of algorithms of a modular system aiming at safe and
comfortable driving. A major contribution of this work is a method for
analyzing traffic scenes in a spatio-temporal, curvilinear coordinate
system. The results of this analysis are used in a further step to generate
lane change trajectories. A total of three approaches with increasing
levels of complexity and capabilities are compared. The most advanced
approach formulates the problem as a linear-quadratic cooperative
game and accounts for the inherently uncertain and multimodal nature
of trajectory predictions for surrounding road users. Evaluations on real
data show that the developed algorithms can be integrated into
current-generation automated driving software systems while fulfilling
runtime constraints.
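The spatio-temporal, curvilinear scene analysis described above relies on mapping obstacles from Cartesian coordinates to path-relative (arc length, lateral offset) coordinates. A coarse sketch of such a projection onto a polyline reference path (an illustrative Frenet-style transform, not the paper's method):

```python
import math

def to_curvilinear(path, x, y):
    """Project a Cartesian point onto a polyline reference path.

    Returns (s, d): arc length along the path to the closest point and the
    signed lateral offset (positive to the left of the travel direction).
    path is a list of distinct (x, y) waypoints.
    """
    best = (float("inf"), 0.0, 0.0)  # (squared distance, s, d)
    s_start = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        # Parameter of the orthogonal projection, clamped to the segment.
        t = max(0.0, min(1.0, ((x - x0) * dx + (y - y0) * dy) / seg_len**2))
        px, py = x0 + t * dx, y0 + t * dy
        d2 = (x - px) ** 2 + (y - py) ** 2
        if d2 < best[0]:
            # Cross-product sign gives left (+) / right (-) of the path.
            side = math.copysign(1.0, dx * (y - y0) - dy * (x - x0)) if d2 > 0 else 0.0
            best = (d2, s_start + t * seg_len, side * math.sqrt(d2))
        s_start += seg_len
    return best[1], best[2]
```

In such a frame, lane keeping and lane changing reduce to simple bounds on the lateral offset d, which is what makes the downstream trajectory generation tractable.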
Game Theoretic Decision Making by Actively Learning Human Intentions Applied on Autonomous Driving
The ability to estimate human intentions and interact with human drivers
intelligently is crucial for autonomous vehicles to successfully achieve their
objectives. In this paper, we propose a game theoretic planning algorithm that
models human opponents with an iterative reasoning framework and estimates
human latent cognitive states through probabilistic inference and active
learning. By modeling the interaction as a partially observable Markov decision
process with adaptive state and action spaces, our algorithm is able to
accomplish real-time lane changing tasks in a realistic driving simulator. We
compare our algorithm's lane changing performance in dense traffic with a
state-of-the-art autonomous lane changing algorithm to show the advantage of
iterative reasoning and active learning in avoiding overly conservative
behaviors while successfully achieving the driving objective.
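The probabilistic inference over latent cognitive states described above can be illustrated with a simple Bayesian belief update over a latent driver type, where each observed action re-weights the belief by a type-conditional likelihood. The types and likelihood values here are hypothetical:

```python
def update_belief(belief, likelihoods):
    """One Bayesian filtering step over latent driver types.

    belief: dict mapping type -> prior probability
    likelihoods: dict mapping type -> P(observed action | type)
    Returns the normalized posterior.
    """
    posterior = {t: belief[t] * likelihoods[t] for t in belief}
    z = sum(posterior.values())
    return {t: p / z for t, p in posterior.items()}

# Hypothetical example: the other vehicle decelerates when the ego signals
# a lane change -- evidence for a cooperative rather than aggressive driver.
belief = {"aggressive": 0.5, "cooperative": 0.5}
likelihood_yield = {"aggressive": 0.2, "cooperative": 0.8}
belief = update_belief(belief, likelihood_yield)
```

Active learning enters when the ego deliberately picks probing actions (e.g. a small lateral nudge) whose outcomes separate the hypotheses fastest, sharpening this posterior before committing to the lane change.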
Deep Reinforcement Learning and Game Theoretic Monte Carlo Decision Process for Safe and Efficient Lane Change Maneuver and Speed Management
Predicting the states of the surrounding traffic is one of the major problems in automated driving. Maneuvers such as lane change, merge, and exit management can pose challenges in the absence of intervehicular communication and can benefit from driver behavior prediction. Predicting the motion of surrounding vehicles and planning trajectories must be computationally efficient for real-time implementation. This dissertation presents a decision process model for real-time automated lane change and speed management in highway and urban traffic. In lane change and merge maneuvers, it is important to know how neighboring vehicles will act in the imminent future. Human driver models, probabilistic approaches, rule-based techniques, and machine learning approaches have addressed this problem only partially, as they do not focus on the behavioral features of the vehicles. The main goal of this research is to develop a fast algorithm that predicts the future states of the neighboring vehicles, runs a fast decision process, and learns the regret and reward associated with the executed decisions. The presented algorithm is developed based on level-K game theory to model and predict the interaction between the vehicles. Using deep reinforcement learning, the algorithm encodes and memorizes past experiences that are recurrently used to reduce computation and speed up motion planning. We also use Monte Carlo Tree Search (MCTS), an effective tool widely employed for fast planning in complex and dynamic game environments. This development leverages computational power efficiently and shows promising outcomes for maneuver planning and for predicting the environment's dynamics. In the absence of traffic connectivity, whether due to a passenger's choice of privacy or a vehicle's lack of technology, this development can be extended and employed in automated vehicles for real-world, practical applications
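Level-K reasoning, the modeling device named above, predicts a vehicle's action by assuming it best-responds to level-(K-1) models of its neighbors, bottoming out at a non-strategic level-0 policy. A toy sketch over a two-action merge interaction, with hypothetical payoffs:

```python
def level_k_action(k, payoff, actions, level0_action):
    """Return the level-k action in a symmetric two-player game.

    payoff[(a, b)]: payoff to a player choosing a while the other chooses b.
    A level-k player best-responds to the level-(k-1) action of the other;
    level 0 plays a fixed, non-strategic baseline.
    """
    if k == 0:
        return level0_action
    opponent = level_k_action(k - 1, payoff, actions, level0_action)
    return max(actions, key=lambda a: payoff[(a, opponent)])

# Hypothetical merge game: both vehicles going at once is costly,
# yielding unnecessarily wastes a little time.
actions = ("go", "yield")
payoff = {("go", "go"): -10, ("go", "yield"): 5,
          ("yield", "go"): 0, ("yield", "yield"): 1}
```

With an aggressive level-0 baseline of "go", a level-1 driver yields to avoid the clash, and a level-2 driver exploits that by going, which is the kind of alternating-depth reasoning the dissertation's predictor encodes.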
Game Theoretic Model Predictive Control for Autonomous Driving
This study presents two closely related solutions to autonomous vehicle control problems in highway driving scenarios using game theory and model predictive control. We first develop a game theoretic four-stage model predictive controller (GT4SMPC). The controller is responsible for both the longitudinal and lateral movements of the Subject Vehicle (SV). It includes a Stackelberg game as a high-level controller and a model predictive controller (MPC) as a low-level one. Specifically, GT4SMPC constantly establishes and solves games corresponding to multiple gaps in front of multiple candidate vehicles (GCV) when the SV is interacting with them by signaling a lane change intention through the turn signal or a small lateral movement. The SV's payoff is the negative of the MPC's cost function, which ensures a strong connection between the game and the controller and makes the solution of the game more likely to be achieved by a hybrid MPC (HMPC). A GCV's payoff is a linear combination of a speed payoff, a headway payoff, and an acceleration payoff. We use a decreasing acceleration model to generate our prediction of the target vehicle's (TV) future motion, which is utilized both in defining the TV's payoffs over the prediction horizon in the game and as the reference of the MPC. Solving the games gives the optimal gap and the TV. At the low level, the lane change process is divided into four stages: traveling in the current lane, leaving the current lane, crossing the lane marking, and traveling in the target lane. This division identifies the time at which the SV should initiate the actual lateral movement for the lateral controller and specifies the constraints the HMPC should deal with at each step of the MPC prediction horizon. The four-stage HMPC then controls the SV's actual longitudinal motion and executes the lane change at the right moment. Simulations showed that GT4SMPC is able to intelligently drive the SV into the selected gap and accomplish both discretionary lane change (DLC) and mandatory lane change (MLC) in dynamic situations.
Human-in-the-loop driving simulation indicated that GT4SMPC can decently control the SV to complete lane changes in the presence of human drivers. Second, we propose a differential game theoretic model predictive controller (DGTMPC) to address the drawbacks of GT4SMPC. In GT4SMPC, the games are defined as table games, meaning each player has only a limited number of choices in a specific game, and those choices remain fixed during the prediction horizon. In addition, we assume a known model for traffic vehicles, but in reality drivers' preferences are partly unknown. To allow the TV to make multiple decisions within the prediction horizon and to measure the TV's driving style online, the high level of the hierarchical DGTMPC is a two-player differential lane-change Stackelberg game. We assume each player uses an MPC to control its motion and that the optimal solution of the leader's MPC depends on the solution of the follower. Therefore, we convert this differential game into a bi-level optimization problem and solve it with the branch and bound algorithm. Besides the game, we propose an inverse model predictive control (IMPC) algorithm to estimate the MPC weights of other drivers online from surrounding vehicles' real-time behavior, assuming they are controlled by MPCs as well. The estimation results contribute to a more appropriate solution of the game against a driver of a specific type. The solution of the algorithm indicates the future motion of the TV, which can be used as the reference for the low-level controller. The low-level HMPC controls both the longitudinal motion of the SV and its real-time lane decision. Simulations showed that the DGTMPC can accurately identify the weights of traffic vehicles' MPC cost functions and behave intelligently during the interaction. Comparison with a level-k controller indicates DGTMPC's superior performance.
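The table-game Stackelberg solve used in GT4SMPC can be illustrated by plain enumeration: for each committed leader action, assume the follower plays its best response, then pick the leader action whose induced outcome pays best. The payoffs below are hypothetical stand-ins for the MPC-derived costs:

```python
def stackelberg(leader_actions, follower_actions, u_leader, u_follower):
    """Solve a finite Stackelberg game by enumeration.

    u_leader(l, f) and u_follower(l, f) are payoff functions.
    The leader commits first; the follower observes and best-responds.
    Returns the (leader_action, follower_action) equilibrium pair.
    """
    best = None
    for a_l in leader_actions:
        # The follower observes a_l and plays its best response.
        a_f = max(follower_actions, key=lambda f: u_follower(a_l, f))
        if best is None or u_leader(a_l, a_f) > u_leader(*best):
            best = (a_l, a_f)
    return best

# Hypothetical lane-change game: ego leads ("change"/"keep"), the lag
# vehicle follows ("yield"/"block"); numbers stand in for MPC costs.
u_l = lambda l, f: {("change", "yield"): 5, ("change", "block"): -10,
                    ("keep", "yield"): 2, ("keep", "block"): 1}[(l, f)]
u_f = lambda l, f: {("change", "yield"): 2, ("change", "block"): -5,
                    ("keep", "yield"): 0, ("keep", "block"): 1}[(l, f)]
```

Because the follower blocks only when blocking is safe for it, the leader's enumeration can discover that committing to the lane change induces a yield, which is the interactive behavior the controllers above exploit.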