
    Safe Hybrid-Action Reinforcement Learning-Based Decision and Control for Discretionary Lane Change

    Autonomous lane change, a key feature of advanced driver-assistance systems, can enhance traffic efficiency and reduce the incidence of accidents. However, safe driving of autonomous vehicles remains challenging in complex environments, and how to perform safe and appropriate lane changes is an active research topic in autonomous driving. To date, few papers have considered the safety of reinforcement learning in autonomous lane-change scenarios. We introduce safe hybrid-action reinforcement learning into discretionary lane change for the first time and propose the Parameterized Soft Actor-Critic with PID Lagrangian (PASAC-PIDLag) algorithm. Furthermore, we conduct a comparative analysis against Parameterized Soft Actor-Critic (PASAC), the unconstrained (unsafe) version of PASAC-PIDLag. Both algorithms are used to train the lane-change strategy of autonomous vehicles, outputting a discrete lane-change decision and a continuous longitudinal vehicle acceleration. Our simulation results indicate that at a traffic density of 15 vehicles per kilometer (15 veh/km), PASAC-PIDLag exhibits superior safety with a collision rate of 0%, outperforming PASAC, which has a collision rate of 1%. Generalization assessments reveal that at low traffic densities both algorithms attain a 0% collision rate, while under high traffic density PASAC-PIDLag surpasses PASAC in both safety and optimality.
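
    The abstract names the PID Lagrangian mechanism without restating its update rule. A minimal sketch of that mechanism, following Stooke et al. (2020): the constraint violation of the episode cost is fed through a PID controller to produce the Lagrange multiplier that weights the safety-cost term in the actor loss. The gains and cost limit below are hypothetical placeholders, not the paper's tuned values.

```python
class PIDLagrangian:
    """PID-controlled Lagrange multiplier for a cost constraint,
    in the style of Stooke et al. (2020). Gains and cost limit
    are illustrative placeholders, not values from the paper."""

    def __init__(self, kp=0.05, ki=0.005, kd=0.1, cost_limit=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.cost_limit = cost_limit
        self.integral = 0.0    # accumulated constraint violation
        self.prev_cost = 0.0   # for the derivative term

    def update(self, episode_cost):
        """Return the non-negative multiplier for the safety-cost term."""
        error = episode_cost - self.cost_limit  # violation is the PID error
        self.integral = max(0.0, self.integral + error)
        derivative = max(0.0, episode_cost - self.prev_cost)
        self.prev_cost = episode_cost
        return max(0.0, self.kp * error
                        + self.ki * self.integral
                        + self.kd * derivative)
```

    The returned multiplier would then scale a collision-cost term added to the actor objective, letting the policy trade reward against constraint satisfaction adaptively rather than with a fixed penalty weight.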

    Decision-making for Autonomous Vehicles on Highway: Deep Reinforcement Learning with Continuous Action Horizon

    A decision-making strategy for autonomous vehicles describes a sequence of driving maneuvers to achieve a certain navigational mission. This paper utilizes deep reinforcement learning (DRL) to address the continuous-horizon decision-making problem on the highway. First, the vehicle kinematics and the freeway driving scenario are introduced; the objective of the ego automated vehicle is to execute an efficient and smooth policy without collision. Then, the applied algorithm, proximal policy optimization (PPO)-enhanced DRL, is described. By overcoming the challenges of slow training and sample inefficiency, this algorithm achieves high learning efficiency and excellent control performance. Finally, the PPO-DRL-based decision-making strategy is evaluated from multiple perspectives, including optimality, learning efficiency, and adaptability. Its potential for online application is discussed by applying it to similar driving scenarios.
    Comment: 9 pages, 10 figures
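
    The abstract invokes PPO without restating its objective. For reference, a minimal PyTorch sketch of the standard clipped surrogate loss from Schulman et al. (2017), which is presumably the core of the PPO-enhanced strategy here; the tensor interface and the 0.2 clip ratio are conventional defaults, not values from the paper.

```python
import torch

def ppo_clip_loss(log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Clipped surrogate objective of PPO (Schulman et al., 2017).
    Arguments are 1-D tensors over sampled timesteps."""
    ratio = torch.exp(log_probs - old_log_probs)  # pi_new / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Pessimistic (minimum) bound, negated so an optimizer can minimize it.
    return -torch.min(unclipped, clipped).mean()
```

    The clipping is what gives PPO its sample efficiency relative to vanilla policy gradient: the same batch can be reused for several gradient steps because the ratio term cannot push the policy arbitrarily far from the data-collecting policy.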

    Behavior planning for automated highway driving

    This work deals with components of an automated driving system for highways, focusing on lane-change behavior planning. It presents a variety of algorithms from a modular system aimed at safe and comfortable driving. A major contribution of this work is a method for analyzing traffic scenes in a spatio-temporal, curvilinear coordinate system. The results of this analysis are then used to generate lane-change trajectories. A total of three approaches with increasing levels of complexity and capability are compared. The most advanced approach formulates the problem as a linear-quadratic cooperative game and accounts for the inherently uncertain and multimodal nature of trajectory predictions for surrounding road users. Evaluations on real data show that the developed algorithms can be integrated into current-generation automated driving software systems while fulfilling runtime constraints.
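
    A curvilinear (Frenet-style) frame expresses each position as arc length s along the lane and signed lateral offset d, which is what makes lane-change trajectories and gap reasoning tractable. A minimal sketch of the Cartesian-to-curvilinear projection under a simplifying assumption (nearest-vertex projection onto a densely sampled centerline polyline); the thesis's actual construction is not given in the abstract.

```python
import numpy as np

def to_frenet(path_xy, point_xy):
    """Project a Cartesian point onto a curvilinear frame defined by a
    densely sampled centerline polyline. Returns (s, d): arc length
    along the path and signed lateral offset. Illustrative only; a
    production system would project onto a smooth reference curve."""
    diffs = np.diff(path_xy, axis=0)
    seg_len = np.hypot(diffs[:, 0], diffs[:, 1])
    s_cum = np.concatenate(([0.0], np.cumsum(seg_len)))
    dists = np.hypot(*(path_xy - point_xy).T)   # distance to each vertex
    i = min(int(np.argmin(dists)), len(diffs) - 1)
    tangent = diffs[i] / seg_len[i]
    rel = point_xy - path_xy[i]
    s = s_cum[i] + rel @ tangent                      # longitudinal coordinate
    d = rel @ np.array([-tangent[1], tangent[0]])     # signed lateral offset
    return s, d

path = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
print(to_frenet(path, np.array([1.5, 0.3])))  # approximately (1.5, 0.3)
```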

    Game Theoretic Decision Making by Actively Learning Human Intentions Applied on Autonomous Driving

    The ability to estimate human intentions and interact with human drivers intelligently is crucial for autonomous vehicles to achieve their objectives. In this paper, we propose a game-theoretic planning algorithm that models human opponents with an iterative reasoning framework and estimates human latent cognitive states through probabilistic inference and active learning. By modeling the interaction as a partially observable Markov decision process with adaptive state and action spaces, our algorithm accomplishes real-time lane-changing tasks in a realistic driving simulator. We compare our algorithm's lane-changing performance in dense traffic with a state-of-the-art autonomous lane-changing algorithm to show the advantage of iterative reasoning and active learning in avoiding overly conservative behaviors while achieving the driving objective.
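
    Estimating a latent cognitive state by probabilistic inference typically reduces to a Bayesian belief update over candidate driver types after each observed action. A minimal sketch under that reading; the type set and likelihood interface are illustrative assumptions, not the paper's actual inference machinery.

```python
import numpy as np

def update_belief(belief, likelihoods):
    """Bayesian update of a belief over latent driver types (e.g.,
    reasoning levels in an iterative-reasoning model).
    likelihoods[k] = P(observed action | type k)."""
    posterior = belief * likelihoods
    return posterior / posterior.sum()

# Example: three candidate reasoning levels, uniform prior, observed
# action most consistent with level 1.
belief = np.array([1/3, 1/3, 1/3])
belief = update_belief(belief, np.array([0.1, 0.7, 0.2]))
print(belief)  # probability mass shifts toward level 1
```

    Active learning would then choose ego actions whose likely responses best discriminate between the remaining candidate types, rather than waiting passively for informative observations.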

    Deep Reinforcement Learning and Game Theoretic Monte Carlo Decision Process for Safe and Efficient Lane Change Maneuver and Speed Management

    Predicting the states of the surrounding traffic is one of the major problems in automated driving. Maneuvers such as lane change, merge, and exit management can pose challenges in the absence of inter-vehicular communication and can benefit from driver-behavior prediction. Predicting the motion of surrounding vehicles and planning trajectories must be computationally efficient for real-time implementation. This dissertation presents a decision-process model for real-time automated lane change and speed management in highway and urban traffic. In lane-change and merge maneuvers, it is important to know how neighboring vehicles will act in the imminent future. Human driver models, probabilistic approaches, rule-based techniques, and machine-learning approaches have addressed this problem only partially, as they do not focus on the behavioral features of the vehicles. The main goal of this research is to develop a fast algorithm that predicts the future states of the neighboring vehicles, runs a fast decision process, and learns from the regret and reward of executed decisions. The presented algorithm is built on level-K game theory to model and predict the interaction between vehicles. Using deep reinforcement learning, it encodes and memorizes past experiences, which are recurrently reused to reduce computation and speed up motion planning. We also use Monte Carlo Tree Search (MCTS), an effective tool for fast planning in complex and dynamic game environments. This development leverages computational resources efficiently and shows promising outcomes for maneuver planning and for predicting the environment's dynamics. In the absence of traffic connectivity, whether due to the passenger's choice of privacy or the vehicle's lack of technology, this development can be extended and employed in automated vehicles for real-world, practical applications.
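
    MCTS grows a search tree by repeatedly selecting moves that trade off estimated value against visit count. A minimal sketch of the standard UCT selection rule (Kocsis and Szepesvari, 2006) that such a planner applies at each tree node; the exploration constant and the children interface are illustrative choices, not taken from the dissertation.

```python
import math

def uct_score(value_sum, visits, parent_visits, c=1.4):
    """Upper Confidence Bound for Trees: balances a move's average
    return (exploitation) against how rarely it has been tried
    (exploration). c is the usual tunable exploration constant."""
    if visits == 0:
        return float("inf")  # always expand untried moves first
    exploit = value_sum / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children, parent_visits):
    """Pick the child maximizing UCT; children maps move -> (value_sum, visits)."""
    return max(children, key=lambda m: uct_score(*children[m], parent_visits))

children = {"keep_lane": (5.0, 10), "change_left": (3.0, 4)}
print(select_child(children, parent_visits=14))  # favors the less-tried move
```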

    Game Theoretic Model Predictive Control for Autonomous Driving

    This study presents two closely related solutions to autonomous vehicle control problems in highway driving scenarios using game theory and model predictive control. We first develop a game-theoretic four-stage model predictive controller (GT4SMPC). The controller is responsible for both the longitudinal and lateral movements of the subject vehicle (SV). It includes a Stackelberg game as a high-level controller and a model predictive controller (MPC) as a low-level one. Specifically, GT4SMPC constantly establishes and solves games corresponding to multiple gaps in front of multiple candidate vehicles (GCV) while the SV interacts with them, either by signaling a lane-change intention through the turn light or by a small lateral movement. The SV's payoff is the negative of the MPC's cost function, which ensures a strong connection between the game and the controller and makes the game's solution more likely to be realized by a hybrid MPC (HMPC). The GCV's payoff is a linear combination of speed, headway, and acceleration payoffs. A decreasing-acceleration model generates the prediction of the target vehicle's (TV) future motion, which is used both to define the TV's payoffs over the prediction horizon and as the reference for the MPC. Solving the games yields the optimal gap and the target vehicle. At the low level, the lane-change process is divided into four stages: traveling in the current lane, leaving the current lane, crossing the lane marking, and traveling in the target lane. This division identifies the time at which the SV should initiate the actual lateral movement and specifies the constraints the HMPC must handle at each step of the prediction horizon. The four-stage HMPC then controls the SV's longitudinal motion and executes the lane change at the right moment. Simulations showed that GT4SMPC can intelligently drive the SV into the selected gap and accomplish both discretionary lane changes (DLC) and mandatory lane changes (MLC) in dynamic situations. Human-in-the-loop driving simulation indicated that GT4SMPC can control the SV to complete lane changes in the presence of human drivers.
    Second, we propose a differential game-theoretic model predictive controller (DGTMPC) to address the drawbacks of GT4SMPC. In GT4SMPC the games are table games, meaning each player has only a limited set of choices for a given game, and those choices remain fixed over the prediction horizon. In addition, GT4SMPC assumes a known model for traffic vehicles, whereas in reality drivers' preferences are partly unknown. DGTMPC allows the TV to make multiple decisions within the prediction horizon and measures the TV's driving style online. The high level of the hierarchical DGTMPC is a two-player differential lane-change Stackelberg game. We assume each player uses an MPC to control its motion, so the optimal solution of the leader's MPC depends on the follower's solution. We therefore convert this differential game into a bi-level optimization problem and solve it with a branch-and-bound algorithm. Besides the game, we propose an inverse model predictive control (IMPC) algorithm that estimates the MPC weights of other drivers online from surrounding vehicles' real-time behavior, assuming they are controlled by MPCs as well. The estimates yield a more appropriate solution to the game against a driver of a specific type. The solution of the algorithm indicates the future motion of the TV, which serves as the reference for the low-level controller. The low-level HMPC controls both the SV's longitudinal motion and its real-time lane decision. Simulations showed that DGTMPC can accurately identify the weights of traffic vehicles' MPC cost functions and behaves intelligently during the interaction. Comparison with a level-k controller indicates DGTMPC's superior performance.
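
    The table Stackelberg games in GT4SMPC admit a simple enumerative solution: for each leader action, compute the follower's best response, then pick the leader action whose induced outcome maximizes the leader's payoff. A minimal sketch with placeholder payoff matrices; the paper's actual payoffs come from MPC cost functions and speed/headway/acceleration terms.

```python
import numpy as np

def solve_stackelberg(leader_payoff, follower_payoff):
    """Enumerative solution of a two-player table Stackelberg game.
    leader_payoff[i, j] and follower_payoff[i, j] give each player's
    payoff when the leader plays i and the follower plays j."""
    best_i, best_val = None, -np.inf
    for i in range(leader_payoff.shape[0]):
        j = int(np.argmax(follower_payoff[i]))  # follower's best response to i
        if leader_payoff[i, j] > best_val:
            best_i, best_val = i, leader_payoff[i, j]
    return best_i, best_val

# Example: 2 leader actions (e.g., yield / proceed) x 3 follower actions.
L = np.array([[3.0, 1.0, 0.0],
              [2.0, 4.0, 1.0]])
F = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 1.0]])
print(solve_stackelberg(L, F))  # leader anticipates the follower's reply
```

    The bi-level optimization in DGTMPC generalizes exactly this leader-anticipates-follower structure from a finite table to continuous MPC trajectories, which is why branch and bound replaces plain enumeration there.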