3,079 research outputs found
Deep Reinforcement Learning-based Emergency Lane Change Path Optimization for Autonomous Vehicles
Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Mechanical and Aerospace Engineering, College of Engineering, 2021.8.
The emergency lane change is a risk in itself because it is made instantaneously in an emergency, such as a sudden stop of the vehicle ahead in the driving lane. The optimization of the lane change trajectory is therefore an essential research area for autonomous vehicles. This research proposes a path optimization method for the emergency lane change of autonomous vehicles based on deep reinforcement learning (DRL). The algorithm is developed with a focus on fast and safe avoidance behavior and lane changing in an emergency.
As the first step of the algorithm development, a simulation environment was established. IPG CarMaker was selected for reliable vehicle dynamics simulation and for constructing the driving scenarios used in reinforcement learning. This program is highly reliable and can reproduce vehicle behavior close to that of a real vehicle. In this research, the simulation used the Hyundai i30-PDe full-car model. As the simulator for DRL and vehicle control, MATLAB Simulink was selected because it encompasses control, measurement, and artificial intelligence. By connecting the two simulators, the emergency lane change trajectory is optimized based on DRL.
The vehicle lane change trajectory is modeled as a third-order polynomial. The start and end points of the lane change are set, and the polynomial coefficients are expressed as functions of the lane change distance. To optimize these coefficients, a DRL architecture is constructed: twelve kinds of driving-environment data form the observation space, and the lane change distance, the free variable of the polynomial, is the output of the action space. The reward space is designed to maximize learning ability. Dynamic and static rewards and penalties are given at each simulation time step, so that optimization proceeds in the direction that maximizes the accumulated reward. A Deep Deterministic Policy Gradient (DDPG) agent is used as the optimization algorithm.
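The boundary conditions pin the cubic down once the lane change distance is chosen. Below is a minimal sketch, assuming the common choice of zero lateral velocity at both endpoints; the thesis's exact parameterization is not given, so the function and its arguments are illustrative:

```python
import numpy as np

def lane_change_trajectory(lane_width, lc_distance, n=50):
    """Cubic lateral trajectory y(x) for a lane change (illustrative).

    Assumed boundary conditions: y(0) = 0, y'(0) = 0 (start centred and
    heading straight) and y(D) = w, y'(D) = 0 (end centred in the target
    lane). Solving these four conditions for y = a3*x^3 + a2*x^2 gives
    coefficients that depend only on the lane change distance D and the
    lane width w -- which is why the DRL agent can act through D alone.
    """
    D, w = lc_distance, lane_width
    a3 = -2.0 * w / D**3
    a2 = 3.0 * w / D**2
    x = np.linspace(0.0, D, n)
    y = a3 * x**3 + a2 * x**2
    return x, y
```

Shortening `lc_distance` produces a sharper, higher-lateral-acceleration maneuver; lengthening it produces a gentler one, which is the trade-off the reward design has to balance.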
An algorithm is then developed for driving the vehicle in the dynamic simulation program. First, a decision-making algorithm determines when, at what velocity, and in which direction the vehicle should change lanes in an emergency. By estimating the maximum tire-road friction coefficient in real time, the minimum distance required for the vehicle to stop is calculated to determine the risk of a longitudinal collision with the vehicle ahead. In addition, using Gipps' safety distance formula, an algorithm detects a possible collision with a vehicle approaching in the target lane and decides whether to overtake it and merge ahead, or to fall back and merge behind it. Based on these results, the final lane change decision-making algorithm judges the collision risk and safety of the left and right lanes.
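The two collision checks can be sketched as follows. The thesis does not state its exact Gipps formulation, so the sketch uses one common simplified safe-gap form, and all parameter names are illustrative assumptions:

```python
import math

G = 9.81  # gravitational acceleration [m/s^2]

def min_stopping_distance(v, mu):
    """Minimum stopping distance [m] from speed v [m/s], assuming the
    maximum braking deceleration is mu * g for an estimated tire-road
    friction coefficient mu (longitudinal collision-risk check)."""
    return v**2 / (2.0 * mu * G)

def gipps_safe_gap(v_follow, v_lead, b_follow, b_lead, tau):
    """Simplified Gipps-style safe gap [m]: after a reaction time tau,
    the follower (braking at b_follow) must still be able to stop
    behind the braking leader (braking at b_lead)."""
    return (v_follow * tau
            + v_follow**2 / (2.0 * b_follow)
            - v_lead**2 / (2.0 * b_lead))
```

With these two quantities, a gap smaller than `min_stopping_distance` flags a longitudinal collision risk ahead, and the sign of `gap - gipps_safe_gap(...)` for a vehicle in the target lane suggests whether merging ahead of it or behind it is feasible.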
An integrated algorithm is developed that, depending on the situation, outputs either the emergency lane change trajectory from the reinforcement learning structure or a general driving trajectory, such as that of the lane keeping or adaptive cruise control algorithm, and drives the ego vehicle through an adaptive model predictive controller.
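A hypothetical sketch of the situation-dependent trajectory selection is below. The switching condition is an assumption for illustration, and the adaptive MPC that tracks the selected trajectory is omitted:

```python
from enum import Enum, auto

class Mode(Enum):
    LANE_KEEP = auto()               # general driving trajectory
    ACC = auto()                     # adaptive cruise control trajectory
    EMERGENCY_LANE_CHANGE = auto()   # DRL-optimized trajectory

def select_mode(gap_ahead, stop_distance, following_lead):
    """Illustrative arbitration: trigger the emergency lane change only
    when the gap to the lead vehicle falls below the minimum stopping
    distance; otherwise fall back to ACC (when following a vehicle) or
    plain lane keeping."""
    if gap_ahead < stop_distance:
        return Mode.EMERGENCY_LANE_CHANGE
    return Mode.ACC if following_lead else Mode.LANE_KEEP
```

The selected mode's trajectory would then be passed as the reference to the adaptive model predictive controller each control cycle.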
As the last step of the research, DRL was performed to optimize the developed emergency lane change path optimization algorithm. Learning is performed over 60,000 trial-and-error episodes to develop the algorithm for each driving situation, and performance is evaluated through test driving.
An emergency lane change carries risk in itself, because it is made instantaneously when an emergency occurs, such as a sudden stop of the preceding vehicle in the driving lane. If the steering is too slow, the vehicle cannot avoid a collision with the obstacle ahead. Conversely, if the steering is too fast, the forces between the vehicle and the road surface exceed the tire friction limit, degrading the vehicle's handling stability and causing accidents of a different kind, such as a spin or rollover. The optimization of the lane change path is therefore essential for autonomous vehicles to cope with emergencies.
This dissertation optimizes the emergency lane change path of autonomous vehicles based on deep reinforcement learning. The algorithm was developed with a focus on fast and safe avoidance maneuvers and lane changes when an emergency occurs, such as a sudden stop of the preceding vehicle or the appearance of an obstacle.
As the first step of the algorithm development, a simulation environment was established. IPG CarMaker was selected for reliable vehicle dynamics simulation and for building the driving scenarios used in reinforcement learning. It is a program with the high reliability required for actual industrial use and can analyze vehicle behavior similar to that of a real vehicle. In this study, the simulation was performed using the Hyundai i30-PDe model. MATLAB Simulink, which can encompass control, measurement, and artificial intelligence, was selected as the program for reinforcement learning and vehicle control. In this study, IPG CarMaker and MATLAB Simulink were coupled to optimize the emergency lane change trajectory based on deep reinforcement learning.
The lane change trajectory of the vehicle was modeled as a third-order polynomial. The start and end points of the lane change were set, and the polynomial coefficients were interpreted as functions of the lane change distance. To optimize the coefficients based on deep reinforcement learning, a reinforcement learning architecture was constructed. The observation space uses twelve kinds of driving-environment data, and the lane change distance, the variable of the cubic function, was selected as the output of the learning. A reward space was designed to maximize the learning ability: dynamic rewards, static rewards, dynamic penalties, and static penalties are given at every simulation step, so that learning proceeds in the direction that maximizes the sum of rewards. The Deep Deterministic Policy Gradient agent was used as the optimization algorithm.
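The four reward components (dynamic and static rewards and penalties, given at every simulation step) might be combined per step roughly as follows. The terms and weights are illustrative assumptions, not the thesis's actual reward function:

```python
def step_reward(lat_offset, lat_acc, collided, reached_target_lane):
    """Illustrative per-step reward mixing the four kinds of signal the
    abstract describes:
      dynamic reward  -- progress term, paid while tracking the path
      dynamic penalty -- paid continuously for harsh lateral dynamics
      static reward   -- one-off bonus when the lane change completes
      static penalty  -- one-off cost when a collision ends the episode
    """
    r = 0.0
    r += 1.0 - min(abs(lat_offset), 1.0)  # dynamic reward: stay near the planned path
    r -= 0.1 * abs(lat_acc)               # dynamic penalty: harsh lateral acceleration
    if reached_target_lane:
        r += 10.0                         # static reward: maneuver completed
    if collided:
        r -= 100.0                        # static penalty: collision
    return r
```

Summing `step_reward` over an episode gives the return the DDPG agent maximizes; the large static terms dominate the outcome while the dynamic terms shape how the maneuver is executed.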
Together with the reinforcement learning architecture, an algorithm was developed for driving the vehicle in the dynamics simulation program. First, a decision-making algorithm was developed that determines when, at what velocity, and in which direction the vehicle should change lanes in an emergency. By estimating the maximum friction coefficient between the tires and the road in real time and computing the minimum distance required for the vehicle to stop, the risk of collision with the preceding vehicle was judged. In addition, using Gipps' safety distance formula, an algorithm was developed that detects a possible collision with a vehicle approaching in the target lane and decides whether to overtake that vehicle and pass ahead of it, or to be overtaken and fall in behind it. On this basis, a decision-making algorithm for the final lane change was developed by judging the collision risk and safety of the left and right lanes.
An integrated algorithm was developed that outputs, according to the situation, either the emergency lane change trajectory from the constructed reinforcement learning structure or a general driving trajectory such as lane keeping or adaptive cruise control, and drives the vehicle through an adaptive model predictive controller.
As the last step of this research, deep reinforcement learning was performed to optimize the developed emergency lane change path generation algorithm. Through a total of 60,000 trial-and-error training episodes, an optimal lane change control algorithm was developed for each driving situation, and the optimal lane change trajectory for each situation was presented.
Chapter 1. Introduction 1
1.1. Research Background 1
1.2. Previous Research 5
1.3. Research Objective 9
1.4. Dissertation Overview 13
Chapter 2. Simulation Environment 19
2.1. Simulator 19
2.2. Scenario 26
Chapter 3. Methodology 28
3.1. Reinforcement learning 28
3.2. Deep reinforcement learning 30
3.3. Neural network 33
Chapter 4. DRL-enhanced Lane Change 36
4.1. Necessity of Evasive Steering Trajectory Optimization 36
4.2. Trajectory Planning 39
4.3. DRL Structure 42
4.3.1. Observation 43
4.3.2. Action 47
4.3.3. Reward 49
4.3.4. Neural Network Architecture 58
4.3.5. Deep Deterministic Policy Gradient (DDPG) Agent 60
Chapter 5. Autonomous Driving Algorithm Integration 64
5.1. Lane Change Decision Making 65
5.1.1. Longitudinal Collision Detection 66
5.1.2. Lateral Collision Detection 71
5.1.3. Lane Change Direction Decision 74
5.2. Path Planning 75
5.3. Vehicle Controller 76
5.4. Algorithm Integration 77
Chapter 6. Training & Results 79
Chapter 7. Conclusion 91
References 97
Abstract in Korean 104
Combining Subgoal Graphs with Reinforcement Learning to Build a Rational Pathfinder
In this paper, we present a hierarchical path planning framework called SG-RL
(subgoal graphs-reinforcement learning), to plan rational paths for agents
maneuvering in continuous and uncertain environments. By "rational", we mean (1) efficient path planning that eliminates first-move lags, and (2) collision-free and smooth paths that satisfy the agents' kinematic constraints. SG-RL works in a
two-level manner. At the first level, SG-RL uses a geometric path-planning
method, i.e., Simple Subgoal Graphs (SSG), to efficiently find optimal abstract
paths, also called subgoal sequences. At the second level, SG-RL uses an RL
method, i.e., Least-Squares Policy Iteration (LSPI), to learn near-optimal
motion-planning policies which can generate kinematically feasible and
collision-free trajectories between adjacent subgoals. The first advantage of
the proposed method is that SSG mitigates the sparse-reward and local-minimum-trap limitations for RL agents; thus, LSPI can be used to generate paths in
complex environments. The second advantage is that, when the environment
changes slightly (i.e., unexpected obstacles appearing), SG-RL does not need to
reconstruct subgoal graphs and replan subgoal sequences using SSG, since LSPI
can deal with uncertainties by exploiting its generalization ability to handle
changes in environments. Simulation experiments in representative scenarios
demonstrate that, compared with existing methods, SG-RL can work well on
large-scale maps with relatively low action-switching frequencies and shorter
path lengths, and SG-RL can deal with small changes in environments. We further
demonstrate that the design of reward functions and the types of training
environments are important factors for learning feasible policies.
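The two-level decomposition can be illustrated with a sketch: a shortest-path search over a subgoal graph stands in for SSG at the first level, and the returned sequence is what the second-level learned policy (LSPI in the paper) would track between adjacent subgoals. The graph format and function below are hypothetical, not the paper's implementation:

```python
import heapq

def subgoal_sequence(graph, start, goal):
    """First level of the SG-RL idea (sketch): Dijkstra over a subgoal
    graph given as {node: [(neighbor, cost), ...]}. The returned abstract
    path is the subgoal sequence; a learned motion policy then produces
    the kinematically feasible trajectory between adjacent subgoals."""
    dist, prev = {start: 0.0}, {}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, c in graph.get(u, []):
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    # reconstruct the subgoal sequence from goal back to start
    path, node = [], goal
    while node != start:
        path.append(node)
        node = prev[node]
    path.append(start)
    return path[::-1]
```

Because the low-level policy generalizes across local disturbances, small environment changes only require re-tracking between subgoals rather than recomputing this high-level sequence, which is the robustness claim the abstract makes.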
Safety of autonomous vehicles: A survey on Model-based vs. AI-based approaches
The growing advancements in Autonomous Vehicles (AVs) have emphasized the
critical need to prioritize the absolute safety of AV maneuvers, especially in
dynamic and unpredictable environments or situations. This objective becomes
even more challenging due to the uniqueness of every traffic
situation/condition. To cope with all these very constrained and complex
configurations, AVs must have appropriate control architectures with reliable
and real-time Risk Assessment and Management Strategies (RAMS). These targeted
RAMS must drastically reduce navigation risks. However, the lack of safety guarantees, which is one of the key challenges to be addressed, drastically limits the ambition to introduce AVs more broadly on our roads and restricts their use to very limited cases. Therefore, the focus and ambition of this paper is to survey research on autonomous vehicles while
ambition of this paper is to survey research on autonomous vehicles while
focusing on the important topic of safety guarantee of AVs. For this purpose,
it is proposed to review research on relevant methods and concepts defining an
overall control architecture for AVs, with an emphasis on the safety assessment
and decision-making systems composing these architectures. Moreover, this review highlights research that uses either model-based methods or AI-based approaches, emphasizing the strengths and weaknesses of each methodology and examining work that proposes comprehensive multi-modal designs combining model-based and AI approaches. The paper ends with a discussion of the methods used to guarantee AV safety, namely safety verification techniques and the standardization/generalization of safety frameworks.