611 research outputs found

    Autonomous Driving with Deep Reinforcement Learning

    The researcher developed an autonomous driving simulation by training an end-to-end policy model with deep reinforcement learning algorithms in the Gym-duckietown virtual environment. The control strategy of the model was designed for the lane-following task. Several reinforcement learning algorithms were implemented, and the SAC algorithm was chosen to train two models: a non-end-to-end model that takes information provided by the environment, such as speed, as input, and an end-to-end model that takes images captured by the agent's front camera as input. In this paper, the researcher compared the advantages and disadvantages of the two models using kinetic parameters in the environment, and conducted a series of experiments on the control strategy of the end-to-end model to explore how different environmental parameters or reward functions affect the models.

    Contents:
    CHAPTER 1 INTRODUCTION
      1.1 AUTONOMOUS DRIVING OVERVIEW
      1.2 RESEARCH QUESTIONS AND METHODS
        1.2.1 Research Questions
        1.2.2 Research Methods
      1.3 PAPER STRUCTURE
    CHAPTER 2 RESEARCH BACKGROUND
      2.1 RESEARCH STATUS
      2.2 THEORETICAL BASIS
        2.2.1 Machine Learning
        2.2.2 Deep Learning
        2.2.3 Reinforcement Learning
        2.2.4 Deep Reinforcement Learning
    CHAPTER 3 METHOD
      3.1 SIMULATION PLATFORM
      3.2 CONTROL TASK
      3.3 OBSERVATION SPACE
        3.3.1 Information as Observation (Non-end-to-end)
        3.3.2 Images as Observation (End-to-end)
      3.4 ACTION SPACE
      3.5 ALGORITHM
        3.5.1 Mathematical Foundations
        3.5.2 Policy Iteration
      3.6 POLICY ARCHITECTURE
        3.6.1 Network Architecture for Non-end-to-end Model
        3.6.2 Network Architecture for End-to-end Model
      3.7 REWARD SHAPING
        3.7.1 Calculation of Speed-based Reward Function
        3.7.2 Calculation of the reward function based on the position of the agent relative to the right lane
    CHAPTER 4 TRAINING PROCESS
      4.1 TRAINING PROCESS OF NON-END-TO-END MODEL
      4.2 TRAINING PROCESS OF END-TO-END MODEL
    CHAPTER 5 RESULT
    CHAPTER 6 TEST AND EVALUATION
      6.1 EVALUATION OF END-TO-END MODEL
        6.1.1 Speed Tests in Two Scenarios
        6.1.2 Lateral Deviation between the Agent and the Right Lane’s Centerline
        6.1.3 Orientation Deviation between the Agent and the Right Lane’s Centerline
      6.2 COMPARISON OF THE END-TO-END MODEL TO TWO BASELINES IN SIMULATION
        6.2.1 Comparison with Non-end-to-end Baseline
        6.2.2 Comparison with PD Baseline
      6.3 TEST THE EFFECT OF DIFFERENT WEIGHTS ASSIGNMENTS ON THE END-TO-END MODEL
    CHAPTER 7 CONCLUSION
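The abstract and outline mention a speed-based reward, a reward based on the agent's position relative to the right lane, and experiments with different weight assignments. The thesis's actual formulas are not given here, so the following is only a minimal sketch of how such a weighted reward could be combined. The function name, the normalisation constants `max_speed` and `max_lateral`, and the linear form of both terms are assumptions made for illustration.

```python
def lane_following_reward(speed, lateral_dev,
                          w_speed=1.0, w_lane=1.0,
                          max_speed=1.0, max_lateral=0.1):
    """Hypothetical weighted reward for lane following (not the thesis's formula).

    speed       -- forward speed of the agent (same unit as max_speed)
    lateral_dev -- signed distance from the right lane's centerline
    w_speed     -- weight of the speed term (assumed name)
    w_lane      -- weight of the lane-position term (assumed name)
    """
    # Speed term: 0 when standing still, 1 at or above max_speed.
    speed_term = min(max(speed / max_speed, 0.0), 1.0)
    # Lane term: 1 on the centerline, 0 at or beyond max_lateral away from it.
    lane_term = 1.0 - min(abs(lateral_dev) / max_lateral, 1.0)
    return w_speed * speed_term + w_lane * lane_term


# Changing the weights shifts the trade-off between driving fast and staying centred.
fast_but_off_centre = lane_following_reward(1.0, 0.08, w_speed=2.0, w_lane=1.0)
slow_but_centred = lane_following_reward(0.3, 0.0, w_speed=2.0, w_lane=1.0)
```

With a larger `w_speed`, the first situation scores higher than the second; raising `w_lane` instead would reverse the ranking, which is the kind of effect a weight-assignment experiment would probe.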

    Eigenmodes of a reflective twisted-nematic liquid-crystal cell

    The eigenmodes of a reflective twisted-nematic liquid-crystal (TN LC) cell are analyzed using the Jones matrix method. Two models are developed to describe the LC system in the low- and high-voltage regimes. The simulated transmission spectra of LC Fabry-Perot etalons are used to investigate the eigenmodes. In a general reflective TN LC cell, the eigenmodes are two orthogonal linear polarization states. Under certain conditions, these two linear polarization states approach the directions along and orthogonal to the bisector of the TN LC cell. This bisector effect is useful for reducing the operating voltage and enhancing the contrast ratio of LC display devices, and for eliminating mode coupling in tunable Fabry-Perot etalons.
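The claim that the eigenmodes of a reflective cell are two orthogonal linear polarization states can be checked numerically with a generic Jones-matrix model. The sketch below is not either of the two models from the paper. It assumes a uniformly twisted cell with no director tilt, normal incidence, no interface reflections, and an ideal mirror treated as the identity in a fixed lab frame. The cell is approximated by a stack of thin linear retarders, and all function names and parameter values are illustrative.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix for an angle theta in radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def single_pass_jones(twist, retardation, n_slabs=400):
    """Lab-frame Jones matrix for one pass through a uniformly twisted cell.

    twist       -- total twist angle in radians (entrance director along x)
    retardation -- total phase retardation 2*pi*delta_n*d/lambda in radians
    n_slabs     -- number of thin retarder slabs (assumed discretisation)
    """
    delta = retardation / n_slabs
    plate = np.diag([np.exp(-0.5j * delta), np.exp(0.5j * delta)])
    m = np.eye(2, dtype=complex)
    for k in range(n_slabs):
        theta = twist * (k + 0.5) / n_slabs   # director angle at the slab centre
        r = rotation(theta)
        m = r @ plate @ r.T @ m               # each slab matrix is symmetric
    return m

def reflective_eigenmodes(twist, retardation):
    """Round-trip Jones matrix and its eigen-polarizations.

    In a fixed lab frame the return pass through a reciprocal stack is the
    transpose of the forward pass, so the round trip is J = M^T M. J is
    symmetric and unitary with unit determinant, hence Re(J) is a multiple of
    the identity and the eigenvectors of J are the real eigenvectors of Im(J).
    """
    m = single_pass_jones(twist, retardation)
    j = m.T @ m
    _, vecs = np.linalg.eigh(j.imag)          # real, orthonormal columns
    angles = np.arctan2(vecs[1], vecs[0]) % np.pi
    return j, vecs, angles

# Illustrative values: 90-degree twist, retardation of 2 rad.
twist = np.pi / 2
j, vecs, angles = reflective_eigenmodes(twist, 2.0)
bisector = twist / 2                          # bisector of entrance and exit directors
print("eigenmode angles (deg):", np.degrees(angles))
print("bisector angle (deg):  ", np.degrees(bisector))
```

Because the eigenvectors are real, each eigenmode is a linear polarization, and the two are mutually orthogonal. Sweeping `twist` and `retardation` and comparing `angles` with `bisector` is one way to look for conditions under which the eigenmodes line up with the bisector.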