9 research outputs found

    Longitudinal Dynamic versus Kinematic Models for Car-Following Control Using Deep Reinforcement Learning

    Full text link
    The majority of current studies on autonomous vehicle control via deep reinforcement learning (DRL) utilize point-mass kinematic models, neglecting vehicle dynamics which includes acceleration delay and acceleration command dynamics. The acceleration delay, which results from sensing and actuation delays, results in delayed execution of the control inputs. The acceleration command dynamics dictates that the actual vehicle acceleration does not rise up to the desired command acceleration instantaneously due to dynamics. In this work, we investigate the feasibility of applying DRL controllers trained using vehicle kinematic models to more realistic driving control with vehicle dynamics. We consider a particular longitudinal car-following control, i.e., Adaptive Cruise Control (ACC), problem solved via DRL using a point-mass kinematic model. When such a controller is applied to car following with vehicle dynamics, we observe significantly degraded car-following performance. Therefore, we redesign the DRL framework to accommodate the acceleration delay and acceleration command dynamics by adding the delayed control inputs and the actual vehicle acceleration to the reinforcement learning environment state, respectively. The training results show that the redesigned DRL controller results in near-optimal control performance of car following with vehicle dynamics considered when compared with dynamic programming solutions.Comment: Accepted to 2019 IEEE Intelligent Transportation Systems Conferenc

    Explanation-Aware Experience Replay in Rule-Dense Environments

    Get PDF
    Human environments are often regulated by explicit and complex rulesets. Integrating Reinforcement Learning (RL) agents into such environments motivates the development of learning mechanisms that perform well in rule-dense and exception-ridden environments such as autonomous driving on regulated roads. In this letter, we propose a method for organising experience by means of partitioning the experience buffer into clusters labelled on a per-explanation basis. We present discrete and continuous navigation environments compatible with modular rulesets and 9 learning tasks. For environments with explainable rulesets, we convert rule-based explanations into case-based explanations by allocating state-transitions into clusters labelled with explanations. This allows us to sample experiences in a curricular and task-oriented manner, focusing on the rarity, importance, and meaning of events. We label this concept Explanation-Awareness (XA). We perform XA experience replay (XAER) with intra and inter-cluster prioritisation, and introduce XA-compatible versions of DQN, TD3, and SAC. Performance is consistently superior with XA versions of those algorithms, compared to traditional Prioritised Experience Replay baselines, indicating that explanation engineering can be used in lieu of reward engineering for environments with explainable features

    A Practical Guide to Multi-Objective Reinforcement Learning and Planning

    Get PDF
    Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods who wish to adopt a multi-objective perspective on their research, as well as practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems

    A practical guide to multi-objective reinforcement learning and planning

    Get PDF
    Real-world sequential decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods who wish to adopt a multi-objective perspective on their research, as well as practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems. © 2022, The Author(s)

    Dynamic-Occlusion-Aware Risk Identification for Autonomous Vehicles Using Hypergames

    Get PDF
    A particular challenge for both autonomous vehicles (AV) and human drivers is dealing with risk associated with dynamic occlusion, i.e., occlusion caused by other vehicles in traffic. In order to overcome this challenge, we use the theory of hypergames to develop a novel dynamic-occlusion risk measure (DOR). We use DOR to evaluate the safety of strategic planners, a type of AV behaviour planner that reasons over the assumptions other road users have of each other. We also present a method for augmenting naturalistic driving data to artificially generate occlusion situations. Combining our risk identification and occlusion generation methods, we are able to discover occlusion-caused collisions (OCC), which rarely occur in naturalistic driving data. Using our method we are able to increase the number of dynamic-occlusion situations in naturalistic data by a factor of 70, which allows us to increase the number of OCCs we can discover in naturalistic data by a factor of 40. We show that the generated OCCs are realistic and cover a diverse range of configurations. We then characterize the nature of OCCs at intersections by presenting an OCC taxonomy, which categorizes OCCs based on if they are left-turning or right-turning situations, and if they are reveal or tagging-on situations. Finally, in order to analyze the impact of collisions, we perform a severity analysis, where we find that the majority of OCCs result in high-impact collisions, demonstrating the need to evaluate AVs under occlusion situations before they can be released for commercial use
    corecore