
    Combining Planning and Deep Reinforcement Learning in Tactical Decision Making for Autonomous Driving

    Tactical decision making for autonomous driving is challenging due to the diversity of environments, the uncertainty in the sensor information, and the complex interaction with other road users. This paper introduces a general framework for tactical decision making, which combines the concepts of planning and learning, in the form of Monte Carlo tree search and deep reinforcement learning. The method is based on the AlphaGo Zero algorithm, which is extended to a domain with a continuous state space where self-play cannot be used. The framework is applied to two different highway driving cases in a simulated environment, and it is shown to perform better than a commonly used baseline method. The strength of combining planning and learning is also illustrated by a comparison to using the Monte Carlo tree search or the neural network policy separately.
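    The following sketch illustrates the kind of search the abstract describes: Monte Carlo tree search guided by a learned policy/value network, with the value estimate replacing self-play rollouts. The tactical action set, the network stub, and the simulator stub are hypothetical placeholders, not the authors' implementation.

```python
# Minimal AlphaGo Zero-style search sketch for a continuous-state driving task.
# policy_value() and step() are placeholder stubs (assumptions, not the paper's code).
import numpy as np

ACTIONS = ["keep_lane", "change_left", "change_right", "brake"]  # assumed tactical actions

def policy_value(state):
    """Placeholder network: returns a prior over actions and a value estimate."""
    return np.ones(len(ACTIONS)) / len(ACTIONS), 0.0

def step(state, action):
    """Placeholder simulator step: returns next state and immediate reward."""
    return state, 0.0

class Node:
    def __init__(self, state, prior):
        self.state, self.prior = state, prior
        self.children = {}                  # action index -> child Node
        self.N = np.zeros(len(ACTIONS))     # visit counts per action
        self.W = np.zeros(len(ACTIONS))     # accumulated returns per action

    def Q(self):
        return np.where(self.N > 0, self.W / np.maximum(self.N, 1), 0.0)

def puct(node, c=1.5):
    """PUCT selection: exploit Q, explore proportionally to the network prior."""
    priors, _ = policy_value(node.state)
    U = c * priors * np.sqrt(node.N.sum() + 1) / (1 + node.N)
    return int(np.argmax(node.Q() + U))

def search(root_state, n_iter=100, gamma=0.95):
    root = Node(root_state, 1.0)
    for _ in range(n_iter):
        node, path = root, []
        # Selection: descend until an unexpanded action is reached.
        while True:
            a = puct(node)
            path.append((node, a))
            if a not in node.children:
                break
            node = node.children[a]
        # Expansion + evaluation: bootstrap with the value network, which
        # replaces self-play rollouts in this continuous domain.
        next_state, reward = step(node.state, ACTIONS[a])
        priors, value = policy_value(next_state)
        node.children[a] = Node(next_state, priors[a])
        ret = reward + gamma * value
        # Backup the estimated return along the visited path.
        for parent, act in reversed(path):
            parent.N[act] += 1
            parent.W[act] += ret
            ret = gamma * ret
    return ACTIONS[int(np.argmax(root.N))]   # execute the most-visited action

print(search(root_state=np.zeros(4)))
```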

    Decentralized Cooperative Planning for Automated Vehicles with Hierarchical Monte Carlo Tree Search

    Today's automated vehicles lack the ability to cooperate implicitly with others. This work presents a Monte Carlo Tree Search (MCTS) based approach for decentralized cooperative planning using macro-actions for automated vehicles in heterogeneous environments. Based on cooperative modeling of other agents and Decoupled-UCT (a variant of MCTS), the algorithm evaluates the state-action values of each agent in a cooperative and decentralized manner, explicitly modeling the interdependence of actions between traffic participants. Macro-actions allow for temporal extension over multiple time steps and increase the effective search depth, requiring fewer iterations to plan over longer horizons. Without predefined policies for macro-actions, the algorithm simultaneously learns policies over and within macro-actions. The proposed method is evaluated under several conflict scenarios, showing that the algorithm can achieve effective cooperative planning with learned macro-actions in heterogeneous environments.
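    A minimal sketch of the Decoupled-UCT idea referenced above: at each node every agent keeps its own visit and value statistics and selects its macro-action with UCB1 independently, so the joint action is assembled in a decentralized way. The macro-action names, the fixed-length action sequences, and the random returns below are illustrative assumptions, not the paper's implementation.

```python
# Decoupled-UCT sketch with macro-actions (illustrative names and data).
import math, random

MACRO_ACTIONS = {                       # assumed temporally extended actions
    "merge_in":  ["accelerate", "change_left", "keep"],
    "yield":     ["decelerate", "keep", "keep"],
    "keep_lane": ["keep", "keep", "keep"],
}

class DecoupledNode:
    def __init__(self, n_agents):
        # Per-agent statistics over macro-actions (decoupled, not joint).
        self.N = [{m: 0 for m in MACRO_ACTIONS} for _ in range(n_agents)]
        self.Q = [{m: 0.0 for m in MACRO_ACTIONS} for _ in range(n_agents)]
        self.visits = 0

def select_joint_macro(node, c=1.4):
    """Each agent picks its macro-action by UCB1 over its own statistics."""
    joint = []
    for N_i, Q_i in zip(node.N, node.Q):
        def ucb(m):
            if N_i[m] == 0:
                return float("inf")     # try unvisited macro-actions first
            return Q_i[m] + c * math.sqrt(math.log(node.visits + 1) / N_i[m])
        joint.append(max(MACRO_ACTIONS, key=ucb))
    return joint

def update(node, joint_macro, returns):
    """Backup one simulated outcome: per-agent returns update per-agent stats."""
    node.visits += 1
    for i, m in enumerate(joint_macro):
        node.N[i][m] += 1
        node.Q[i][m] += (returns[i] - node.Q[i][m]) / node.N[i][m]

# Toy usage: two vehicles with random simulated returns.
node = DecoupledNode(n_agents=2)
for _ in range(200):
    joint = select_joint_macro(node)
    update(node, joint, returns=[random.random(), random.random()])
print(select_joint_macro(node))
```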

    Deep Reinforcement-Learning-based Driving Policy for Autonomous Road Vehicles

    In this work, the problem of path planning for an autonomous vehicle that moves on a freeway is considered. The most common approaches used to address this problem are based on optimal control methods, which make assumptions about the model of the environment and the system dynamics. In contrast, this work proposes the development of a driving policy based on reinforcement learning. In this way, the proposed driving policy makes minimal or no assumptions about the environment, since a priori knowledge about the system dynamics is not required. Driving scenarios where the road is occupied by both autonomous and manually driven vehicles are considered. To the best of our knowledge, this is one of the first approaches to propose a reinforcement learning driving policy for mixed driving environments. The derived reinforcement learning policy is, firstly, compared against an optimal policy derived via dynamic programming and, secondly, evaluated under realistic scenarios generated by the established SUMO microscopic traffic flow simulator. Finally, some initial results regarding the effect of autonomous vehicles' behavior on the overall traffic flow are presented.
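    The contrast the abstract draws can be illustrated with a toy example: value iteration (dynamic programming) needs the full transition model, while a model-free Q-learning policy only observes sampled transitions and rewards. The tiny MDP below is an illustrative stand-in, not the paper's freeway environment or SUMO setup.

```python
# Dynamic programming vs. model-free Q-learning on a toy MDP (illustrative only).
import numpy as np

n_states, n_actions, gamma = 5, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
R = rng.uniform(-1, 1, size=(n_states, n_actions))                # R[s, a]

# Dynamic programming (value iteration): requires the model P and R.
V = np.zeros(n_states)
for _ in range(500):
    V = np.max(R + gamma * P @ V, axis=1)
dp_policy = np.argmax(R + gamma * P @ V, axis=1)

# Model-free Q-learning: only observes sampled transitions and rewards.
Q = np.zeros((n_states, n_actions))
s, alpha, eps = 0, 0.1, 0.1
for _ in range(50_000):
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
    s2 = rng.choice(n_states, p=P[s, a])       # environment samples the next state
    Q[s, a] += alpha * (R[s, a] + gamma * np.max(Q[s2]) - Q[s, a])
    s = s2

print("DP policy:", dp_policy, "RL policy:", np.argmax(Q, axis=1))
```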

    Decision-Making in Autonomous Driving using Reinforcement Learning

    The main topic of this thesis is tactical decision-making for autonomous driving. An autonomous vehicle must be able to handle a diverse set of environments and traffic situations, which makes it hard to manually specify a suitable behavior for every possible scenario. Therefore, learning-based strategies are considered in this thesis, which introduces different approaches based on reinforcement learning (RL). A general decision-making agent, derived from the Deep Q-Network (DQN) algorithm, is proposed. With few modifications, this method can be applied to different driving environments, which is demonstrated for various simulated highway and intersection scenarios. A more sample-efficient agent can be obtained by incorporating more domain knowledge, which is explored by combining planning and learning in the form of Monte Carlo tree search and RL. In different highway scenarios, the combined method outperforms using either a planning or a learning-based strategy separately, while requiring an order of magnitude fewer training samples than the DQN method. A drawback of many learning-based approaches is that they create black-box solutions, which do not indicate the confidence of the agent's decisions. Therefore, the Ensemble Quantile Networks (EQN) method is introduced, which combines distributional RL with an ensemble approach to provide an estimate of both the aleatoric and the epistemic uncertainty of each decision. The results show that the EQN method can balance risk and time efficiency in different occluded intersection scenarios, while also identifying situations that the agent has not been trained for. Thereby, the agent can avoid making unfounded, potentially dangerous decisions outside of the training distribution. Finally, this thesis introduces a neural network architecture that is invariant to permutations of the order in which surrounding vehicles are listed. This architecture improves the sample efficiency of the agent by the factorial of the number of surrounding vehicles.
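    The permutation-invariant architecture mentioned at the end can be sketched as a shared per-vehicle encoder followed by a symmetric (max) pooling operation, so the output is unchanged when the list of surrounding vehicles is reordered. The feature sizes, weights, and action count below are assumptions for illustration, not the thesis network.

```python
# Permutation-invariant Q-network sketch: shared encoder + max pooling over vehicles.
import numpy as np

rng = np.random.default_rng(1)
d_veh, d_ego, d_hidden = 4, 3, 16               # per-vehicle and ego feature sizes (assumed)
W_enc = rng.normal(size=(d_veh, d_hidden))      # shared weights applied to every vehicle
W_out = rng.normal(size=(d_hidden + d_ego, 4))  # 4 tactical actions (assumed)

def q_values(ego, vehicles):
    """ego: (d_ego,), vehicles: (n, d_veh) in any order -> (4,) action values."""
    h = np.tanh(vehicles @ W_enc)               # same encoder for each vehicle
    pooled = h.max(axis=0)                      # symmetric pooling => order-invariant
    return np.concatenate([pooled, ego]) @ W_out

ego = rng.normal(size=d_ego)
vehicles = rng.normal(size=(5, d_veh))
shuffled = vehicles[rng.permutation(5)]
print(np.allclose(q_values(ego, vehicles), q_values(ego, shuffled)))  # True
```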

    Ecological IVIS design : using EID to develop a novel in-vehicle information system

    New in-vehicle information systems (IVIS) are emerging that purport to encourage more environmentally friendly or ‘green’ driving. Meanwhile, wider concerns about road safety and in-car distractions remain. The ‘Foot-LITE’ project is an effort to balance these issues, aimed at achieving safer and greener driving through real-time driving information, presented via an in-vehicle interface which facilitates the desired behaviours while avoiding negative consequences. One way of achieving this is to use ecological interface design (EID) techniques. This article presents part of the formative human-centred design process for developing the in-car display through a series of rapid prototyping studies comparing EID against conventional interface design principles. We focus primarily on the visual display, although some development of an ecological auditory display is also presented. The results of feedback from potential users as well as subject matter experts are discussed with respect to implications for future interface design in this field.

    Smart Master Production Schedule for the Supply Chain: A Conceptual Framework

    Risks arising from disruptions and unsustainable practices constantly push the supply chain to uncompetitive positions. A smart production planning and control process must successfully address both risks by reducing them, thereby strengthening supply chain (SC) resilience and its ability to survive in the long term. On the one hand, the antidisruptive potential and the inherent sustainability implications of the zero-defect manufacturing (ZDM) management model should be highlighted. On the other hand, the digitization and virtualization of processes by Industry 4.0 (I4.0) digital technologies, namely digital twin (DT) technology, enable new simulation and optimization methods, especially in combination with machine learning (ML) procedures. This paper reviews the state of the art and proposes a ZDM strategy-based conceptual framework that models, optimizes and simulates the master production schedule (MPS) problem to maximize service levels in SCs. This conceptual framework will serve as a starting point for developing new MPS optimization models and algorithms in supply chain 4.0 (SC4.0) environments.
    Funding: The research leading to these results received funding from the European Union H2020 Program under grant agreements No. 825631 "Zero-Defect Manufacturing Platform (ZDMP)" and No. 958205 "Industrial Data Services for Quality Control in Smart Manufacturing (i4Q)", and from Grant RTI2018-101344-B-I00 funded by MCIN/AEI/10.13039/501100011033 and by "ERDF A way of making Europe".
    Serrano-Ruiz, J. C.; Mula, J.; Poler, R. (2021). Smart Master Production Schedule for the Supply Chain: A Conceptual Framework. Computers 10(12): 1-24. https://doi.org/10.3390/computers10120156
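    As a rough illustration of the MPS optimization problem the framework targets, the toy model below chooses per-period production to maximize the service level (equivalently, minimize unmet demand) under a capacity limit and an inventory balance constraint. The demand data, capacity, and single-product simplification are assumptions for illustration, not the paper's model.

```python
# Toy master production schedule as a linear program (illustrative data only).
import numpy as np
from scipy.optimize import linprog

demand   = np.array([90, 120, 80, 170])    # units per period (assumed)
capacity = 110                              # production capacity per period (assumed)
T = len(demand)

# Variables per period t: x[t] production, I[t] end inventory, u[t] unmet demand.
# Objective: minimize total unmet demand (i.e., maximize the service level).
c = np.concatenate([np.zeros(T), np.zeros(T), np.ones(T)])

# Inventory balance: I[t] = I[t-1] + x[t] - demand[t] + u[t], with I[-1] = 0.
A_eq = np.zeros((T, 3 * T))
for t in range(T):
    A_eq[t, t] = 1                # + x[t]
    A_eq[t, T + t] = -1           # - I[t]
    A_eq[t, 2 * T + t] = 1        # + u[t]
    if t > 0:
        A_eq[t, T + t - 1] = 1    # + I[t-1]
b_eq = demand

bounds = [(0, capacity)] * T + [(0, None)] * T + [(0, None)] * T
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x, unmet = res.x[:T], res.x[2 * T:]
print("production plan:", np.round(x),
      "service level:", round(1 - unmet.sum() / demand.sum(), 3))
```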