Combining Planning and Deep Reinforcement Learning in Tactical Decision Making for Autonomous Driving
Tactical decision making for autonomous driving is challenging due to the
diversity of environments, the uncertainty in the sensor information, and the
complex interaction with other road users. This paper introduces a general
framework for tactical decision making, which combines the concepts of planning
and learning, in the form of Monte Carlo tree search and deep reinforcement
learning. The method is based on the AlphaGo Zero algorithm, which is extended
to a domain with a continuous state space where self-play cannot be used. The
framework is applied to two different highway driving cases in a simulated
environment and it is shown to perform better than a commonly used baseline
method. The strength of combining planning and learning is also illustrated by
a comparison to using the Monte Carlo tree search or the neural network policy
separately.
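The AlphaGo-Zero-style combination this abstract describes — a tree search guided by a learned policy prior and value estimate — centers on the PUCT selection rule. The sketch below is illustrative only, with a toy `Node` class standing in for the paper's implementation:

```python
import math

class Node:
    """Search-tree node holding AlphaGo-Zero-style edge statistics."""
    def __init__(self, prior):
        self.prior = prior        # P(s, a) from the policy network
        self.visits = 0           # N(s, a)
        self.value_sum = 0.0      # W(s, a), accumulated value estimates
        self.children = {}        # action -> Node

    def q(self):
        # Mean action value Q(s, a); zero before any visit
        return self.value_sum / self.visits if self.visits else 0.0

def puct_select(node, c_puct=1.0):
    """Pick the child maximizing Q(s,a) + c * P(s,a) * sqrt(N(s)) / (1 + N(s,a))."""
    total = sum(ch.visits for ch in node.children.values())
    best_action, best_score = None, -float("inf")
    for action, child in node.children.items():
        u = c_puct * child.prior * math.sqrt(total + 1) / (1 + child.visits)
        score = child.q() + u
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```

Before any simulations, the rule follows the network's prior; as visit counts grow, the exploration bonus shrinks and the search increasingly trusts its own value estimates.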
Decentralized Cooperative Planning for Automated Vehicles with Hierarchical Monte Carlo Tree Search
Today's automated vehicles lack the ability to cooperate implicitly with
others. This work presents a Monte Carlo Tree Search (MCTS) based approach for
decentralized cooperative planning using macro-actions for automated vehicles
in heterogeneous environments. Based on cooperative modeling of other agents
and Decoupled-UCT (a variant of MCTS), the algorithm evaluates the
state-action-values of each agent in a cooperative and decentralized manner,
explicitly modeling the interdependence of actions between traffic
participants. Macro-actions allow for temporal extension over multiple time
steps and increase the effective search depth, requiring fewer iterations to
plan over longer horizons. Without predefined policies for macro-actions, the
algorithm simultaneously learns policies over and within macro-actions. The
proposed method is evaluated under several conflict scenarios, showing that the
algorithm can achieve effective cooperative planning with learned macro-actions
in heterogeneous environments.
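The Decoupled-UCT variant underlying this approach keeps independent visit and value statistics per agent, so each agent selects its own action by UCB1 instead of searching the exponentially larger joint-action space. A minimal sketch, with a hypothetical `stats` layout that is not the paper's data structure:

```python
import math

def ucb1(value_sum, visits, parent_visits, c=1.4):
    """Standard UCB1 score; unvisited actions get infinite priority."""
    if visits == 0:
        return float("inf")
    return value_sum / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_joint_action(stats, parent_visits, c=1.4):
    """Decoupled selection: each agent independently maximizes UCB1.

    stats: {agent: {action: (value_sum, visits)}} -> {agent: chosen action}
    """
    joint = {}
    for agent, actions in stats.items():
        joint[agent] = max(
            actions,
            key=lambda a: ucb1(*actions[a], parent_visits, c),
        )
    return joint
```

The cost of one selection step is the sum, not the product, of the agents' action-set sizes, which is what makes the decentralized formulation tractable.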
Deep Reinforcement-Learning-based Driving Policy for Autonomous Road Vehicles
In this work the problem of path planning for an autonomous vehicle that
moves on a freeway is considered. The most common approaches that are used to
address this problem are based on optimal control methods, which make
assumptions about the model of the environment and the system dynamics. On the
contrary, this work proposes the development of a driving policy based on
reinforcement learning. In this way, the proposed driving policy makes minimal
or no assumptions about the environment, since a priori knowledge about the
system dynamics is not required. Driving scenarios where the road is occupied
both by autonomous and manual driving vehicles are considered. To the best of
our knowledge, this is one of the first approaches that propose a reinforcement
learning driving policy for mixed driving environments. The derived
reinforcement learning policy is first compared against an optimal policy
derived via dynamic programming, and its efficiency is then evaluated under
realistic scenarios generated by the established SUMO microscopic traffic
flow simulator. Finally, some initial results regarding the effect of
autonomous vehicles' behavior on the overall traffic flow are presented.
Comment: 19 pages. arXiv admin note: substantial text overlap with arXiv:1905.0904
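The dynamic-programming baseline mentioned in the abstract can be illustrated with standard value iteration on a finite MDP. The deterministic toy transition model below is an assumption for illustration, not the paper's freeway model:

```python
def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-6):
    """Compute the optimal state values of a finite MDP by value iteration.

    transition(s, a) -> next state (deterministic toy model)
    reward(s, a)     -> immediate reward
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman optimality backup over all actions
            best = max(reward(s, a) + gamma * V[transition(s, a)] for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V
```

The greedy policy with respect to the converged `V` is the optimal policy that a learned RL policy can then be benchmarked against.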
Decision-Making in Autonomous Driving using Reinforcement Learning
The main topic of this thesis is tactical decision-making for autonomous driving. An autonomous vehicle must be able to handle a diverse set of environments and traffic situations, which makes it hard to manually specify a suitable behavior for every possible scenario. Therefore, learning-based strategies are considered in this thesis, which introduces different approaches based on reinforcement learning (RL). A general decision-making agent, derived from the Deep Q-Network (DQN) algorithm, is proposed. With few modifications, this method can be applied to different driving environments, which is demonstrated for various simulated highway and intersection scenarios. A more sample-efficient agent can be obtained by incorporating more domain knowledge, which is explored by combining planning and learning in the form of Monte Carlo tree search and RL. In different highway scenarios, the combined method outperforms using either a planning or a learning-based strategy separately, while requiring an order of magnitude fewer training samples than the DQN method. A drawback of many learning-based approaches is that they create black-box solutions, which do not indicate the confidence of the agent's decisions. Therefore, the Ensemble Quantile Networks (EQN) method is introduced, which combines distributional RL with an ensemble approach to provide an estimate of both the aleatoric and the epistemic uncertainty of each decision. The results show that the EQN method can balance risk and time efficiency in different occluded intersection scenarios, while also identifying situations that the agent has not been trained for. Thereby, the agent can avoid making unfounded, potentially dangerous, decisions outside of the training distribution. Finally, this thesis introduces a neural network architecture that is invariant to permutations of the order in which surrounding vehicles are listed. This architecture improves the sample efficiency of the agent by the factorial of the number of surrounding vehicles.
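Permutation invariance of the kind this abstract describes is commonly obtained by applying a shared per-vehicle encoder and pooling the results with a symmetric operation such as summation, so the output cannot depend on listing order. The linear encoder below is a toy stand-in for a neural network, not the thesis architecture:

```python
def encode_vehicle(state):
    """Shared per-vehicle encoder; a linear toy in place of a learned network."""
    x, v = state  # e.g. relative position and relative speed
    return (2.0 * x + v, x - v)

def invariant_embedding(vehicle_states):
    """Sum-pool per-vehicle features: symmetric, hence order-invariant."""
    features = [encode_vehicle(s) for s in vehicle_states]
    return tuple(sum(f[i] for f in features) for i in range(2))
```

Because any of the n! orderings of n surrounding vehicles maps to the same embedding, the agent no longer needs training samples to learn that those orderings are equivalent.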
Ecological IVIS design : using EID to develop a novel in-vehicle information system
New in-vehicle information systems (IVIS) are emerging which purport to encourage more environment-friendly or ‘green’ driving. Meanwhile, wider concerns about road safety and in-car distractions remain. The ‘Foot-LITE’ project is an effort to balance these issues, aimed at achieving safer and greener driving through real-time driving information, presented via an in-vehicle interface which facilitates the desired behaviours while avoiding negative consequences. One way of achieving this is to use ecological interface design (EID) techniques. This article presents part of the formative human-centred design process for developing the in-car display through a series of rapid prototyping studies comparing EID against conventional interface design principles. We focus primarily on the visual display, although some development of an ecological auditory display is also presented. The results of feedback from potential users as well as subject matter experts are discussed with respect to implications for future interface design in this field.
Smart Master Production Schedule for the Supply Chain: A Conceptual Framework
Risks arising from the effect of disruptions and unsustainable practices constantly push the supply chain to uncompetitive positions. A smart production planning and control process must successfully address both risks by reducing them, thereby strengthening supply chain (SC) resilience and its ability to survive in the long term. On the one hand, the antidisruptive potential and the inherent sustainability implications of the zero-defect manufacturing (ZDM) management model should be highlighted. On the other hand, the digitization and virtualization of processes by Industry 4.0 (I4.0) digital technologies, namely digital twin (DT) technology, enable new simulation and optimization methods, especially in combination with machine learning (ML) procedures. This paper reviews the state of the art and proposes a ZDM strategy-based conceptual framework that models, optimizes and simulates the master production schedule (MPS) problem to maximize service levels in SCs. This conceptual framework will serve as a starting point for developing new MPS optimization models and algorithms in supply chain 4.0 (SC4.0) environments.
The research leading to these results received funding from the European Union H2020 Program with grant agreements No. 825631 "Zero-Defect Manufacturing Platform (ZDMP)" and No. 958205 "Industrial Data Services for Quality Control in Smart Manufacturing (i4Q)", and from Grant RTI2018-101344-B-I00 funded by MCIN/AEI/10.13039/501100011033 and by "ERDF A way of making Europe".
Serrano-Ruiz, J. C.; Mula, J.; Poler, R. (2021). Smart Master Production Schedule for the Supply Chain: A Conceptual Framework. Computers, 10(12), 1-24. https://doi.org/10.3390/computers10120156