4 research outputs found
Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing
Within the context of autonomous driving a model-based reinforcement learning
algorithm is proposed for the design of neural network-parameterized
controllers. Classical model-based control methods, which include sampling- and
lattice-based algorithms and model predictive control, suffer from the
trade-off between model complexity and computational burden required for the
online solution of expensive optimization or search problems at every short
sampling time. To circumvent this trade-off, a 2-step procedure is motivated:
first learning of a controller during offline training based on an arbitrarily
complicated mathematical system model, before online fast feedforward
evaluation of the trained controller. The contribution of this paper is the
proposition of a simple gradient-free and model-based algorithm for deep
reinforcement learning using task separation with hill climbing (TSHC). In
particular, (i) simultaneous training on separate deterministic tasks with the
purpose of encoding many motion primitives in a neural network, and (ii) the
employment of maximally sparse rewards in combination with virtual velocity
constraints (VVCs) in setpoint proximity are advocated.Comment: 10 pages, 6 figures, 1 tabl
Trajectory Planning Under Vehicle Dimension Constraints Using Sequential Linear Programming
This paper presents a spatial-based trajectory planning method for automated vehicles under actuator, obstacle avoidance, and vehicle dimension constraints. Starting from a nonlinear kinematic bicycle model, vehicle dynamics are transformed to a road-aligned coordinate frame with path along the road centerline replacing time as the dependent variable. Space-varying vehicle dimension constraints are linearized around a reference path to pose convex optimization problems. Such constraints do not require to inflate obstacles by safety-margins and therefore maximize performance in very constrained environments. A sequential linear programming (SLP) algorithm is motivated. A linear program (LP) is solved at each SLP-iteration. The relation between LP formulation and maximum admissible traveling speeds within vehicle tire friction limits is discussed. The proposed method is evaluated in a roomy and in a tight maneuvering driving scenario, whereby a comparison to a semi-analytical clothoid-based path planner is given. Effectiveness is demonstrated particularly for very constrained environments, requiring to account for constraints and planning over the entire obstacle constellation space.QC 20180618</p