Explainable and Advisable Learning for Self-driving Vehicles
Deep neural perception and control networks are likely to be a key component of self-driving vehicles. These models need to be explainable - they should provide easy-to-interpret rationales for their behavior - so that passengers, insurance companies, law enforcement, developers, etc., can understand what triggered a particular behavior. Explanations may be triggered by the neural controller, namely introspective explanations, or informed by the neural controller's output, namely rationalizations. Our work has focused on the challenge of generating introspective explanations of deep models for self-driving vehicles. In Chapter 3, we begin by exploring the use of visual explanations. These explanations take the form of real-time highlighted regions of an image that causally influence the network's output (steering control). In the first stage, we use a visual attention model to train a convolutional network end-to-end from images to steering angle. The attention model highlights image regions that potentially influence the network's output. Some of these are true influences, but some are spurious. We then apply a causal filtering step to determine which input regions actually influence the output. This produces more succinct visual explanations and more accurately exposes the network's behavior. In Chapter 4, we add an attention-based video-to-text model to produce textual explanations of model actions, e.g. "the car slows down because the road is wet". The attention maps of the controller and the explanation model are aligned so that explanations are grounded in the parts of the scene that mattered to the controller. We explore two approaches to attention alignment: strong and weak alignment. These explainable systems represent an externalization of tacit knowledge: the network's opaque reasoning is reduced to a situation-specific dependence on a visible object in the image. This makes them brittle and potentially unsafe in situations that do not match the training data.
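The causal filtering step described above can be illustrated with a minimal sketch: occlude each highly attended region in turn, re-run the controller, and keep only the regions whose removal actually changes the output. The clustering of attention blobs, the occlusion scheme, and all names (`causal_filter`, `n_clusters`, `eps`) are simplifying assumptions for illustration, not the implementation from the thesis.

```python
import numpy as np

def causal_filter(controller, image, attention_map, n_clusters=5, eps=0.1):
    """Keep only attended regions whose removal actually changes the output.

    `controller` maps an image to a steering value. The top-attended cells
    stand in for the paper's clustered attention blobs; all names and the
    occlusion-by-zeroing scheme are illustrative assumptions.
    """
    baseline = controller(image)
    flat = attention_map.ravel()
    candidates = np.argsort(flat)[-n_clusters:]  # most-attended candidate regions
    causal = []
    for idx in candidates:
        r, c = np.unravel_index(idx, attention_map.shape)
        masked = image.copy()
        masked[r, c] = 0.0                       # occlude the candidate region
        if abs(controller(masked) - baseline) > eps:
            causal.append((int(r), int(c)))      # removal changed the output
    return causal
```

With a toy controller that only reads one location, the filter separates the truly causal attended region from a spurious one, mirroring the distinction the abstract draws.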
In Chapter 5, we propose to address this issue by augmenting the training data with natural language advice from a human. Advice includes guidance about what to do and where to attend. We present the first step toward advice-giving, where we train an end-to-end vehicle controller that accepts advice. The controller adapts both the way it attends to the scene (visual attention) and its control outputs (steering and speed). Further, in Chapter 6, we propose a new approach that learns vehicle control with the help of long-term (global) human advice. Specifically, our system learns to summarize its visual observations in natural language, predict an appropriate action response (e.g. "I see a pedestrian crossing, so I stop"), and predict the controls accordingly.
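One minimal way to picture an advice-accepting controller is to let a text-advice embedding modulate the visual attention that feeds the control head, so advice changes both where the model attends and what controls it emits. The sketch below assumes pre-extracted features and invented weight shapes; it is an illustration of the idea, not the thesis architecture.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def advised_controller(features, advice_vec, W_att, W_ctrl):
    """Sketch of an advice-conditioned controller (shapes are assumptions).

    features:   (H*W, D) flattened visual feature map from a CNN (not shown)
    advice_vec: (D,) embedding of the natural-language advice
    Advice modulates the attention scores; the attended feature summary then
    drives both control outputs (steering and speed).
    """
    scores = features @ W_att @ advice_vec   # advice-conditioned attention scores
    alpha = softmax(scores)                  # (H*W,) attention over the scene
    context = alpha @ features               # (D,) attended visual summary
    steering, speed = W_ctrl @ context       # controls from the attended summary
    return alpha, steering, speed
```

Changing `advice_vec` shifts `alpha`, which is the mechanism by which "where to attend" guidance can alter the downstream control in this toy formulation.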
DeepSignals: Predicting Intent of Drivers Through Visual Signals
Detecting the intention of drivers is an essential task in self-driving,
necessary to anticipate sudden events like lane changes and stops. Turn signals
and emergency flashers communicate such intentions, providing seconds of
potentially critical reaction time. In this paper, we propose to detect these
signals in video sequences by using a deep neural network that reasons about
both spatial and temporal information. Our experiments on more than a million
frames show high per-frame accuracy in very challenging scenarios.
Comment: To be presented at the IEEE International Conference on Robotics and Automation (ICRA), 201
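The abstract does not spell out the architecture, but the role of temporal reasoning can be shown with a toy example: pooling per-frame class logits (from a spatial network, not shown) over a sliding window, so a blinking turn signal is not read as "off" during its dark phase. The label set, window size, and pooling scheme are assumptions for illustration.

```python
import numpy as np

CLASSES = ["off", "left", "right", "flashers"]  # illustrative label set

def classify_signal(frame_logits, window=8):
    """Toy stand-in for spatio-temporal reasoning over turn signals.

    frame_logits: (T, C) per-frame class logits from a spatial model.
    Averaging logits over a trailing temporal window stabilizes the
    prediction across the on/off phases of a blinking signal.
    """
    T, _ = frame_logits.shape
    labels = []
    for t in range(T):
        lo = max(0, t - window + 1)
        pooled = frame_logits[lo:t + 1].mean(axis=0)  # temporal pooling
        labels.append(CLASSES[int(pooled.argmax())])
    return labels
```

On a sequence where "left" fires only on alternate frames, per-frame argmax flips between "left" and "off", while the temporally pooled prediction stays "left" throughout.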
Context Aware Road-user Importance Estimation (iCARE)
Road-users are a critical part of decision-making for both self-driving cars
and driver assistance systems. Some road-users, however, are more important for
decision-making than others because of their respective intentions, ego
vehicle's intention and their effects on each other. In this paper, we propose
a novel architecture for road-user importance estimation which takes advantage
of the local and global context of the scene. For local context, the model
exploits the appearance of the road users (which captures orientation,
intention, etc.) and their location relative to ego-vehicle. The global context
in our model is defined based on the feature map of the convolutional layer of
the module which predicts the future path of the ego-vehicle and contains rich
global information of the scene (e.g., infrastructure, road lanes, etc.), as
well as the ego vehicle's intention information. Moreover, this paper
introduces a new data set of real-world driving, concentrated around
intersections, with annotations of important road users. Systematic
evaluations of our proposed method against several baselines show promising results.
Comment: Published in: IEEE Intelligent Vehicles (IV), 201
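The local/global fusion described above can be sketched as a single scoring step: each road user's appearance feature and ego-relative location (local context) are concatenated with a shared global scene feature (e.g. from the ego-path-prediction module) and mapped to an importance score. All shapes and weights below are illustrative assumptions, not the published model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def importance_scores(appearances, locations, global_feat, W):
    """Sketch of local/global context fusion for road-user importance.

    appearances: (N, Da) per-user appearance features (capture orientation, intention)
    locations:   (N, 2)  per-user location relative to the ego vehicle
    global_feat: (Dg,)   shared scene feature (infrastructure, lanes, ego intention)
    Returns one importance score per road user, in (0, 1).
    """
    n = appearances.shape[0]
    g = np.tile(global_feat, (n, 1))                  # broadcast global context
    fused = np.concatenate([appearances, locations, g], axis=1)
    return sigmoid(fused @ W)                         # linear scorer, for illustration
```

A learned network would replace the linear scorer, but the fusion pattern - per-user local features joined with one shared global descriptor - is the structural point.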
Simulation-based reinforcement learning for real-world autonomous driving
We use reinforcement learning in simulation to obtain a driving system
controlling a full-size real-world vehicle. The driving policy takes RGB images
from a single camera and their semantic segmentation as input. We use mostly
synthetic data, with labelled real-world data appearing only in the training of
the segmentation network.
Using reinforcement learning in simulation and synthetic data is motivated by
lowering costs and engineering effort.
In real-world experiments we confirm that we achieved successful sim-to-real
policy transfer. Based on the extensive evaluation, we analyze how design
decisions about perception, control, and training impact the real-world
performance.
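The policy input described above - an RGB frame together with its semantic segmentation - can be pictured as a channel-wise stack, which is one common way such observations are assembled (the exact layout here is an assumption, not taken from the paper):

```python
import numpy as np

def policy_input(rgb, seg, n_classes):
    """Sketch of a driving-policy observation: RGB stacked with segmentation.

    rgb: (H, W, 3) camera image; seg: (H, W) integer class map from the
    segmentation network (the only component trained on labelled real data).
    The segmentation is one-hot encoded and concatenated channel-wise, giving
    the policy a representation that looks the same in simulation and reality.
    """
    onehot = np.eye(n_classes, dtype=rgb.dtype)[seg]  # (H, W, n_classes)
    return np.concatenate([rgb, onehot], axis=2)      # (H, W, 3 + n_classes)
```

Feeding the policy class labels rather than raw appearance alone is one plausible reason such a setup narrows the sim-to-real gap: the segmentation channels abstract away texture differences between synthetic and real imagery.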