MultiNet: Multi-Modal Multi-Task Learning for Autonomous Driving
Autonomous driving requires operation in different behavioral modes ranging
from lane following and intersection crossing to turning and stopping. However,
most existing deep learning approaches to autonomous driving do not consider
the behavioral mode in the training strategy. This paper describes a technique
for learning multiple distinct behavioral modes in a single deep neural network
through the use of multi-modal multi-task learning. We study the effectiveness
of this approach, denoted MultiNet, using self-driving model cars for driving
in unstructured environments such as sidewalks and unpaved roads. Using labeled
data from over one hundred hours of driving our fleet of 1/10th scale model
cars, we trained different neural networks to predict the steering angle and
driving speed of the vehicle in different behavioral modes. We show that in
each case, MultiNet networks outperform networks trained on individual modes
while using a fraction of the total number of parameters. Comment: Published in IEEE WACV 201
Towards Visual Ego-motion Learning in Robots
Many model-based Visual Odometry (VO) algorithms have been proposed in the
past decade, often restricted to the type of camera optics, or the underlying
motion manifold observed. We envision robots to be able to learn and perform
these tasks, in a minimally supervised setting, as they gain more experience.
To this end, we propose a fully trainable solution to visual ego-motion
estimation for varied camera optics. We propose a visual ego-motion learning
architecture that maps observed optical flow vectors to an ego-motion density
estimate via a Mixture Density Network (MDN). By modeling the architecture as a
Conditional Variational Autoencoder (C-VAE), our model is able to provide
introspective reasoning and prediction for ego-motion induced scene-flow.
Additionally, our proposed model is especially amenable to bootstrapped
ego-motion learning in robots where the supervision in ego-motion estimation
for a particular camera sensor can be obtained from standard navigation-based
sensor fusion strategies (GPS/INS and wheel-odometry fusion). Through
experiments, we show the utility of our proposed approach in enabling the
concept of self-supervised learning for visual ego-motion estimation in
autonomous robots. Comment: Conference paper; Submitted to IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS) 2017, Vancouver CA; 8 pages, 8 figures,
2 tables
How hard is it to cross the room? -- Training (Recurrent) Neural Networks to steer a UAV
This work explores the feasibility of steering a drone with a (recurrent)
neural network, based on input from a forward-looking camera, in the context of
a high-level navigation task. We set up a generic framework for training a
network to perform navigation tasks based on imitation learning. It can be
applied to both aerial and land vehicles. As a proof of concept we apply it to
a UAV (Unmanned Aerial Vehicle) in a simulated environment, learning to cross a
room containing a number of obstacles. So far only feedforward neural networks
(FNNs) have been used to train UAV control. To cope with more complex tasks, we
propose the use of recurrent neural networks (RNN) instead and successfully
train an LSTM (Long Short-Term Memory) network for controlling UAVs.
Vision-based control is a sequential prediction problem, known for its highly
correlated input data. The correlation makes training a network hard,
especially an RNN. To overcome this issue, we investigate an alternative
sampling method during training, namely window-wise truncated backpropagation
through time (WW-TBPTT). Further, end-to-end training requires a lot of data,
which is often not available. Therefore, we compare the performance of
retraining only the Fully Connected (FC) and LSTM control layers with networks
which are trained end-to-end. Performing the relatively simple task of crossing
a room already reveals important guidelines and good practices for training
neural control networks. Different visualizations help to explain the behavior
learned. Comment: 12 pages, 30 figures
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM) consists in the concurrent
construction of a model of the environment (the map), and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications,
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping, metric
and semantic representations for mapping, theoretical performance guarantees,
active SLAM and exploration, and other new frontiers. This paper simultaneously
serves as a position paper and tutorial to those who are users of SLAM. By
looking at the published research with a critical eye, we delineate open
challenges and new research issues that still deserve careful scientific
investigation. The paper also contains the authors' take on two questions that
often animate discussions during robotics conferences: Do robots need SLAM? and
Is SLAM solved?
Data Fusion Methods and Algorithms in the Context of Autonomous Systems - A path planning algorithms analysis and optimization exploiting fused data
The abstract is in the attachment.