Fully Convolutional Neural Networks for Dynamic Object Detection in Grid Maps
Grid maps are widely used in robotics to represent obstacles in the
environment and differentiating dynamic objects from static infrastructure is
essential for many practical applications. In this work, we present a method
that uses a deep convolutional neural network (CNN) to infer whether grid cells
are covering a moving object or not. Compared to tracking approaches that use
e.g. a particle filter to estimate grid cell velocities and then make a
decision for individual grid cells based on this estimate, our approach uses
the entire grid map as input image for a CNN that inspects a larger area around
each cell and thus takes the structural appearance in the grid map into account
to make a decision. Compared to our reference method, our concept yields a
performance increase from 83.9% to 97.2%. A runtime optimized version of our
approach yields similar improvements with an execution time of just 10
milliseconds.
Comment: This is a shorter version of the master's thesis of Florian Piewak and
it was accepted at IV 201
Deformable Convolutional Networks
Convolutional neural networks (CNNs) are inherently limited in modeling
geometric transformations due to the fixed geometric structures in their building
modules. In this work, we introduce two new modules to enhance the
transformation modeling capacity of CNNs, namely, deformable convolution and
deformable RoI pooling. Both are based on the idea of augmenting the spatial
sampling locations in the modules with additional offsets and learning the
offsets from target tasks, without additional supervision. The new modules can
readily replace their plain counterparts in existing CNNs and can be easily
trained end-to-end by standard back-propagation, giving rise to deformable
convolutional networks. Extensive experiments validate the effectiveness of our
approach on sophisticated vision tasks of object detection and semantic
segmentation. The code would be released.
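The sampling idea described in this abstract can be illustrated with a minimal sketch: each tap of a 3x3 kernel reads the feature map at its regular grid position plus a real-valued offset, using bilinear interpolation. This is an illustration only, not the authors' implementation. The function names are hypothetical, and the offsets are passed in by hand here, whereas the abstract states they are learned from the target task.

```python
import math

def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2D list at real-valued (y, x); zero outside the map."""
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1 - abs(y - yi)) * (1 - abs(x - xi))
                value += weight * feature[yi][xi]
    return value

def deformable_conv_at(feature, kernel, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred on (cy, cx).

    offsets holds one (dy, dx) pair per kernel tap, in row-major order.
    In a real network these would be predicted by a learned layer (assumption:
    here they are simply given as input).
    """
    out = 0.0
    tap = 0
    for ky in (-1, 0, 1):
        for kx in (-1, 0, 1):
            dy, dx = offsets[tap]
            out += kernel[ky + 1][kx + 1] * bilinear_sample(
                feature, cy + ky + dy, cx + kx + dx)
            tap += 1
    return out
```

With all offsets set to zero, the function reduces to an ordinary 3x3 convolution at that position, which is why such a module can stand in for its plain counterpart.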
Fast LIDAR-based Road Detection Using Fully Convolutional Neural Networks
In this work, a deep learning approach has been developed to carry out road
detection using only LIDAR data. Starting from an unstructured point cloud,
top-view images encoding several basic statistics such as mean elevation and
density are generated. By considering a top-view representation, road detection
is reduced to a single-scale problem that can be addressed with a simple and
fast fully convolutional neural network (FCN). The FCN is specifically designed
for the task of pixel-wise semantic segmentation by combining a large receptive
field with high-resolution feature maps. The proposed system achieved excellent
performance and it is among the top-performing algorithms on the KITTI road
benchmark. Its fast inference makes it particularly suitable for real-time
applications.
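As a rough illustration of the top-view encoding mentioned in this abstract, the sketch below bins an unstructured point cloud into a regular grid and computes two per-cell statistics, mean elevation and density. The function name, the grid parameters, and the use of a raw point count as "density" are assumptions made for the example. The abstract does not specify how the statistics are computed or normalized.

```python
def top_view_statistics(points, x_range, y_range, cell_size):
    """Bin (x, y, z) points into a top-view grid.

    Returns two nested lists indexed as [row][col]:
    mean elevation per cell (0.0 for empty cells) and point count per cell.
    """
    cols = int((x_range[1] - x_range[0]) / cell_size)
    rows = int((y_range[1] - y_range[0]) / cell_size)
    count = [[0] * cols for _ in range(rows)]
    z_sum = [[0.0] * cols for _ in range(rows)]

    for x, y, z in points:
        col = int((x - x_range[0]) // cell_size)
        row = int((y - y_range[0]) // cell_size)
        if 0 <= row < rows and 0 <= col < cols:  # drop points outside the grid
            count[row][col] += 1
            z_sum[row][col] += z

    mean_elevation = [
        [z_sum[r][c] / count[r][c] if count[r][c] else 0.0 for c in range(cols)]
        for r in range(rows)
    ]
    return mean_elevation, count
```

Each statistic becomes one image channel, so the stacked grids can be fed to a fully convolutional network like any multi-channel image.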
Dynamic Occupancy Grid Prediction for Urban Autonomous Driving: A Deep Learning Approach with Fully Automatic Labeling
Long-term situation prediction plays a crucial role in the development of
intelligent vehicles. A major challenge still to overcome is the prediction of
complex downtown scenarios with multiple road users, e.g., pedestrians, bikes,
and motor vehicles, interacting with each other. This contribution tackles this
challenge by combining a Bayesian filtering technique for environment
representation, and machine learning as long-term predictor. More specifically,
a dynamic occupancy grid map is utilized as input to a deep convolutional
neural network. This yields the advantage of using spatially distributed
velocity estimates from a single time step for prediction, rather than a raw
data sequence, alleviating common problems dealing with input time series of
multiple sensors. Furthermore, convolutional neural networks have the inherent
characteristic of using context information, enabling the implicit modeling of
road user interaction. Pixel-wise balancing is applied in the loss function
counteracting the extreme imbalance between static and dynamic cells. One of
the major advantages is the unsupervised learning character due to fully
automatic label generation. The presented algorithm is trained and evaluated on
multiple hours of recorded sensor data and compared to Monte-Carlo simulation.
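The pixel-wise balancing mentioned in this abstract can be sketched as a class-weighted loss. The example below uses a binary cross-entropy with weights inversely proportional to class frequency, so that the few dynamic cells contribute as much in total as the many static ones. This particular loss and weighting scheme are assumptions chosen for illustration. The abstract does not state which loss or weights the authors use, and the function name is hypothetical.

```python
import math

def balanced_bce(pred, target, eps=1e-7):
    """Class-balanced binary cross-entropy over a grid.

    pred:   nested list of predicted probabilities that a cell is dynamic.
    target: nested list of labels, 1 for dynamic and 0 for static.
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    n = len(flat_target)
    n_dynamic = sum(flat_target)
    n_static = n - n_dynamic

    # Inverse-frequency weights (assumption): each class sums to weight n / 2.
    w_dynamic = n / (2 * n_dynamic) if n_dynamic else 0.0
    w_static = n / (2 * n_static) if n_static else 0.0

    total = 0.0
    for p, t in zip(flat_pred, flat_target):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        if t == 1:
            total -= w_dynamic * math.log(p)
        else:
            total -= w_static * math.log(1 - p)
    return total / n
```

Under this weighting, missing a rare dynamic cell is penalized more heavily than a false alarm on a single static cell, which discourages the trivial "everything is static" prediction.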
CNN for Very Fast Ground Segmentation in Velodyne LiDAR Data
This paper presents a novel method for ground segmentation in Velodyne point
clouds. We propose an encoding of sparse 3D data from the Velodyne sensor
suitable for training a convolutional neural network (CNN). This general
purpose approach is used for segmentation of the sparse point cloud into ground
and non-ground points. The LiDAR data are represented as a multi-channel 2D
signal where the horizontal axis corresponds to the rotation angle and the
vertical axis indexes the channels (i.e. laser beams). Multiple topologies of
relatively shallow CNNs (i.e. 3-5 convolutional layers) are trained and
evaluated using a manually annotated dataset we prepared. The results show
significant improvement of performance over the state-of-the-art method by
Zhang et al. in terms of speed and also minor improvements in terms of
accuracy.
Comment: ICRA 2018 submission
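The 2D encoding described in this abstract, with rotation angle on the horizontal axis and laser beam index on the vertical axis, can be sketched as follows. The choice of range and height as the two stored channels, the rule of keeping the closest return per cell, and all names are assumptions for the example. The abstract only says the signal is multi-channel.

```python
import math

def encode_scan(points, num_rings, num_columns):
    """Encode (x, y, z, ring) points as a 2D grid indexed [ring][column].

    Each filled cell holds (range, height); empty cells are None.
    The column is derived from the azimuth angle of the point.
    """
    image = [[None] * num_columns for _ in range(num_rings)]
    for x, y, z, ring in points:
        azimuth = math.atan2(y, x) % (2 * math.pi)  # map to [0, 2*pi)
        col = int(azimuth / (2 * math.pi) * num_columns) % num_columns
        rng = math.sqrt(x * x + y * y + z * z)
        cell = image[ring][col]
        # Keep the closest return when several points share a cell (assumption).
        if cell is None or rng < cell[0]:
            image[ring][col] = (rng, z)
    return image
```

The resulting grid is dense along the sensor's own sampling pattern, which is what makes a sparse point cloud usable as input to a standard 2D CNN.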
Spatially Adaptive Computation Time for Residual Networks
This paper proposes a deep learning architecture based on Residual Network
that dynamically adjusts the number of executed layers for the regions of the
image. This architecture is end-to-end trainable, deterministic and
problem-agnostic. It is therefore applicable without any modifications to a
wide range of computer vision problems such as image classification, object
detection and image segmentation. We present experimental results showing that
this model improves the computational efficiency of Residual Networks on the
challenging ImageNet classification and COCO object detection datasets.
Additionally, we evaluate the computation time maps on the visual saliency
dataset cat2000 and find that they correlate surprisingly well with human eye
fixation positions.
Comment: CVPR 201
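One way to picture "adjusting the number of executed layers for the regions of the image" is a per-position halting rule: each layer emits a halting score for every spatial position, and a position stops being processed once its accumulated score reaches a threshold. The sketch below only computes the resulting map of executed layers. It assumes an accumulate-until-threshold rule with a hypothetical threshold of 1 - eps, and it does not reproduce the training procedure or the exact formulation of the paper.

```python
def executed_layers(halting_scores, eps=0.01):
    """Count how many layers run at each spatial position.

    halting_scores[l][i][j] is the halting score in [0, 1] that layer l
    emits at position (i, j). A position halts at the first layer where
    its cumulative score reaches 1 - eps; otherwise all layers are used.
    """
    num_layers = len(halting_scores)
    rows, cols = len(halting_scores[0]), len(halting_scores[0][0])
    counts = [[num_layers] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            cumulative = 0.0
            for layer in range(num_layers):
                cumulative += halting_scores[layer][i][j]
                if cumulative >= 1 - eps:
                    counts[i][j] = layer + 1
                    break
    return counts
```

The returned grid plays the role of a computation time map: positions with larger counts received more processing.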