378 research outputs found
Fully Convolutional Neural Networks for Dynamic Object Detection in Grid Maps
Grid maps are widely used in robotics to represent obstacles in the
environment and differentiating dynamic objects from static infrastructure is
essential for many practical applications. In this work, we present a methods
that uses a deep convolutional neural network (CNN) to infer whether grid cells
are covering a moving object or not. Compared to tracking approaches, that use
e.g. a particle filter to estimate grid cell velocities and then make a
decision for individual grid cells based on this estimate, our approach uses
the entire grid map as input image for a CNN that inspects a larger area around
each cell and thus takes the structural appearance in the grid map into account
to make a decision. Compared to our reference method, our concept yields a
performance increase from 83.9% to 97.2%. A runtime optimized version of our
approach yields similar improvements with an execution time of just 10
milliseconds.Comment: This is a shorter version of the masters thesis of Florian Piewak and
it was accapted at IV 201
Understanding Convolution for Semantic Segmentation
Recent advances in deep learning, especially deep convolutional neural
networks (CNNs), have led to significant improvement over previous semantic
segmentation systems. Here we show how to improve pixel-wise semantic
segmentation by manipulating convolution-related operations that are of both
theoretical and practical value. First, we design dense upsampling convolution
(DUC) to generate pixel-level prediction, which is able to capture and decode
more detailed information that is generally missing in bilinear upsampling.
Second, we propose a hybrid dilated convolution (HDC) framework in the encoding
phase. This framework 1) effectively enlarges the receptive fields (RF) of the
network to aggregate global information; 2) alleviates what we call the
"gridding issue" caused by the standard dilated convolution operation. We
evaluate our approaches thoroughly on the Cityscapes dataset, and achieve a
state-of-art result of 80.1% mIOU in the test set at the time of submission. We
also have achieved state-of-the-art overall on the KITTI road estimation
benchmark and the PASCAL VOC2012 segmentation task. Our source code can be
found at https://github.com/TuSimple/TuSimple-DUC .Comment: WACV 2018. Updated acknowledgements. Source code:
https://github.com/TuSimple/TuSimple-DU
Deep learning in remote sensing: a review
Standing at the paradigm shift towards data-intensive science, machine
learning techniques are becoming increasingly important. In particular, as a
major breakthrough in the field, deep learning has proven as an extremely
powerful tool in many fields. Shall we embrace deep learning as the key to all?
Or, should we resist a 'black-box' solution? There are controversial opinions
in the remote sensing community. In this article, we analyze the challenges of
using deep learning for remote sensing data analysis, review the recent
advances, and provide resources to make deep learning in remote sensing
ridiculously simple to start with. More importantly, we advocate remote sensing
scientists to bring their expertise into deep learning, and use it as an
implicit general model to tackle unprecedented large-scale influential
challenges, such as climate change and urbanization.Comment: Accepted for publication IEEE Geoscience and Remote Sensing Magazin
A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images
Semantic segmentation is the pixel-wise labelling of an image. Since the
problem is defined at the pixel level, determining image class labels only is
not acceptable, but localising them at the original image pixel resolution is
necessary. Boosted by the extraordinary ability of convolutional neural
networks (CNN) in creating semantic, high level and hierarchical image
features; excessive numbers of deep learning-based 2D semantic segmentation
approaches have been proposed within the last decade. In this survey, we mainly
focus on the recent scientific developments in semantic segmentation,
specifically on deep learning-based methods using 2D images. We started with an
analysis of the public image sets and leaderboards for 2D semantic
segmantation, with an overview of the techniques employed in performance
evaluation. In examining the evolution of the field, we chronologically
categorised the approaches into three main periods, namely pre-and early deep
learning era, the fully convolutional era, and the post-FCN era. We technically
analysed the solutions put forward in terms of solving the fundamental problems
of the field, such as fine-grained localisation and scale invariance. Before
drawing our conclusions, we present a table of methods from all mentioned eras,
with a brief summary of each approach that explains their contribution to the
field. We conclude the survey by discussing the current challenges of the field
and to what extent they have been solved.Comment: Updated with new studie
ScarGAN: Chained Generative Adversarial Networks to Simulate Pathological Tissue on Cardiovascular MR Scans
Medical images with specific pathologies are scarce, but a large amount of
data is usually required for a deep convolutional neural network (DCNN) to
achieve good accuracy. We consider the problem of segmenting the left
ventricular (LV) myocardium on late gadolinium enhancement (LGE) cardiovascular
magnetic resonance (CMR) scans of which only some of the scans have scar
tissue. We propose ScarGAN to simulate scar tissue on healthy myocardium using
chained generative adversarial networks (GAN). Our novel approach factorizes
the simulation process into 3 steps: 1) a mask generator to simulate the shape
of the scar tissue; 2) a domain-specific heuristic to produce the initial
simulated scar tissue from the simulated shape; 3) a refining generator to add
details to the simulated scar tissue. Unlike other approaches that generate
samples from scratch, we simulate scar tissue on normal scans resulting in
highly realistic samples. We show that experienced radiologists are unable to
distinguish between real and simulated scar tissue. Training a U-Net with
additional scans with scar tissue simulated by ScarGAN increases the percentage
of scar pixels correctly included in LV myocardium prediction from 75.9% to
80.5%.Comment: 12 pages, 5 figures. To appear in MICCAI DLMIA 201
- …