DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
In this work we address the task of semantic image segmentation with Deep
Learning and make three main contributions that are experimentally shown to
have substantial practical merit. First, we highlight convolution with
upsampled filters, or 'atrous convolution', as a powerful tool in dense
prediction tasks. Atrous convolution allows us to explicitly control the
resolution at which feature responses are computed within Deep Convolutional
Neural Networks. It also allows us to effectively enlarge the field of view of
filters to incorporate larger context without increasing the number of
parameters or the amount of computation. Second, we propose atrous spatial
pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP
probes an incoming convolutional feature layer with filters at multiple
sampling rates and effective fields of view, thus capturing objects as well as
image context at multiple scales. Third, we improve the localization of object
boundaries by combining methods from DCNNs and probabilistic graphical models.
The commonly deployed combination of max-pooling and downsampling in DCNNs
achieves invariance but takes a toll on localization accuracy. We overcome this
by combining the responses at the final DCNN layer with a fully connected
Conditional Random Field (CRF), which is shown both qualitatively and
quantitatively to improve localization performance. Our proposed "DeepLab"
system sets a new state of the art on the PASCAL VOC-2012 semantic image
segmentation task, reaching 79.7% mIOU on the test set, and advances the
results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and
Cityscapes. All of our code is made publicly available online.
Comment: Accepted by TPAMI
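The atrous convolution the abstract describes can be sketched in plain NumPy (a minimal illustration, not the authors' implementation; the input, filter values, and the two-rate "ASPP-style" probing at the end are made-up examples):

```python
import numpy as np

def atrous_conv2d(x, w, rate):
    """2D atrous (dilated) convolution, 'valid' padding, stride 1.

    x: (H, W) input, w: (k, k) filter, rate: dilation factor.
    The effective field of view grows to (k-1)*rate + 1 without
    adding parameters or computation per output, as the abstract notes.
    """
    k = w.shape[0]
    eff = (k - 1) * rate + 1          # effective kernel extent
    H, W = x.shape
    out = np.zeros((H - eff + 1, W - eff + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # sample the input on a dilated grid, skipping `rate - 1`
            # positions between filter taps
            patch = x[i:i + eff:rate, j:j + eff:rate]
            out[i, j] = np.sum(patch * w)
    return out

# ASPP-style multi-rate probing: apply the same filter at several
# sampling rates in parallel (a toy stand-in for the ASPP branches).
x = np.arange(64, dtype=float).reshape(8, 8)
w = np.ones((3, 3)) / 9.0
responses = [atrous_conv2d(x, w, r) for r in (1, 2)]
```

At rate 1 this reduces to an ordinary 3x3 convolution; at rate 2 the same nine weights cover a 5x5 region, which is the parameter-free field-of-view enlargement the abstract highlights.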
Object Recognition from very few Training Examples for Enhancing Bicycle Maps
In recent years, data-driven methods have shown great success for extracting
information about the infrastructure in urban areas. These algorithms are
usually trained on large datasets consisting of thousands or millions of
labeled training examples. While large datasets have been published for
cars, very little labeled data is available for cyclists, although the
appearance, point of view, and positioning of the relevant objects differ. Unfortunately,
labeling data is costly and requires a huge amount of work. In this paper, we
thus address the problem of learning with very few labels. The aim is to
recognize particular traffic signs in crowdsourced data to collect information
which is of interest to cyclists. We propose a system for object recognition
that is trained with only 15 examples per class on average. To achieve this, we
combine the advantages of convolutional neural networks and random forests to
learn a patch-wise classifier. In the next step, we map the random forest to a
neural network and transform the classifier to a fully convolutional network.
Thereby, the processing of full images is significantly accelerated and
bounding boxes can be predicted. Finally, we integrate data of the Global
Positioning System (GPS) to localize the predictions on the map. In comparison
to Faster R-CNN and other networks for object recognition or algorithms for
transfer learning, we considerably reduce the required amount of labeled data.
We demonstrate good performance on the recognition of traffic signs for
cyclists as well as their localization in maps.
Comment: Submitted to IV 2018. This research was supported by the German Research
Foundation (DFG) within Priority Research Programme 1894 "Volunteered
Geographic Information: Interpretation, Visualization and Social Computing".
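The step of mapping a random forest to a neural network can be illustrated for a single tree (a toy sketch using the standard construction of one step unit per split node followed by thresholded conjunctions for the leaves; the tree, its thresholds, and feature indices are hypothetical, not taken from the paper):

```python
import numpy as np

def step(z):
    """Hard threshold unit in {0, 1}."""
    return (np.asarray(z) > 0).astype(float)

def tree_predict(x):
    """Hand-written depth-2 decision tree (hypothetical example)."""
    if x[0] <= 0.5:
        return 0                       # leaf A
    return 1 if x[1] <= 0.3 else 2     # leaf B / leaf C

def net_predict(x):
    """The same tree expressed as a two-layer network: layer 1
    evaluates every split node, layer 2 ANDs the split outcomes
    along the path to each leaf."""
    # layer 1: one unit per internal split node
    h = step([x[0] - 0.5,              # h[0] = 1 means "went right" at root
              x[1] - 0.3])             # h[1] = 1 means "went right" at node 2
    # layer 2: leaf indicators as thresholded conjunctions
    leaves = step([
        (1 - h[0]) - 0.5,              # leaf A: root went left
        h[0] + (1 - h[1]) - 1.5,       # leaf B: right, then left
        h[0] + h[1] - 1.5,             # leaf C: right, then right
    ])
    return int(np.argmax(leaves))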
Real-Time Automatic Fetal Brain Extraction in Fetal MRI by Deep Learning
Brain segmentation is a fundamental first step in neuroimage analysis. In the
case of fetal MRI, it is particularly challenging and important due to the
arbitrary orientation of the fetus, organs that surround the fetal head, and
intermittent fetal motion. Several promising methods have been proposed but are
limited in their performance in challenging cases and in real-time
segmentation. We aimed to develop a fully automatic segmentation method that
independently segments sections of the fetal brain in 2D fetal MRI slices in
real-time. To this end, we developed and evaluated a deep fully convolutional
neural network based on 2D U-net and autocontext, and compared it to two
alternative fast methods based on 1) a voxelwise fully convolutional network
and 2) a method based on SIFT features, random forest and conditional random
field. We trained the networks with manual brain masks on 250 stacks of
training images, and tested on 17 stacks of normal fetal brain images as well
as 18 stacks of extremely challenging cases based on extreme motion, noise, and
severely abnormal brain shape. Experimental results show that our U-net
approach outperformed the other methods and achieved average Dice metrics of
96.52% and 78.83% in the normal and challenging test sets, respectively. With
unprecedented performance and a test run time of about 1 second, our network
can be used to segment the fetal brain in real-time while fetal MRI slices are
being acquired. This can enable real-time motion tracking, motion detection,
and 3D reconstruction of fetal brain MRI.
Comment: This work has been submitted to ISBI 201
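The Dice metric used to evaluate the segmentations above can be computed as follows (a minimal sketch; the example masks are made up, not data from the paper):

```python
import numpy as np

def dice(pred, truth):
    """Dice overlap between two binary masks, the metric the abstract
    reports (96.52% normal / 78.83% challenging test sets)."""
    pred = np.asarray(pred).astype(bool)
    truth = np.asarray(truth).astype(bool)
    inter = np.logical_and(pred, truth).sum()
    denom = pred.sum() + truth.sum()
    # two empty masks agree perfectly by convention
    return 2.0 * inter / denom if denom else 1.0

# hypothetical 4x4 masks: 4 predicted pixels, 6 true pixels, 4 shared
a = np.zeros((4, 4), dtype=int); a[1:3, 1:3] = 1
b = np.zeros((4, 4), dtype=int); b[1:3, 1:4] = 1
score = dice(a, b)   # 2*4 / (4+6) = 0.8
```

Averaging this score over all test slices gives the per-set Dice figures quoted in the abstract.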