Driving Scene Perception Network: Real-time Joint Detection, Depth Estimation and Semantic Segmentation
Demand for high-level autonomous driving has increased in recent years, and
visual perception is one of the critical capabilities needed to enable fully
autonomous driving. In this paper, we introduce an efficient approach for
simultaneous object detection, depth estimation and pixel-level semantic
segmentation using a shared convolutional architecture. The proposed network
model, which we named Driving Scene Perception Network (DSPNet), uses
multi-level feature maps and multi-task learning to improve the accuracy and
efficiency of object detection, depth estimation and image segmentation tasks
from a single input image. The resulting network model uses less than
850 MiB of GPU memory and achieves 14.0 fps on an NVIDIA GeForce GTX 1080 with a
1024x512 input image, and both precision and efficiency are improved over a
combination of single tasks.
Comment: 9 pages, 7 figures, WACV'1
A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images
Semantic segmentation is the pixel-wise labelling of an image. Since the
problem is defined at the pixel level, determining image class labels alone is
not sufficient; the labels must also be localised at the original image pixel
resolution. Boosted by the extraordinary ability of convolutional neural
networks (CNNs) to create semantic, high-level and hierarchical image
features, a large number of deep learning-based 2D semantic segmentation
approaches have been proposed within the last decade. In this survey, we mainly
focus on the recent scientific developments in semantic segmentation,
specifically on deep learning-based methods using 2D images. We started with an
analysis of the public image sets and leaderboards for 2D semantic
segmentation, with an overview of the techniques employed in performance
evaluation. In examining the evolution of the field, we chronologically
categorised the approaches into three main periods, namely the pre- and early
deep learning era, the fully convolutional era, and the post-FCN era. We technically
analysed the solutions put forward in terms of solving the fundamental problems
of the field, such as fine-grained localisation and scale invariance. Before
drawing our conclusions, we present a table of methods from all mentioned eras,
with a brief summary of each approach that explains its contribution to the
field. We conclude the survey by discussing the current challenges of the field
and to what extent they have been solved.
Comment: Updated with new studie
Recurrent Segmentation for Variable Computational Budgets
State-of-the-art systems for semantic image segmentation use feed-forward
pipelines with fixed computational costs. Building an image segmentation system
that works across a range of computational budgets is challenging and
time-intensive as new architectures must be designed and trained for every
computational setting. To address this problem we develop a recurrent neural
network that successively improves prediction quality with each iteration.
Importantly, the RNN may be deployed across a range of computational budgets by
merely running the model for a variable number of iterations. We find that this
architecture is uniquely suited for efficiently segmenting videos. By
exploiting the segmentation of past frames, the RNN can perform video
segmentation at similar quality but reduced computational cost compared to
state-of-the-art image segmentation methods. When applied to static images in
the PASCAL VOC 2012 and Cityscapes segmentation datasets, the RNN traces out a
speed-accuracy curve that saturates near the performance of state-of-the-art
segmentation methods.
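The budget-dependent behaviour described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: `refine_step` is a hypothetical stand-in for one recurrent iteration (a fixed blend towards per-pixel evidence), and `segment_with_budget` simply applies it a variable number of times. All names, the 0.5 blend rate and the toy data are assumptions made for illustration.

```python
def refine_step(state, evidence, rate=0.5):
    """One hypothetical refinement iteration: move each per-pixel score
    part of the way towards the evidence. Stands in for one RNN step."""
    return [s + rate * (e - s) for s, e in zip(state, evidence)]


def segment_with_budget(evidence, n_iterations, initial_state=None):
    """Apply the same step function n_iterations times (the compute budget).

    initial_state may carry over a previous frame's result for video;
    by default every pixel starts at a neutral score of 0.5.
    """
    if initial_state is not None:
        state = list(initial_state)
    else:
        state = [0.5] * len(evidence)
    for _ in range(n_iterations):
        state = refine_step(state, evidence)
    return state


def mean_abs_error(state, target):
    """Average absolute difference between current scores and the target."""
    return sum(abs(s - t) for s, t in zip(state, target)) / len(target)


# Toy "image": four pixels whose true foreground scores are 0 or 1.
evidence = [1.0, 0.0, 1.0, 1.0]

# Error for budgets of 0, 1, 2 and 3 iterations; in this toy setting
# each extra iteration halves the remaining error.
errors = [
    mean_abs_error(segment_with_budget(evidence, n), evidence)
    for n in range(4)
]
```

Passing the previous frame's output as `initial_state` lets a small iteration count start closer to the answer, which loosely mirrors the abstract's point about exploiting the segmentation of past frames.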
Improving the Segmentation of Anatomical Structures in Chest Radiographs using U-Net with an ImageNet Pre-trained Encoder
Accurate segmentation of anatomical structures in chest radiographs is
essential for many computer-aided diagnosis tasks. In this paper we investigate
the latest fully-convolutional architectures for the task of multi-class
segmentation of the lung fields, heart and clavicles in a chest radiograph. In
addition, we explore the influence of using different loss functions in the
training process of a neural network for semantic segmentation. We evaluate all
models on a common benchmark of 247 X-ray images from the JSRT database and
ground-truth segmentation masks from the SCR dataset. Our best performing
architecture is a modified U-Net that benefits from pre-trained encoder
weights. This model outperformed the current state-of-the-art methods tested on
the same benchmark, with Jaccard overlap scores of 96.1% for lung fields, 90.6%
for heart and 85.5% for clavicles.
Comment: Presented at the First International Workshop on Thoracic Image
Analysis (TIA), MICCAI 201
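The Jaccard overlap reported in this abstract is the ratio of intersection to union between a predicted and a ground-truth region. A minimal sketch of a per-class computation follows; the function name, the label ids and the toy masks are assumptions for illustration, not the paper's evaluation code.

```python
def jaccard_per_class(pred, truth, classes):
    """Return {class: |P ∩ T| / |P ∪ T|} for two equally sized label masks.

    A class absent from both masks has an empty union, so its score is
    reported as None rather than guessed.
    """
    scores = {}
    for c in classes:
        intersection = 0
        union = 0
        for pred_row, truth_row in zip(pred, truth):
            for p, t in zip(pred_row, truth_row):
                if p == c and t == c:
                    intersection += 1
                if p == c or t == c:
                    union += 1
        scores[c] = intersection / union if union else None
    return scores


# Toy 2x3 masks with hypothetical label ids:
# 0 = background, 1 = lung field, 2 = heart.
pred = [[0, 1, 1],
        [0, 2, 2]]
truth = [[0, 1, 0],
         [0, 2, 2]]

scores = jaccard_per_class(pred, truth, classes=[0, 1, 2])
# Class 1 overlaps on 1 of 2 labelled pixels (0.5); class 2 matches exactly (1.0).
```

A score of 1.0 means the predicted and reference regions coincide exactly, so a value such as 96.1% indicates that intersection covers nearly the whole union.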