BlitzNet: A Real-Time Deep Network for Scene Understanding
Real-time scene understanding has become crucial in many applications such as
autonomous driving. In this paper, we propose a deep architecture, called
BlitzNet, that jointly performs object detection and semantic segmentation in
one forward pass, allowing real-time computations. Besides the computational
gain of having a single network to perform several tasks, we show that object
detection and semantic segmentation benefit from each other in terms of
accuracy. Experimental results on the VOC and COCO datasets show state-of-the-art
performance for object detection and segmentation among real-time systems.
Image Restoration Using Very Deep Convolutional Encoder-Decoder Networks with Symmetric Skip Connections
In this paper, we propose a very deep fully convolutional encoding-decoding
framework for image restoration such as denoising and super-resolution. The
network is composed of multiple layers of convolution and de-convolution
operators, learning end-to-end mappings from corrupted images to the original
ones. The convolutional layers act as the feature extractor, capturing the
abstraction of image contents while eliminating noise and corruption.
De-convolutional layers are then used to recover the image details. We propose
to symmetrically link convolutional and de-convolutional layers with skip-layer
connections, with which the training converges much faster and attains a
higher-quality local optimum. First, the skip connections allow the signal to
be back-propagated to the bottom layers directly, which tackles the problem of
vanishing gradients, makes training deep networks easier, and consequently
improves restoration performance. Second, these skip connections pass
image details from convolutional layers to de-convolutional layers, which is
beneficial in recovering the original image. Significantly, thanks to its large
capacity, a single model can handle different noise levels.
Experimental results show that our network achieves better performance than all
previously reported state-of-the-art methods.
Comment: Accepted to Proc. Advances in Neural Information Processing Systems
(NIPS'16). Content of the final version may be slightly different. Extended
version is available at http://arxiv.org/abs/1606.0892
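
The symmetric skip wiring described in this abstract can be illustrated with a toy 1D sketch. This is not the authors' implementation: the function names (`conv1d`, `forward`), the stride-1 three-tap kernels, the use of a plain convolution as a stand-in for a de-convolution layer, and the choice to add each encoder layer's input to the output of its mirrored decoder layer are all assumptions made for illustration.

```python
def conv1d(signal, kernel):
    """Same-length 1D convolution with zero padding (odd kernel sizes only)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def relu(values):
    return [max(0.0, v) for v in values]


def forward(signal, enc_kernels, dec_kernels, use_skips=True):
    """Run a toy encoder-decoder with optional symmetric skip connections."""
    assert len(enc_kernels) == len(dec_kernels)
    saved = []  # inputs of the encoder layers, reused by mirrored decoder layers
    x = list(signal)
    for kernel in enc_kernels:
        saved.append(x)
        x = relu(conv1d(x, kernel))
    for kernel in dec_kernels:
        skip = saved.pop()  # last saved first, so the layers pair up symmetrically
        x = conv1d(x, kernel)  # hypothetical stand-in for a de-convolution layer
        if use_skips:
            x = [a + b for a, b in zip(x, skip)]  # element-wise sum with the skip
        x = relu(x)
    return x


identity, zero = [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]
print(forward([1.0, 2.0, 3.0], [identity], [zero]))                   # [1.0, 2.0, 3.0]
print(forward([1.0, 2.0, 3.0], [identity], [zero], use_skips=False))  # [0.0, 0.0, 0.0]
```

With a decoder kernel of all zeros, the signal survives only through the skip path, which is a small-scale picture of how skip connections carry image details (and, during training, gradients) past the intermediate layers.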
Understanding Convolution for Semantic Segmentation
Recent advances in deep learning, especially deep convolutional neural
networks (CNNs), have led to significant improvement over previous semantic
segmentation systems. Here we show how to improve pixel-wise semantic
segmentation by manipulating convolution-related operations that are of both
theoretical and practical value. First, we design dense upsampling convolution
(DUC) to generate pixel-level prediction, which is able to capture and decode
more detailed information that is generally missing in bilinear upsampling.
Second, we propose a hybrid dilated convolution (HDC) framework in the encoding
phase. This framework 1) effectively enlarges the receptive fields (RF) of the
network to aggregate global information; 2) alleviates what we call the
"gridding issue" caused by the standard dilated convolution operation. We
evaluate our approaches thoroughly on the Cityscapes dataset, and achieve a
state-of-the-art result of 80.1% mIoU on the test set at the time of submission. We
have also achieved state-of-the-art results overall on the KITTI road estimation
benchmark and the PASCAL VOC2012 segmentation task. Our source code can be
found at https://github.com/TuSimple/TuSimple-DUC.
Comment: WACV 2018. Updated acknowledgements. Source code:
https://github.com/TuSimple/TuSimple-DU
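
Two ideas from this abstract can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the code from the linked repository. The helper names `duc_rearrange` and `reachable_offsets` are hypothetical, the channel layout used for the rearrangement is an assumed convention, and the gridding check is reduced to one dimension with three-tap kernels. The first function shows the rearrangement step that turns a coarse feature map with `d*d*L` channels into `L` full-resolution label maps. The second lists which input positions can influence one output unit after a stack of dilated convolutions, which makes the "gridding issue" visible.

```python
from itertools import product


def duc_rearrange(features, d):
    """Rearrange a (d*d*L, h, w) nested list into (L, h*d, w*d)."""
    channels = len(features)
    assert channels % (d * d) == 0
    num_labels = channels // (d * d)
    h, w = len(features[0]), len(features[0][0])
    out = [[[0.0] * (w * d) for _ in range(h * d)] for _ in range(num_labels)]
    for label, y, x, dy, dx in product(
        range(num_labels), range(h), range(w), range(d), range(d)
    ):
        # Assumed channel layout: label-major, then row offset, then column offset.
        src = label * d * d + dy * d + dx
        out[label][y * d + dy][x * d + dx] = features[src][y][x]
    return out


def reachable_offsets(rates):
    """1D input offsets reaching one output unit through stacked 3-tap dilated convs."""
    offsets = {0}
    for r in rates:
        offsets = {o + step for o in offsets for step in (-r, 0, r)}
    return sorted(offsets)


# Four channels at 1x1 resolution become one label map at 2x2 resolution.
print(duc_rearrange([[[1]], [[2]], [[3]], [[4]]], d=2))  # [[[1, 2], [3, 4]]]

# A constant dilation rate leaves holes; mixed rates cover every position.
print(reachable_offsets([2, 2, 2]))  # [-6, -4, -2, 0, 2, 4, 6]
print(reachable_offsets([1, 2, 3]))  # every integer from -6 to 6
```

With rates `[2, 2, 2]` only even offsets contribute, so neighbouring input positions never interact. Mixing rates such as `[1, 2, 3]` fills the same span without gaps, which is the motivation the abstract gives for a hybrid scheme.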