MultiDepth: Single-Image Depth Estimation via Multi-Task Regression and Classification
We introduce MultiDepth, a novel training strategy and convolutional neural
network (CNN) architecture that allows approaching single-image depth
estimation (SIDE) as a multi-task problem. SIDE is an important part of road
scene understanding and thus plays a vital role in advanced driver assistance
systems and autonomous vehicles. The best results for SIDE so far have
been achieved using deep CNNs. However, optimization of regression problems,
such as estimating depth, is still a challenging task. For the related tasks of
image classification and semantic segmentation, numerous CNN-based methods with
robust training behavior have been proposed. Hence, to overcome the
notorious instability and slow convergence of depth value regression during
training, MultiDepth makes use of depth interval classification as an auxiliary
task. The auxiliary task can be disabled at test time to predict continuous
depth values more efficiently using the main regression branch. We applied
MultiDepth to road scenes and present results on the KITTI depth prediction
dataset. Our experiments show that end-to-end multi-task learning with both
regression and classification considerably improves training and yields more
accurate results.

Comment: Accepted for presentation at the IEEE Intelligent Transportation
Systems Conference (ITSC) 201
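The auxiliary depth interval classification described above can be sketched as follows. The log-spaced bin layout, the loss weight `alpha`, the MSE stand-in for the regression loss, and all helper names are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def depth_to_interval(depth, d_min=1.0, d_max=80.0, n_bins=64):
    """Map continuous depth values to discrete interval indices
    (log-spaced bins, an assumed discretization)."""
    edges = np.logspace(np.log10(d_min), np.log10(d_max), n_bins + 1)
    idx = np.searchsorted(edges, depth, side="right") - 1
    return np.clip(idx, 0, n_bins - 1)

def multitask_loss(pred_depth, pred_logits, gt_depth, alpha=0.5, n_bins=64):
    """Combined loss: depth regression plus an auxiliary classification
    term over depth intervals, weighted by alpha."""
    reg = np.mean((pred_depth - gt_depth) ** 2)  # simple MSE stand-in
    labels = depth_to_interval(gt_depth, n_bins=n_bins)
    # softmax cross-entropy over the predicted interval logits
    z = pred_logits - pred_logits.max(axis=-1, keepdims=True)
    log_p = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    cls = -np.mean(log_p[np.arange(len(labels)), labels])
    return reg + alpha * cls
```

At test time only the regression branch is needed, so the classification term (and its head) can simply be dropped.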
Discriminative Training of Deep Fully-connected Continuous CRF with Task-specific Loss
Recent works on deep conditional random fields (CRF) have set new records on
many vision tasks involving structured predictions. Here we propose a
fully-connected deep continuous CRF model for both discrete and continuous
labelling problems. We exemplify the usefulness of the proposed model on
multi-class semantic labelling (discrete) and the robust depth estimation
(continuous) problems.
In our framework, we model both the unary and the pairwise potential
functions as deep convolutional neural networks (CNN), which are jointly
learned in an end-to-end fashion. The proposed method retains the main
advantage of continuously-valued CRFs: a closed-form solution for
maximum a posteriori (MAP) inference.
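For intuition, in an assumed toy setting with quadratic unary potentials pulling labels toward CNN outputs and quadratic pairwise smoothness potentials, the MAP estimate reduces to solving one linear system; this sketch is a generic continuous-CRF illustration, not the paper's model:

```python
import numpy as np

def crf_map(z, W, lam=1.0):
    """Closed-form MAP for E(y) = sum_i (y_i - z_i)^2
    + lam * sum_ij W_ij (y_i - y_j)^2.
    Setting the gradient to zero gives (I + lam * L) y = z,
    where L is the graph Laplacian of the pairwise weights W."""
    n = len(z)
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.solve(np.eye(n) + lam * L, z)
```

With zero pairwise weights the solution is just the unary prediction z; stronger pairwise weights smooth neighboring labels toward each other.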
To better adapt to different tasks, instead of using the commonly employed
maximum likelihood CRF parameter learning protocol, we propose task-specific
loss functions for learning the CRF parameters.
This enables direct optimization of the quality of the MAP estimates during
learning.
Specifically, we optimize the multi-class classification loss for the
semantic labelling task and Tukey's biweight loss for the robust depth
estimation problem.
Experimental results on the semantic labelling and robust depth estimation
tasks demonstrate that the proposed method compares favorably against both
baseline and state-of-the-art methods.
In particular, we show that although the proposed deep CRF model is
continuously valued, it achieves impressive results even on discrete
labelling tasks when equipped with task-specific losses.
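Tukey's biweight loss, the robust loss named above for depth estimation, can be written down directly; the tuning constant c = 4.685 is the standard choice from robust statistics and is an assumption here, not a value taken from the paper:

```python
import numpy as np

def tukey_biweight(residual, c=4.685):
    """Tukey's biweight loss: approximately quadratic near zero but
    saturating at c**2 / 6 for |residual| > c, so outliers (e.g. noisy
    depth labels) contribute only a bounded penalty."""
    r = np.abs(np.asarray(residual, dtype=float))
    inside = (c ** 2 / 6.0) * (1.0 - (1.0 - (r / c) ** 2) ** 3)
    return np.where(r <= c, inside, c ** 2 / 6.0)
```

The bounded tail is what makes the depth regression robust: a grossly wrong ground-truth pixel cannot dominate the gradient the way it would under a squared loss.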
Driving Scene Perception Network: Real-time Joint Detection, Depth Estimation and Semantic Segmentation
The demand for high-level autonomous driving has increased in recent years,
and visual perception is one of the critical capabilities for enabling fully
autonomous driving. In this paper, we introduce an efficient approach for
simultaneous object detection, depth estimation and pixel-level semantic
segmentation using a shared convolutional architecture. The proposed network
model, which we named Driving Scene Perception Network (DSPNet), uses
multi-level feature maps and multi-task learning to improve the accuracy and
efficiency of object detection, depth estimation and image segmentation tasks
from a single input image. The resulting network model uses less than
850 MiB of GPU memory and achieves 14.0 fps on an NVIDIA GeForce GTX 1080 with
a 1024x512 input image, and both precision and efficiency are improved over a
combination of single-task models.

Comment: 9 pages, 7 figures, WACV'1