Deep Projective 3D Semantic Segmentation
Semantic segmentation of 3D point clouds is a challenging problem with
numerous real-world applications. While deep learning has revolutionized the
field of image semantic segmentation, its impact on point cloud data has been
limited so far. Recent attempts, based on 3D deep learning approaches
(3D-CNNs), have achieved below-expected results. Such methods require
voxelizations of the underlying point cloud data, leading to decreased spatial
resolution and increased memory consumption. Additionally, 3D-CNNs greatly
suffer from the limited availability of annotated datasets.
In this paper, we propose an alternative framework that avoids the
limitations of 3D-CNNs. Instead of directly solving the problem in 3D, we first
project the point cloud onto a set of synthetic 2D-images. These images are
then used as input to a 2D-CNN, designed for semantic segmentation. Finally,
the obtained prediction scores are re-projected to the point cloud to obtain
the segmentation results. We further investigate the impact of multiple
modalities, such as color, depth and surface normals, in a multi-stream network
architecture. Experiments are performed on the recent Semantic3D dataset. Our
approach sets a new state of the art, achieving a relative gain of 7.9%
over the previous best approach.
Comment: Submitted to CAIP 201
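The project-then-re-project idea described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the pinhole camera at the origin, the nearest-point (z-buffer) visibility rule, the summing of scores across views, and the names `project_points` and `reproject_scores` are all assumptions made for illustration. The per-pixel class scores stand in for the output of a 2D segmentation network.

```python
def project_points(points, focal, width, height):
    """Project 3D points onto an image plane (assumed pinhole camera at the
    origin looking down +z). Returns {(u, v): index of nearest visible point}."""
    nearest = {}  # pixel -> (depth, point index)
    for idx, (x, y, z) in enumerate(points):
        if z <= 0:
            continue  # point is behind the camera
        u = int(round(focal * x / z + width / 2))
        v = int(round(focal * y / z + height / 2))
        if 0 <= u < width and 0 <= v < height:
            # Keep only the closest point per pixel (simple z-buffer).
            if (u, v) not in nearest or z < nearest[(u, v)][0]:
                nearest[(u, v)] = (z, idx)
    return {pixel: idx for pixel, (_, idx) in nearest.items()}


def reproject_scores(views, num_points, num_classes):
    """Accumulate per-pixel class scores back onto the points that produced
    each pixel, then label every point by its highest total score.

    `views` is a list of (pixel_to_point, pixel_scores) pairs, one per
    synthetic image. Points never seen in any view get the label None.
    Summing scores across views is an assumption of this sketch.
    """
    totals = [[0.0] * num_classes for _ in range(num_points)]
    seen = set()
    for pixel_to_point, pixel_scores in views:
        for pixel, idx in pixel_to_point.items():
            seen.add(idx)
            for cls, score in enumerate(pixel_scores[pixel]):
                totals[idx][cls] += score
    return [
        max(range(num_classes), key=totals[i].__getitem__) if i in seen else None
        for i in range(num_points)
    ]


# Toy example: point 1 lies behind point 0 along the same ray, so it is occluded.
points = [(0.0, 0.0, 2.0), (0.0, 0.0, 5.0), (1.0, 0.0, 2.0)]
pixel_to_point = project_points(points, focal=10.0, width=32, height=32)

# Hypothetical 2D-network output: class scores for each occupied pixel.
pixel_scores = {(16, 16): [0.9, 0.1], (21, 16): [0.2, 0.8]}
labels = reproject_scores([(pixel_to_point, pixel_scores)], len(points), 2)
```

In this toy setup the occluded point receives no label from the single view, which hints at why a set of images from different viewpoints is used: each additional view can cover points hidden in the others.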
Instance-Level Salient Object Segmentation
Image saliency detection has recently witnessed rapid progress due to deep
convolutional neural networks. However, none of the existing methods is able to
identify object instances in the detected salient regions. In this paper, we
present a salient instance segmentation method that produces a saliency mask
with distinct object instance labels for an input image. Our method consists of
three steps: estimating a saliency map, detecting salient object contours, and
identifying salient object instances. For the first two steps, we propose a
multiscale saliency refinement network, which generates high-quality salient
region masks and salient object contours. Once integrated with multiscale
combinatorial grouping and a MAP-based subset optimization framework, our
method can generate very promising salient object instance segmentation
results. To promote further research and evaluation of salient instance
segmentation, we also construct a new database of 1000 images and their
pixelwise salient instance annotations. Experimental results demonstrate that
our proposed method is capable of achieving state-of-the-art performance on all
public benchmarks for salient region detection as well as on our new dataset
for salient instance segmentation.
Comment: To appear in CVPR201
Forecasting Hands and Objects in Future Frames
This paper presents an approach to forecast future presence and location of
human hands and objects. Given an image frame, the goal is to predict what
objects will appear in a future frame (e.g., 5 seconds later) and where they
will be located, even when they are not visible in the current frame. The
key idea is that (1) an intermediate representation of a convolutional object
recognition model abstracts scene information in its frame and that (2) we can
predict (i.e., regress) such representations corresponding to the future frames
based on that of the current frame. We design a new two-stream convolutional
neural network (CNN) architecture for videos by extending the state-of-the-art
convolutional object detection network, and present a new fully convolutional
regression network for predicting future scene representations. Our experiments
confirm that combining the regressed future representation with our detection
network allows reliable estimation of future hands and objects in videos. We
obtain much higher accuracy compared to the state-of-the-art future object
presence forecast method on a public dataset.