Saliency-guided Adaptive Seeding for Supervoxel Segmentation
We propose a new saliency-guided method for generating supervoxels in 3D
space. Rather than using an evenly distributed spatial seeding procedure, our
method uses visual saliency to guide the process of supervoxel generation. This
results in densely distributed, small, and precise supervoxels in salient
regions which often contain objects, and larger supervoxels in less salient
regions that often correspond to background. Our approach largely improves the
quality of the resulting supervoxel segmentation in terms of boundary recall
and under-segmentation error on publicly available benchmarks.
Comment: 6 pages, accepted to IROS201
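The abstract above does not spell out how saliency steers the seeding, so the following is only a minimal sketch of one plausible reading: seed voxels are drawn with probability proportional to a saliency volume, which places more (and hence smaller) supervoxels in salient regions. The function name `sample_seeds`, the `floor` parameter, and the toy volume are assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

def sample_seeds(saliency, n_seeds, seed=0, floor=1e-3):
    """Draw distinct seed voxels with probability proportional to saliency.

    `floor` (an assumed parameter) adds a small constant weight so that
    non-salient background voxels can still receive a few seeds.
    """
    rng = np.random.default_rng(seed)
    weights = saliency.astype(np.float64).ravel() + floor
    probs = weights / weights.sum()
    # Sample flat voxel indices without replacement, weighted by saliency.
    flat = rng.choice(weights.size, size=n_seeds, replace=False, p=probs)
    # Convert flat indices back to (x, y, z) coordinates, one row per seed.
    return np.stack(np.unravel_index(flat, saliency.shape), axis=1)

# Toy volume: a salient cube embedded in a non-salient background.
saliency = np.zeros((20, 20, 20))
saliency[5:10, 5:10, 5:10] = 1.0

seeds = sample_seeds(saliency, n_seeds=50)
inside = np.all((seeds >= 5) & (seeds < 10), axis=1)
print(f"{inside.sum()} of {len(seeds)} seeds fall inside the salient cube")
```

A uniform grid would put only about 1.6% of the seeds in the cube (125 of 8,000 voxels); weighting by saliency concentrates most of them there, which is the effect the abstract describes.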
Sparse-to-Dense: Depth Prediction from Sparse Depth Samples and a Single Image
We consider the problem of dense depth prediction from a sparse set of depth
measurements and a single RGB image. Since depth estimation from monocular
images alone is inherently ambiguous and unreliable, to attain a higher level
of robustness and accuracy, we introduce additional sparse depth samples, which
are either acquired with a low-resolution depth sensor or computed via visual
Simultaneous Localization and Mapping (SLAM) algorithms. We propose the use of
a single deep regression network to learn directly from the RGB-D raw data, and
explore the impact of number of depth samples on prediction accuracy. Our
experiments show that, compared to using only RGB images, the addition of 100
spatially random depth samples reduces the prediction root-mean-square error by
50% on the NYU-Depth-v2 indoor dataset. It also boosts the percentage of
reliable prediction from 59% to 92% on the KITTI dataset. We demonstrate two
applications of the proposed algorithm: a plug-in module in SLAM to convert
sparse maps to dense maps, and super-resolution for LiDARs. Software and video
demonstration are publicly available.
Comment: accepted to ICRA 2018. 8 pages, 8 figures, 3 tables. Video at
https://www.youtube.com/watch?v=vNIIT_M7x7Y. Code at
https://github.com/fangchangma/sparse-to-dens
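To make the experimental setup in the abstract above concrete, here is a minimal sketch of how a sparse depth input could be simulated from a dense depth map and how root-mean-square error is computed. It is not taken from the authors' code: the helper names, the convention that zero marks a missing depth value, and the toy image size are all assumptions for illustration.

```python
import numpy as np

def sample_sparse_depth(depth, n_samples, seed=0):
    """Keep n_samples randomly chosen valid depth pixels and zero the rest.

    Assumes depth == 0 marks a missing measurement.
    """
    rng = np.random.default_rng(seed)
    valid = np.flatnonzero(depth > 0)
    keep = rng.choice(valid, size=min(n_samples, valid.size), replace=False)
    sparse = np.zeros_like(depth)
    sparse.flat[keep] = depth.flat[keep]
    return sparse

def rmse(pred, target):
    """Root-mean-square error over pixels with valid ground-truth depth."""
    mask = target > 0
    return float(np.sqrt(np.mean((pred[mask] - target[mask]) ** 2)))

# Toy data standing in for one RGB image and its dense depth map.
rng = np.random.default_rng(1)
rgb = rng.uniform(0.0, 1.0, size=(48, 64, 3))
depth = rng.uniform(0.5, 10.0, size=(48, 64))

sparse = sample_sparse_depth(depth, n_samples=100)
# Stack RGB and sparse depth into a 4-channel array, the kind of RGB-D
# input a regression network could consume.
rgbd = np.concatenate([rgb, sparse[..., None]], axis=2)

print(rgbd.shape, np.count_nonzero(sparse))
print(rmse(depth + 0.5, depth))  # a constant 0.5 m offset gives an RMSE of about 0.5
```

Varying `n_samples` in a setup like this is how the effect of the number of depth samples on accuracy could be studied, as the abstract describes.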
Deep Depth Completion of a Single RGB-D Image
The goal of our work is to complete the depth channel of an RGB-D image.
Commodity-grade depth cameras often fail to sense depth for shiny, bright,
transparent, and distant surfaces. To address this problem, we train a deep
network that takes an RGB image as input and predicts dense surface normals and
occlusion boundaries. Those predictions are then combined with raw depth
observations provided by the RGB-D camera to solve for depths for all pixels,
including those missing in the original observation. This method was chosen
over others (e.g., inpainting depths directly) as the result of extensive
experiments with a new depth completion benchmark dataset, where holes are
filled in training data through the rendering of surface reconstructions
created from multiview RGB-D scans. Experiments with different network inputs,
depth representations, loss functions, optimization methods, inpainting
methods, and deep depth estimation networks show that our proposed approach
provides better depth completions than these alternatives.
Comment: Accepted by CVPR2018 (Spotlight). Project webpage:
http://deepcompletion.cs.princeton.edu/ This version includes supplementary
materials which provide more implementation details, quantitative evaluation,
and qualitative results. Due to file size limit, please check project website
for high-res pape
3D point cloud video segmentation oriented to the analysis of interactions
Given the widespread availability of point cloud data from consumer depth sensors, 3D point cloud segmentation has become a promising building block for high-level applications such as scene understanding and interaction analysis. It benefits from the richer information contained in real-world 3D data compared to 2D images. This also implies that the classical color segmentation challenges have shifted to RGBD data, and new challenges have emerged because the depth information is usually noisy, sparse and unorganized. Meanwhile, the lack of 3D point cloud ground truth labeling limits the development and comparison of methods for 3D point cloud segmentation. In this paper, we present two contributions: a novel graph-based point cloud segmentation method for RGBD stream data with interacting objects, and a new ground truth labeling for a previously published data set. This data set focuses on interaction (merge and split between ’object’ point clouds), which differentiates it from the few existing labeled RGBD data sets, which are more oriented to Simultaneous Localization And Mapping (SLAM) tasks. The proposed point cloud segmentation method is evaluated with the 3D point cloud ground truth labeling. Experiments show promising results for our approach.
Postprint (published version
Hallucinating dense optical flow from sparse lidar for autonomous vehicles
In this paper we propose a novel approach to estimate dense optical flow from sparse lidar data acquired on an autonomous vehicle. This is intended to be used as a drop-in replacement for any image-based optical flow system when images are not reliable, e.g. due to adverse weather conditions or at night. In order to infer high-resolution 2D flows from discrete range data, we devise a three-block architecture of multiscale filters that combines multiple intermediate objectives, both in the lidar and image domains. To train this network we introduce a dataset with approximately 20K lidar samples from the KITTI dataset, which we have augmented with a pseudo ground-truth image-based optical flow computed using FlowNet2. We demonstrate the effectiveness of our approach on KITTI, and show that despite using the low-resolution and sparse measurements of the lidar, we can regress dense optical flow maps which are on par with those estimated with image-based methods.
Peer Reviewed
Postprint (author's final draft