2,102 research outputs found
RGBD Datasets: Past, Present and Future
Since the launch of the Microsoft Kinect, scores of RGBD datasets have been
released. These have propelled advances in areas from reconstruction to gesture
recognition. In this paper we explore the field, reviewing datasets across
eight categories: semantics, object pose estimation, camera tracking, scene
reconstruction, object tracking, human actions, faces and identification. By
extracting relevant information in each category we help researchers to find
appropriate data for their needs, and we consider which datasets have succeeded
in driving computer vision forward and why.
Finally, we examine the future of RGBD datasets. We identify key areas which
are currently underexplored, and suggest that future directions may include
synthetic data and dense reconstructions of static and dynamic scenes.Comment: 8 pages excluding references (CVPR style
Cumulative object categorization in clutter
In this paper we present an approach based on scene- or part-graphs for geometrically categorizing touching and
occluded objects. We use additive RGBD feature descriptors and hashing of graph configuration parameters for describing the spatial arrangement of constituent parts. The presented experiments quantify that this method outperforms our earlier part-voting and sliding window classification. We evaluated our approach on cluttered scenes, and by using a 3D dataset containing over 15000 Kinect scans of over 100 objects which were grouped into general geometric categories. Additionally, color, geometric, and combined features were compared for categorization tasks
A General Framework for Flexible Multi-Cue Photometric Point Cloud Registration
The ability to build maps is a key functionality for the majority of mobile
robots. A central ingredient to most mapping systems is the registration or
alignment of the recorded sensor data. In this paper, we present a general
methodology for photometric registration that can deal with multiple different
cues. We provide examples for registering RGBD as well as 3D LIDAR data. In
contrast to popular point cloud registration approaches such as ICP our method
does not rely on explicit data association and exploits multiple modalities
such as raw range and image data streams. Color, depth, and normal information
are handled in an uniform manner and the registration is obtained by minimizing
the pixel-wise difference between two multi-channel images. We developed a
flexible and general framework and implemented our approach inside that
framework. We also released our implementation as open source C++ code. The
experiments show that our approach allows for an accurate registration of the
sensor data without requiring an explicit data association or model-specific
adaptations to datasets or sensors. Our approach exploits the different cues in
a natural and consistent way and the registration can be done at framerate for
a typical range or imaging sensor.Comment: 8 page
OctNetFusion: Learning Depth Fusion from Data
In this paper, we present a learning based approach to depth fusion, i.e.,
dense 3D reconstruction from multiple depth images. The most common approach to
depth fusion is based on averaging truncated signed distance functions, which
was originally proposed by Curless and Levoy in 1996. While this method is
simple and provides great results, it is not able to reconstruct (partially)
occluded surfaces and requires a large number frames to filter out sensor noise
and outliers. Motivated by the availability of large 3D model repositories and
recent advances in deep learning, we present a novel 3D CNN architecture that
learns to predict an implicit surface representation from the input depth maps.
Our learning based method significantly outperforms the traditional volumetric
fusion approach in terms of noise reduction and outlier suppression. By
learning the structure of real world 3D objects and scenes, our approach is
further able to reconstruct occluded regions and to fill in gaps in the
reconstruction. We demonstrate that our learning based approach outperforms
both vanilla TSDF fusion as well as TV-L1 fusion on the task of volumetric
fusion. Further, we demonstrate state-of-the-art 3D shape completion results.Comment: 3DV 2017, https://github.com/griegler/octnetfusio
- …