14,014 research outputs found
Frustum PointNets for 3D Object Detection from RGB-D Data
In this work, we study 3D object detection from RGB-D data in both indoor and
outdoor scenes. While previous methods focus on images or 3D voxels, often
obscuring natural 3D patterns and invariances of 3D data, we directly operate
on raw point clouds by popping up RGB-D scans. However, a key challenge of this
approach is how to efficiently localize objects in point clouds of large-scale
scenes (region proposal). Instead of solely relying on 3D proposals, our method
leverages both mature 2D object detectors and advanced 3D deep learning for
object localization, achieving efficiency as well as high recall for even small
objects. Benefiting from learning directly on raw point clouds, our method is
also able to precisely estimate 3D bounding boxes even under strong occlusion
or with very sparse points. Evaluated on KITTI and SUN RGB-D 3D detection
benchmarks, our method outperforms the state of the art by remarkable margins
while having real-time capability. Comment: 15 pages, 12 figures, 14 tables
T-LESS: An RGB-D Dataset for 6D Pose Estimation of Texture-less Objects
We introduce T-LESS, a new public dataset for estimating the 6D pose, i.e.
translation and rotation, of texture-less rigid objects. The dataset features
thirty industry-relevant objects with no significant texture and no
discriminative color or reflectance properties. The objects exhibit symmetries
and mutual similarities in shape and/or size. Compared to other datasets, a
unique property is that some of the objects are parts of others. The dataset
includes training and test images that were captured with three synchronized
sensors, specifically a structured-light and a time-of-flight RGB-D sensor and
a high-resolution RGB camera. There are approximately 39K training and 10K test
images from each sensor. Additionally, two types of 3D models are provided for
each object, i.e. a manually created CAD model and a semi-automatically
reconstructed one. Training images depict individual objects against a black
background. Test images originate from twenty test scenes having varying
complexity, which increases from simple scenes with several isolated objects to
very challenging ones with multiple instances of several objects and with a
high amount of clutter and occlusion. The images were captured from a
systematically sampled view sphere around the object/scene, and are annotated
with accurate ground truth 6D poses of all modeled objects. Initial evaluation
results indicate that the state of the art in 6D object pose estimation has
ample room for improvement, especially in difficult cases with significant
occlusion. The T-LESS dataset is available online at cmp.felk.cvut.cz/t-less. Comment: WACV 201
RGB-D datasets using Microsoft Kinect or similar sensors: a survey
RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle, and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image and video datasets dedicated to various applications have become available, and they are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications, including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset and compare the popularity and difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in selecting suitable datasets for evaluating their algorithms.
- …