RGB-D datasets using Microsoft Kinect or similar sensors: a survey
RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, and these are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.
PetroSurf3D - A Dataset for high-resolution 3D Surface Segmentation
The development of powerful 3D scanning hardware and reconstruction
algorithms has strongly promoted the generation of 3D surface reconstructions
in different domains. An area of special interest for such 3D reconstructions
is the cultural heritage domain, where surface reconstructions are generated to
digitally preserve historical artifacts. While reconstruction quality nowadays
is sufficient in many cases, the robust analysis (e.g. segmentation, matching,
and classification) of reconstructed 3D data is still an open topic. In this
paper, we target the automatic and interactive segmentation of high-resolution
3D surface reconstructions from the archaeological domain. To foster research
in this field, we introduce a fully annotated and publicly available
large-scale 3D surface dataset including high-resolution meshes, depth maps and
point clouds as a novel benchmark dataset to the community. We provide baseline
results for our existing random forest-based approach and for the first time
investigate segmentation with convolutional neural networks (CNNs) on the data.
Results show that both approaches have complementary strengths and weaknesses
and that the provided dataset represents a challenge for future research.
Comment: CBMI Submission; Dataset and more information can be found at
http://lrs.icg.tugraz.at/research/petroglyphsegmentation
Autonomous Robot Navigation with Rich Information Mapping in Nuclear Storage Environments
This paper presents our method for an unmanned ground
vehicle (UGV) to perform inspection tasks in nuclear environments using rich
information maps. To reduce inspectors' exposure to elevated radiation levels,
an autonomous navigation framework for the UGV has been developed to perform
routine inspections such as counting containers, recording their ID tags and
performing gamma measurements on some of them. In order to achieve autonomy, a
rich information map is generated which includes not only the 2D global cost
map consisting of obstacle locations for path planning, but also the location
and orientation information for the objects of interest from the inspector's
perspective. The UGV's autonomy framework uses this information to
prioritize the locations it navigates to when performing inspections. In this paper, we
present our method of generating this rich information map, originally
developed to meet the requirements of the International Atomic Energy Agency
(IAEA) Robotics Challenge. We demonstrate the performance of our method in a
simulated testbed environment containing uranium hexafluoride (UF6) storage
container mock-ups.
Advances in Data-Driven Analysis and Synthesis of 3D Indoor Scenes
This report surveys advances in deep learning-based modeling techniques that
address four different 3D indoor scene analysis tasks, as well as synthesis of
3D indoor scenes. We describe different kinds of representations for indoor
scenes, various indoor scene datasets available for research in the
aforementioned areas, and discuss notable works employing machine learning
models for such scene modeling tasks based on these representations.
Specifically, we focus on the analysis and synthesis of 3D indoor scenes. With
respect to analysis, we focus on four basic scene understanding tasks -- 3D
object detection, 3D scene segmentation, 3D scene reconstruction and 3D scene
similarity. For synthesis, we mainly discuss neural scene synthesis works,
while also highlighting model-driven methods that allow for human-centric,
progressive scene synthesis. We identify the challenges involved in modeling
scenes for these tasks and the kind of machinery that needs to be developed to
adapt to the data representation, and the task setting in general. For each of
these tasks, we provide a comprehensive summary of the state-of-the-art works
across different axes such as the choice of data representation, backbone,
evaluation metric, input, output, etc., providing an organized review of the
literature. Towards the end, we discuss some interesting research directions
that have the potential to make a direct impact on the way users interact and
engage with these virtual scene models, making them an integral part of the
metaverse.
Comment: Published in Computer Graphics Forum, Aug 202
Pointwise Convolutional Neural Networks
Deep learning with 3D data such as reconstructed point clouds and CAD models
has received great research interest recently. However, the use of
point clouds with convolutional neural networks has so far not been fully
explored. In this paper, we present a convolutional neural network for semantic
segmentation and object recognition with 3D point clouds. At the core of our
network is pointwise convolution, a new convolution operator that can be
applied at each point of a point cloud. Our fully convolutional network design,
while being surprisingly simple to implement, can yield competitive accuracy in
both semantic segmentation and object recognition tasks.
Comment: 10 pages, 6 figures, 10 tables. Paper accepted to CVPR 201