Cumulative object categorization in clutter
In this paper we present an approach based on scene- or part-graphs for geometrically categorizing touching and
occluded objects. We use additive RGBD feature descriptors and hashing of graph configuration parameters to describe the spatial arrangement of constituent parts. The presented experiments show quantitatively that this method outperforms our earlier part-voting and sliding-window classification. We evaluated our approach on cluttered scenes and on a 3D dataset containing over 15,000 Kinect scans of over 100 objects, which were grouped into general geometric categories. Additionally, color, geometric, and combined features were compared for categorization tasks.
RGBD Datasets: Past, Present and Future
Since the launch of the Microsoft Kinect, scores of RGBD datasets have been
released. These have propelled advances in areas from reconstruction to gesture
recognition. In this paper we explore the field, reviewing datasets across
eight categories: semantics, object pose estimation, camera tracking, scene
reconstruction, object tracking, human actions, faces and identification. By
extracting relevant information in each category we help researchers to find
appropriate data for their needs, and we consider which datasets have succeeded
in driving computer vision forward and why.
Finally, we examine the future of RGBD datasets. We identify key areas which
are currently underexplored, and suggest that future directions may include
synthetic data and dense reconstructions of static and dynamic scenes.
Comment: 8 pages excluding references (CVPR style)
Recognizing Objects In-the-wild: Where Do We Stand?
The ability to recognize objects is an essential skill for a robotic system
acting in human-populated environments. Despite decades of effort from the
robotic and vision research communities, robots are still missing good visual
perceptual systems, preventing the use of autonomous agents for real-world
applications. Progress is slowed by the lack of a testbed able to
accurately represent the world perceived by the robot in-the-wild. In order to
fill this gap, we introduce a large-scale, multi-view object dataset collected
with an RGB-D camera mounted on a mobile robot. The dataset embeds the
challenges faced by a robot in a real-life application and provides a useful
tool for validating object recognition algorithms. Besides describing the
characteristics of the dataset, the paper evaluates the performance of a
collection of well-established deep convolutional networks on the new dataset
and analyzes the transferability of deep representations from Web images to
robotic data. Despite the promising results obtained with such representations,
the experiments demonstrate that object classification with real-life robotic
data is far from being solved. Finally, we provide a comparative study to
analyze and highlight the open challenges in robot vision, explaining the
discrepancies in the performance.