303 research outputs found

LabelFusion: A Pipeline for Generating Ground Truth Labels for Real RGBD Data of Cluttered Scenes

Author: Florence Peter R.
Manuelli Lucas
Marion Pat
Tedrake Russ
Publication date: 26/09/2017

Deep neural network (DNN) architectures have been shown to outperform traditional pipelines for object segmentation and pose estimation using RGBD data, but the performance of these DNN pipelines is directly tied to how representative the training data is of the true data. Hence a key requirement for employing these methods in practice is to have a large set of labeled data for your specific robotic manipulation task, a requirement that is not generally satisfied by existing datasets. In this paper we develop a pipeline to rapidly generate high-quality RGBD data with pixelwise labels and object poses. We use an RGBD camera to collect video of a scene from multiple viewpoints and leverage existing reconstruction techniques to produce a dense 3D reconstruction. We label the 3D reconstruction using human-assisted ICP fitting of object meshes. By reprojecting the results of labeling the 3D scene, we can produce labels for each RGBD image of the scene. This pipeline enabled us to collect over 1,000,000 labeled object instances in just a few days. We use this dataset to answer questions related to how much training data is required, and of what quality the data must be, to achieve high performance from a DNN architecture.
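
The reprojection step described in the abstract above can be illustrated with a minimal sketch. It is not the LabelFusion implementation: it assumes a simple pinhole camera, a labeled 3D point set standing in for the labeled reconstruction, and a known world-to-camera pose for the frame. The function name project_labels and all parameter names are hypothetical and chosen for illustration only.

```python
from typing import Dict, Sequence, Tuple

Pixel = Tuple[int, int]


def project_labels(
    points_world: Sequence[Sequence[float]],
    labels: Sequence[int],
    rotation: Sequence[Sequence[float]],   # assumed 3x3 world-to-camera rotation
    translation: Sequence[float],          # assumed world-to-camera translation
    fx: float, fy: float, cx: float, cy: float,  # assumed pinhole intrinsics
    width: int, height: int,
) -> Dict[Pixel, int]:
    """Return {(u, v): label} for the labeled 3D points visible in one frame."""
    nearest_depth: Dict[Pixel, float] = {}
    label_at: Dict[Pixel, int] = {}
    for point, label in zip(points_world, labels):
        # Transform the point from world to camera coordinates: p_cam = R p + t.
        x, y, z = [
            sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
            for r in range(3)
        ]
        if z <= 0:
            continue  # behind the camera, so not visible
        # Pinhole projection to integer pixel coordinates.
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if not (0 <= u < width and 0 <= v < height):
            continue  # outside the image bounds
        # Keep only the closest point per pixel (a simple depth test).
        if (u, v) not in nearest_depth or z < nearest_depth[(u, v)]:
            nearest_depth[(u, v)] = z
            label_at[(u, v)] = label
    return label_at
```

Running this once per camera pose would give a sparse per-pixel label map for each frame. A real pipeline would render labeled meshes rather than individual points, so that the resulting masks are dense.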

arXiv.org e-Print Archive

Crossref

RGB-D datasets using Microsoft Kinect or similar sensors: a survey

Author: Galili
Guan
Hu
Kolner
Mulvad
Nakazawa
Palushani
Palushani
Publication venue: Springer
Publication date: 01/01/2015

RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.

Northumbria Research Link

Crossref

Springer - Publisher Connector

Online Research Database In Technology

Co-Fusion: Real-time Segmentation, Tracking and Fusion of Multiple Objects

Author: Agapito Lourdes
Rünz Martin
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 20/06/2017

In this paper we introduce Co-Fusion, a dense SLAM system that takes a live stream of RGB-D images as input and segments the scene into different objects (using either motion or semantic cues) while simultaneously tracking and reconstructing their 3D shape in real time. We use a multiple model fitting approach where each object can move independently from the background and still be effectively tracked and its shape fused over time using only the information from pixels associated with that object label. Previous attempts to deal with dynamic scenes have typically considered moving regions as outliers, and consequently do not model their shape or track their motion over time. In contrast, we enable the robot to maintain 3D models for each of the segmented objects and to improve them over time through fusion. As a result, our system can enable a robot to maintain a scene description at the object level which has the potential to allow interactions with its working environment, even in the case of dynamic scenes. Comment: International Conference on Robotics and Automation (ICRA) 2017, http://visual.cs.ucl.ac.uk/pubs/cofusion, https://github.com/martinruenz/co-fusio

arXiv.org e-Print Archive

Crossref

UCL Discovery

Deformable Objects for Virtual Environments

Author: Taylor Catherine
Publication date: 11/10/2021

OPUS

Cumulative object categorization in clutter

Author: Balint-Benczedi Ferenc
Beetz Michael
Martinez Mozos Oscar
Marton Zoltan-Csaba
Pangercic Dejan
Publication venue: ACIN: Automation and Control Institute, University of Technology, Vienna, Austria
Publication date: 27/06/2013

In this paper we present an approach based on scene- or part-graphs for geometrically categorizing touching and occluded objects. We use additive RGBD feature descriptors and hashing of graph configuration parameters for describing the spatial arrangement of constituent parts. The presented experiments quantify that this method outperforms our earlier part-voting and sliding window classification. We evaluated our approach on cluttered scenes, and by using a 3D dataset containing over 15000 Kinect scans of over 100 objects which were grouped into general geometric categories. Additionally, color, geometric, and combined features were compared for categorization tasks.

University of Lincoln Institutional Repository

Institute of Transport Research:Publications

3D Shape Recovery of Deformable Soft-tissue with Computed Tomography and Depth Scan

Author: Dissanayake G
Huang S
Song JW
Wang J
Zhao L
Publication venue: Fakultas Teknologi Kedirgantaraan
Publication date: 01/01/2016

Knowing the tissue environment accurately is very important in minimally invasive surgery (MIS). However, because soft tissue is deformable, reconstructing the soft-tissue environment is challenging. This paper proposes a new framework for recovering the deformation of soft tissue using a single depth sensor. The framework takes the morphology of the soft tissue from X-ray computed tomography and deforms it with the embedded deformation method. The key is to build a distance field function of the scan from the depth sensor, which can be used to perform accurate model-to-scan deformation together with robust non-rigid shape registration in a single step. Simulations show that the soft-tissue shape from the previous step can be efficiently deformed to fit the partially observed scan in the current step using the proposed method. The results from the simulated sequential deformation of three different soft tissues demonstrate the potential clinical value for MIS.

OPUS - University of Technology Sydney