T-LESS: An RGB-D Dataset for 6D Pose Estimation of Texture-less Objects
We introduce T-LESS, a new public dataset for estimating the 6D pose, i.e.
translation and rotation, of texture-less rigid objects. The dataset features
thirty industry-relevant objects with no significant texture and no
discriminative color or reflectance properties. The objects exhibit symmetries
and mutual similarities in shape and/or size. Compared to other datasets, a
unique property is that some of the objects are parts of others. The dataset
includes training and test images that were captured with three synchronized
sensors, specifically a structured-light and a time-of-flight RGB-D sensor and
a high-resolution RGB camera. There are approximately 39K training and 10K test
images from each sensor. Additionally, two types of 3D models are provided for
each object, i.e. a manually created CAD model and a semi-automatically
reconstructed one. Training images depict individual objects against a black
background. Test images originate from twenty test scenes having varying
complexity, which increases from simple scenes with several isolated objects to
very challenging ones with multiple instances of several objects and with a
high amount of clutter and occlusion. The images were captured from a
systematically sampled view sphere around the object/scene, and are annotated
with accurate ground truth 6D poses of all modeled objects. Initial evaluation
results indicate that the state of the art in 6D object pose estimation has
ample room for improvement, especially in difficult cases with significant
occlusion. The T-LESS dataset is available online at cmp.felk.cvut.cz/t-less.
Comment: WACV 2017
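The abstract defines a 6D pose as a translation plus a rotation. As a minimal illustration (not part of the T-LESS toolkit), the following sketch shows the two error measures that naturally follow from this definition: the Euclidean distance between translations and the angular distance between rotation matrices.

```python
import numpy as np

def rotation_error_deg(R_est, R_gt):
    """Angular distance in degrees between two 3x3 rotation matrices."""
    cos = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))  # clip guards numeric drift

def translation_error(t_est, t_gt):
    """Euclidean distance between two translation vectors."""
    return float(np.linalg.norm(np.asarray(t_est, float) - np.asarray(t_gt, float)))

# Toy example: estimate rotated 10 degrees about z, translated 5 units off.
theta = np.radians(10.0)
R_gt = np.eye(3)
R_est = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
```

Note that for symmetric objects such as those in T-LESS, a plain angular distance over-penalizes poses that are visually equivalent; published evaluations use symmetry-aware metrics instead.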
Human Detection by Fourier descriptors and Fuzzy Color Histograms with Fuzzy c-means method
It is difficult to use histograms of oriented gradients (HOG) or other gradient-based features to detect persons in outdoor environments when the background or scale undergoes considerable changes. This study segmented depth images and extracted P-type Fourier descriptors as shape features from the two-dimensional coordinates of contours in the segmented regions. From the P-type Fourier descriptors, a person detector was created with the fuzzy c-means method (general person detection). Furthermore, a fuzzy color histogram was extracted as a color feature from the RGB values of the region surface. From the fuzzy color histogram, a detector of a person wearing specific clothes was created with the fuzzy c-means method (specific person detection). The study has the following characteristics: 1) The general person detection requires fewer training images and is more robust to scale changes than HOG-based methods. 2) The specific person detection yields results closer to human color perception than color indices such as RGB or CIEDE. The method was applied in a person-search application at the Tsukuba Challenge, and the results confirmed its effectiveness. A part of the study was financially supported by the Promotion Grant for Higher Education and Research 2014 at Kansai University under the title "Tsukuba Challenge and RoboCup @ Home."
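Both detectors above rely on fuzzy c-means clustering, which assigns each sample a graded membership to every cluster rather than a hard label. The following is a minimal, self-contained sketch of the standard fuzzy c-means update (alternating center and membership updates); it is not the paper's implementation, and the fuzzifier `m` and iteration count are illustrative defaults.

```python
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, iters=100, seed=0):
    """Minimal fuzzy c-means. Returns (centers, membership matrix U).

    X: (n, d) data; c: number of clusters; m > 1: fuzzifier.
    U[i, k] is the membership of sample i in cluster k; rows sum to 1.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)              # normalize initial memberships
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]   # weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                      # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)   # standard membership update
    return centers, U

# Toy usage: two well-separated 1-D blobs at 0 and 10.
X = np.vstack([np.zeros((20, 1)), np.full((20, 1), 10.0)])
centers, U = fuzzy_cmeans(X, c=2)
```

The soft memberships are what make the method attractive for the paper's setting: a contour or color histogram that sits between two learned prototypes is not forced into either class.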
Semantic Visual Localization
Robust visual localization under a wide range of viewing conditions is a
fundamental problem in computer vision. Handling the difficult cases of this
problem is not only very challenging but also of high practical relevance,
e.g., in the context of life-long localization for augmented reality or
autonomous robots. In this paper, we propose a novel approach based on a joint
3D geometric and semantic understanding of the world, enabling it to succeed
under conditions where previous approaches failed. Our method leverages a novel
generative model for descriptor learning, trained on semantic scene completion
as an auxiliary task. The resulting 3D descriptors are robust to missing
observations by encoding high-level 3D geometric and semantic information.
Experiments on several challenging large-scale localization datasets
demonstrate reliable localization under extreme viewpoint, illumination, and
geometry changes.
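At retrieval time, localization by descriptor matching reduces to finding the map descriptor most similar to the query descriptor. As a hedged sketch of that final step only (the paper's contribution is in how the descriptors are learned, not in this lookup), a cosine-similarity nearest-neighbor search looks like:

```python
import numpy as np

def retrieve(query, db):
    """Index of the database descriptor most similar to the query (cosine similarity)."""
    q = query / np.linalg.norm(query)
    D = db / np.linalg.norm(db, axis=1, keepdims=True)  # normalize each row
    return int(np.argmax(D @ q))                        # best match by dot product

# Toy usage: four orthogonal "map" descriptors, query closest to the second.
db = np.eye(4)
query = np.array([0.1, 0.9, 0.0, 0.1])
```

Robustness to viewpoint and illumination then comes entirely from the descriptors themselves, which is why the paper trains them with semantic scene completion as an auxiliary task.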
HouseCat6D -- A Large-Scale Multi-Modal Category Level 6D Object Pose Dataset with Household Objects in Realistic Scenarios
Estimating the 6D pose of objects is a major 3D computer vision problem.
Following the promising outcomes of instance-level approaches, research has
also moved towards category-level pose estimation for more practical
application scenarios. However, unlike well-established instance-level pose
datasets, available category-level datasets lack annotation quality and
sufficient pose annotations. We propose the new category-level 6D pose dataset
HouseCat6D, featuring 1) Multi-modality of polarimetric RGB and depth (RGBD+P),
2) A highly diverse set of 194 objects across 10 household categories, including 2
photometrically challenging categories, 3) High-quality pose annotation with an
error range of only 1.35 mm to 1.74 mm, 4) 41 large-scale scenes with extensive
viewpoint coverage and occlusions, 5) Checkerboard-free environment throughout
the entire scene, and 6) Additionally annotated dense 6D parallel-jaw grasps.
Furthermore, we provide benchmark results of state-of-the-art category-level
pose estimation networks.
- …