9 research outputs found
Multi-path Learning for Object Pose Estimation Across Domains
We introduce a scalable approach for object pose estimation trained jointly on
simulated RGB views of multiple 3D models. We learn an encoding of
object views that not only describes an implicit orientation of all objects
seen during training, but can also relate views of untrained objects. Our
single-encoder-multi-decoder network is trained using a technique we denote
"multi-path learning": While the encoder is shared by all objects, each decoder
only reconstructs views of a single object. Consequently, views of different
instances do not have to be separated in the latent space and can share common
features. The resulting encoder generalizes well from synthetic to real data
and across various instances, categories, model types and datasets. We
systematically investigate the learned encodings, their generalization, and
iterative refinement strategies on the ModelNet40 and T-LESS dataset. Despite
training jointly on multiple objects, our 6D Object Detection pipeline achieves
state-of-the-art results on T-LESS at much lower runtimes than competing
approaches.
Comment: To appear at CVPR 2020; code will be available here:
https://github.com/DLR-RM/AugmentedAutoencoder/tree/multipat
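The single-encoder-multi-decoder idea can be illustrated with a minimal NumPy sketch: one shared encoder produces a latent code for any view, and the reconstruction path is selected by the object's identity, so gradients never force different objects apart in the latent space. All layer sizes and weights below are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class MultiPathAE:
    """Toy single-encoder-multi-decoder network (multi-path learning sketch)."""

    def __init__(self, n_objects, img_dim=64, code_dim=8):
        # one encoder shared by all objects
        self.W_enc = rng.normal(scale=0.1, size=(img_dim, code_dim))
        # one decoder per object
        self.W_dec = [rng.normal(scale=0.1, size=(code_dim, img_dim))
                      for _ in range(n_objects)]

    def encode(self, x):
        return relu(x @ self.W_enc)       # shared latent code

    def reconstruct(self, x, obj_id):
        z = self.encode(x)
        return z @ self.W_dec[obj_id]     # only this object's decoder is used

model = MultiPathAE(n_objects=3)
view = rng.normal(size=(1, 64))
recon = model.reconstruct(view, obj_id=1)
```

During training, the reconstruction loss for a view of object *k* would flow only through `W_dec[k]` and the shared `W_enc`, which is what lets views of different instances share latent features.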
CAD2Real: Deep learning with domain randomization of CAD data for 3D pose estimation of electronic control unit housings
Electronic control units (ECUs) are essential for many automobile components,
e.g. engine, anti-lock braking system (ABS), steering and airbags. For some
products, the 3D pose of each single ECU needs to be determined during series
production. Deep learning approaches can not easily be applied to this problem,
because labeled training data is not available in sufficient numbers. Thus, we
train state-of-the-art artificial neural networks (ANNs) on purely synthetic
training data, which is automatically created from a single CAD file. By
randomizing parameters during rendering of training images, we enable inference
on RGB images of a real sample part. In contrast to classic image processing
approaches, this data-driven approach poses only few requirements regarding the
measurement setup and transfers to related use cases with little development
effort.
Comment: Proc. 30. Workshop Computational Intelligence, Berlin, 202
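The domain randomization step amounts to drawing a fresh set of rendering parameters for every synthetic training image. The following sketch shows the pattern; all parameter names and ranges are invented for illustration, as the abstract does not specify them.

```python
import random

def sample_render_params(rng):
    """Sample one randomized set of rendering parameters
    (hypothetical names and ranges, for illustration only)."""
    return {
        "light_intensity": rng.uniform(0.3, 1.5),
        "light_azimuth_deg": rng.uniform(0.0, 360.0),
        "camera_distance_m": rng.uniform(0.4, 1.2),
        "background_id": rng.randrange(1000),   # random distractor background
        "texture_noise_std": rng.uniform(0.0, 0.1),
    }

rng = random.Random(42)
# one randomized parameter set per synthetic training image
params = [sample_render_params(rng) for _ in range(1000)]
```

Randomizing nuisance factors this way is what lets a network trained purely on renders of a single CAD file transfer to RGB images of a real sample part.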
FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism
In this paper, we focus on category-level 6D pose and size estimation from a
monocular RGB-D image. Previous methods suffer from inefficient category-level
pose feature extraction, which leads to low accuracy and inference speed. To
tackle this problem, we propose a fast shape-based network (FS-Net) with
efficient category-level feature extraction for 6D pose estimation. First, we
design an orientation-aware autoencoder with 3D graph convolution for latent
feature extraction. The learned latent feature is insensitive to point shift
and object size thanks to the shift- and scale-invariance properties of the 3D
graph convolution. Then, to efficiently decode category-level rotation
information from the latent feature, we propose a novel decoupled rotation
mechanism that employs two decoders to complementarily access the rotation
information. Meanwhile, we estimate translation and size by two residuals,
which are the difference between the mean of object points and ground truth
translation, and the difference between the mean size of the category and
ground truth size, respectively. Finally, to increase the generalization
ability of FS-Net, we propose an online box-cage based 3D deformation mechanism
to augment the training data. Extensive experiments on two benchmark datasets
show that the proposed method achieves state-of-the-art performance in both
category- and instance-level 6D object pose estimation. Especially in
category-level pose estimation, without extra synthetic data, our method
outperforms existing methods by 6.3% on the NOCS-REAL dataset.
Comment: accepted by CVPR 2021, ora
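The residual formulation for translation and size can be made concrete with a toy example: the network's regression targets are the differences between ground truth and easy-to-compute statistics (the mean of the observed points, and the mean size of the category). All numeric values below are hypothetical.

```python
import numpy as np

# Toy point cloud of an observed object instance (hypothetical values).
rng = np.random.default_rng(1)
points = rng.normal(loc=[0.10, 0.20, 0.50], scale=0.02, size=(500, 3))

t_gt = np.array([0.12, 0.18, 0.52])           # ground-truth translation
mean_cat_size = np.array([0.20, 0.10, 0.30])  # mean size of the category
s_gt = np.array([0.22, 0.09, 0.31])           # ground-truth size

# Regression targets: residuals w.r.t. the simple statistics.
t_residual = t_gt - points.mean(axis=0)
s_residual = s_gt - mean_cat_size

# At inference, predicted residuals are added back to the statistics.
t_pred = points.mean(axis=0) + t_residual
s_pred = mean_cat_size + s_residual
```

Regressing small residuals rather than absolute values keeps the targets near zero regardless of where the object sits in the scene, which generally eases learning.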
Few-shot Domain Adaptation for 3D Human Pose and Shape Estimation
Despite recent advancements in monocular 3D human pose and shape estimation, many previous works are susceptible to the domain gap between the training data and the test data. This problem becomes even more severe when the test samples come from challenging in-the-wild scenarios. This paper proposes a domain adaptation approach that mitigates the gap, especially in few-shot test environments, utilizing (1) a continuous metric loss to constrain the feature-space distance relationships between different poses, and (2) a segmentation module that localizes the foreground area so that negative effects from noisy backgrounds can be mitigated. Our method achieved a slight improvement over the baseline on the MPI-INF-3DHP and 3DPW datasets.
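The continuous metric loss idea (feature-space distances should track pose-space distances) can be sketched as a simple pairwise penalty. This is an illustrative formulation only; the paper's exact loss may differ.

```python
import numpy as np

def continuous_metric_loss(feats, poses):
    """Penalize mismatch between feature-space and pose-space distances
    (a sketch of the continuous-metric-loss idea, not the paper's exact form)."""
    n = len(feats)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            d_feat = np.linalg.norm(feats[i] - feats[j])
            d_pose = np.linalg.norm(poses[i] - poses[j])
            total += (d_feat - d_pose) ** 2
            pairs += 1
    return total / pairs

# If features already mirror pose distances, the loss is zero.
poses = np.array([[0.0], [1.0], [2.0]])
feats = poses.copy()
loss = continuous_metric_loss(feats, poses)
```

A loss of this shape pulls the embedding toward an isometry of pose space, so nearby poses land on nearby features even for unseen test domains.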
Vision-guided Grasping of Arbitrary Objects through Experience-based Search Optimization
A desired capability for human-robot collaboration is the handover of tools and object parts in a functional, effective way. To achieve this, a robot has to grasp objects at specific spots, also called functional grasps. This thesis presents an approach for such functional grasping of nearly unknown, arbitrary objects. Unlike other approaches, it does not require 3D models of the objects. Given a camera mounted on the robot's end-effector, the grasping position is defined by a single target image provided with human guidance.
During execution, an iterative search for the target viewing angle is performed, based on an appearance similarity measure generated by an Auto-Encoder (AE) specifically trained to encode general object rotation. Further, a process for the fusion of data from previous grasping attempts is presented. This increases robustness and search efficiency by utilizing experience from previous executions. Additionally, a detection module is integrated, which enables the grasping of the target object in cluttered scenes.
The developed method is evaluated in simulation and on a real robotic platform. The results show that the presented method robustly finds the pre-defined target orientation for grasping the objects.
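The iterative search over viewing angles can be sketched as a greedy hill climb guided by a similarity measure between encodings. The 1-D angle parameterization and the toy "encoder" below are simplifications; the thesis operates on full AE encodings of camera images.

```python
import numpy as np

def search_view_angle(encode, target_code, start_deg=0.0, step=10.0, iters=50):
    """Greedy 1-D hill climb over the viewing angle, guided by code
    similarity (a simplified sketch of the iterative view search)."""
    def sim(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    angle = start_deg
    best = sim(encode(angle), target_code)
    for _ in range(iters):
        # try a step in either direction; keep the better angle
        for cand in (angle - step, angle + step):
            s = sim(encode(cand), target_code)
            if s > best:
                best, angle = s, cand
        step *= 0.8   # shrink the step to refine around the optimum
    return angle

# Toy "encoder": maps an angle to a 2-D code on the unit circle.
encode = lambda deg: np.array([np.cos(np.radians(deg)), np.sin(np.radians(deg))])
found = search_view_angle(encode, target_code=encode(40.0))
```

The fusion of previous grasping attempts described above would correspond to reusing earlier (angle, similarity) observations to pick a better `start_deg`, shrinking the search.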