Implicit 3D Orientation Learning for 6D Object Detection from RGB Images
We propose a real-time RGB-based pipeline for object detection and 6D pose
estimation. Our novel 3D orientation estimation is based on a variant of the
Denoising Autoencoder that is trained on simulated views of a 3D model using
Domain Randomization. This so-called Augmented Autoencoder has several
advantages over existing methods: It does not require real, pose-annotated
training data, generalizes to various test sensors and inherently handles
object and view symmetries. Instead of learning an explicit mapping from input
images to object poses, it provides an implicit representation of object
orientations defined by samples in a latent space. Our pipeline achieves
state-of-the-art performance on the T-LESS dataset both in the RGB and RGB-D
domain. We also evaluate on the LineMOD dataset where we can compete with other
synthetically trained approaches. We further increase performance by correcting
3D orientation estimates to account for perspective errors when the object
deviates from the image center and show extended results.

Comment: Code available at: https://github.com/DLR-RM/AugmentedAutoencode
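The abstract's key idea is that orientation is represented implicitly: instead of regressing a rotation, the trained encoder maps a test crop into a latent space, and the pose is read off as the rendered view whose latent code is most similar. A minimal sketch of that codebook lookup follows; the encoder here is a random linear stand-in for the real convolutional Augmented Autoencoder, and the rotation labels and rendered views are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the trained Augmented Autoencoder's encoder: it maps an
# image crop to a unit-norm latent code (the real model is a CNN).
def encode(image, W):
    z = W @ image.ravel()
    return z / np.linalg.norm(z)

# Offline: build a codebook of latent codes from rendered views of the
# 3D model, one entry per sampled rotation (placeholders here).
W = rng.normal(size=(16, 32 * 32))
rotations = [f"R_{i}" for i in range(100)]   # placeholder rotation labels
views = rng.normal(size=(100, 32, 32))       # placeholder rendered views
codebook = np.stack([encode(v, W) for v in views])

# Online: the orientation is recovered implicitly as the codebook entry
# whose latent code has the highest cosine similarity to the crop's code.
def estimate_orientation(crop):
    z = encode(crop, W)
    return rotations[int(np.argmax(codebook @ z))]

# A crop identical to a rendered view retrieves that view's rotation.
print(estimate_orientation(views[42]))  # -> R_42
```

Because similarity is measured between latent codes rather than explicit rotations, views that look identical due to object symmetry naturally map to nearby codes, which is how the method "inherently handles object and view symmetries."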
Recovering 6D Object Pose: A Review and Multi-modal Analysis
A large number of studies analyse object detection and pose estimation at the
visual level in 2D, in the context of the RGB modality, discussing how
challenges such as occlusion, clutter, and texture affect the performance of
the methods. Additionally interpreting depth data, the study in this paper
presents thorough multi-modal analyses. It discusses the above-mentioned
challenges for full 6D object pose estimation in RGB-D images, comparing the
performance of several 6D detectors in order to answer the following
questions: What is the current position of the computer vision community for
maintaining "automation" in robotic manipulation? What next steps should the
community take for improving "autonomy" in robotics while handling objects? Our
findings include: (i) reasonably accurate results are obtained on textured
objects at varying viewpoints with cluttered backgrounds. (ii) Heavy occlusion
and clutter severely affect the detectors, and similar-looking distractors are
the biggest challenge in recovering instances' 6D pose. (iii) Template-based
methods and random-forest-based learning algorithms underlie object detection
and 6D pose estimation; the recent paradigm is to learn deep discriminative
feature representations and to adopt CNNs taking RGB images as input. (iv)
When large-scale 6D-annotated depth datasets are available, feature
representations can be learnt on these datasets, and the learnt
representations can then be customized for the 6D problem.
Diversity for Safety and Security in Embedded Systems
We present ongoing work about how security and safety properties in embedded systems are affected by redundancy and diversity. The need to consider security requirements in the presence of malicious action creates additional design trade-offs besides those familiar in the design of safety critical and highly reliable systems. We outline the motivation for this work, an industrial case study, and the research direction we have taken
Single-Image Depth Prediction Makes Feature Matching Easier
Good local features improve the robustness of many 3D re-localization and
multi-view reconstruction pipelines. The problem is that viewing angle and
distance severely impact the recognizability of a local feature. Attempts to
improve appearance invariance by choosing better local feature points or by
leveraging outside information have come with prerequisites that made some of
them impractical. In this paper, we propose a surprisingly effective
enhancement to local feature extraction, which improves matching. We show that
CNN-based depths inferred from single RGB images are quite helpful, despite
their flaws. They allow us to pre-warp images and rectify perspective
distortions, significantly enhancing SIFT and BRISK features and enabling more
good matches, even when cameras are looking at the same scene but in opposite
directions.

Comment: 14 pages, 7 figures, accepted for publication at the European
Conference on Computer Vision (ECCV) 202
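The pre-warping step the abstract describes can be sketched as follows: from a single-image depth estimate, recover a local surface normal, then build the homography that rotates the patch to a fronto-parallel view before extracting features. This is a minimal sketch under assumed simplifications (a pinhole camera with made-up intrinsics, finite-difference normals, and a pure-rotation homography); the paper's actual warping procedure may differ.

```python
import numpy as np

# Assumed pinhole intrinsics (illustrative values, not from the paper).
K = np.array([[500.0, 0.0, 160.0],
              [0.0, 500.0, 120.0],
              [0.0,   0.0,   1.0]])

def normal_from_depth(depth, x, y):
    # Surface normal from finite-difference depth gradients (assumed model).
    dzdx = (depth[y, x + 1] - depth[y, x - 1]) / 2.0
    dzdy = (depth[y + 1, x] - depth[y - 1, x]) / 2.0
    n = np.array([-dzdx, -dzdy, 1.0])
    return n / np.linalg.norm(n)

def rectifying_homography(n, K):
    # Rodrigues-style rotation taking the patch normal onto the optical
    # axis (0, 0, 1); for a pure rotation the induced homography is
    # H = K R K^-1. Valid while n is not opposite the optical axis.
    z = np.array([0.0, 0.0, 1.0])
    v = np.cross(n, z)
    c = float(n @ z)
    Vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    R = np.eye(3) + Vx + Vx @ Vx / (1.0 + c)
    return K @ R @ np.linalg.inv(K)

# Toy slanted plane: depth increases linearly with x, so the surface
# tilts away from the camera and SIFT/BRISK would see it foreshortened.
depth = np.fromfunction(lambda y, x: 2.0 + 0.01 * x, (240, 320))
n = normal_from_depth(depth, 160, 120)
H = rectifying_homography(n, K)
```

Warping the image with `H` (e.g. via any image-warping routine) renders the local plane as if viewed head-on, which is the sense in which depth "rectifies perspective distortions" before feature extraction.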
Evidence-based nanoscopic and molecular framework for excipient functionality in compressed orally disintegrating tablets
The work investigates the adhesive/cohesive molecular and physical interactions together with nanoscopic features of commonly used orally disintegrating tablet (ODT) excipients microcrystalline cellulose (MCC) and D-mannitol. This helps to elucidate the underlying physico-chemical and mechanical mechanisms responsible for powder densification and optimum product functionality. Atomic force microscopy (AFM) contact mode analysis was performed to measure nano-adhesion forces and surface energies between excipient-drug particles (6-10 different particles per pair). Moreover, surface topography images (100 nm² to 10 ÎŒm²) and roughness data were acquired from AFM tapping mode. AFM data were related to ODT macro/microscopic properties obtained from SEM, FTIR, XRD, thermal analysis using DSC and TGA, disintegration testing, and Heckel and tabletability profiles. The study results showed a good association between the adhesive molecular and physical forces of paired particles and the resultant densification mechanisms responsible for the mechanical strength of tablets. MCC micro-roughness was 3 times that of D-mannitol, which explains the high hardness of MCC ODTs due to mechanical interlocking. Hydrogen bonding between MCC particles could not be established from either the AFM or the FTIR solid-state investigation. On the contrary, D-mannitol produced fragile ODTs due to fragmentation of surface crystallites during compression, attributable to its weak crystal structure. Furthermore, AFM analysis has shown the presence of extensive microfibril structures inhabiting nano pores, which further supports the use of MCC as a disintegrant. Overall, excipients (and model drugs) showed mechanistic behaviour on the nano/micro scale that could be related to the functionality of materials on the macro scale. © 2014 Al-khattawi et al.