290 research outputs found
Implicit 3D Orientation Learning for 6D Object Detection from RGB Images
We propose a real-time RGB-based pipeline for object detection and 6D pose
estimation. Our novel 3D orientation estimation is based on a variant of the
Denoising Autoencoder that is trained on simulated views of a 3D model using
Domain Randomization. This so-called Augmented Autoencoder has several
advantages over existing methods: It does not require real, pose-annotated
training data, generalizes to various test sensors and inherently handles
object and view symmetries. Instead of learning an explicit mapping from input
images to object poses, it provides an implicit representation of object
orientations defined by samples in a latent space. Our pipeline achieves
state-of-the-art performance on the T-LESS dataset both in the RGB and RGB-D
domain. We also evaluate on the LineMOD dataset where we can compete with other
synthetically trained approaches. We further increase performance by correcting
3D orientation estimates to account for perspective errors when the object
deviates from the image center and show extended results.Comment: Code available at: https://github.com/DLR-RM/AugmentedAutoencode
Learning for Free – Object Detectors Trained on Synthetic Data
A picture is worth a thousand words, or if you want it labeled, it’s worth about four cents per bounding box. Data is the fuel that powers modern technologies run by artificial intelligence engines which is increasingly valuable in today’s industry. High quality labeled data is the most important factor in producing accurate machine learning models which can be used to make powerful predictions and identify patterns humans may not see. Acquiring high quality labeled data however, can be expensive and time consuming. For small companies, academic researchers, or machine learning hobbyists, gathering large datasets for a specific task that are not already publicly available is challenging. This research paper describes the techniques used to generate labeled image data synthetically which can be used in supervised learning for object detection. Technologies such as 3D modeling software in conjunction with Generative Adversarial Networks and image augmentation can create a realistic and diverse image dataset with bounding boxes and labels. The result of our effort is an accurate object detector in an environment of aerial surveillance with no cost to the end user. We achieved a best average precision score of 0.76 to classify and detect cars from an aerial perspective using a mix of GAN-refined data along with randomized synthetic data
- …