Fast Object Learning and Dual-arm Coordination for Cluttered Stowing, Picking, and Packing
Robotic picking from cluttered bins is a demanding task, for which Amazon
Robotics holds challenges. The 2017 Amazon Robotics Challenge (ARC) required
stowing items into a storage system, picking specific items, and packing them
into boxes. In this paper, we describe the entry of team NimbRo Picking. Our
deep object perception pipeline can be quickly and efficiently adapted to new
items using a custom turntable capture system and transfer learning. It
produces high-quality item segments, on which grasp poses are found. A planning
component coordinates manipulation actions between two robot arms, minimizing
execution time. The system has been demonstrated successfully at ARC, where our
team reached second place in both the picking task and the final stow-and-pick
task. We also evaluate individual components.
Comment: In: Proceedings of the International Conference on Robotics and Automation (ICRA) 201
CAD2Render: A Modular Toolkit for GPU-accelerated Photorealistic Synthetic Data Generation for the Manufacturing Industry
The use of computer vision for product and assembly quality control is
becoming ubiquitous in the manufacturing industry. Lately, it is apparent that
machine learning based solutions are outperforming classical computer vision
algorithms in terms of performance and robustness. However, a main drawback is
that they require sufficiently large and labeled training datasets, which are
often not available or too tedious and too time consuming to acquire. This is
especially true for low-volume and high-variance manufacturing. Fortunately, in
this industry, CAD models of the manufactured or assembled products are
available. This paper introduces CAD2Render, a GPU-accelerated synthetic data
generator based on the Unity High Definition Render Pipeline (HDRP). CAD2Render
is designed to add variations in a modular fashion, enabling highly
customizable data generation tailored to the needs of the industrial use case
at hand. Although CAD2Render is specifically designed for manufacturing use
cases, it can be used for other domains as well. We validate CAD2Render by
demonstrating state-of-the-art performance in two industrially relevant setups.
We demonstrate that the data generated by our approach can be used to train
object detection and pose estimation models with a high enough accuracy to
direct a robot. The code for CAD2Render is available at
https://github.com/EDM-Research/CAD2Render.
Comment: Accepted at the Workshop on Photorealistic Image and Environment Synthesis for Computer Vision (PIES-CV) at WACV2
Aerial Vehicle Tracking by Adaptive Fusion of Hyperspectral Likelihood Maps
Hyperspectral cameras can provide unique spectral signatures for consistently
distinguishing materials that can be used to solve surveillance tasks. In this
paper, we propose a novel real-time hyperspectral likelihood maps-aided
tracking method (HLT) inspired by an adaptive hyperspectral sensor. A moving
object tracking system generally consists of registration, object detection,
and tracking modules. We focus on the target detection part and remove the
need to build offline classifiers or tune a large number of hyperparameters,
instead learning a generative target model in an online manner for
hyperspectral channels ranging from visible to infrared wavelengths. The key
idea is that our adaptive fusion method combines likelihood maps from multiple
bands of hyperspectral imagery into a single, more distinctive representation,
increasing the margin between the mean values of foreground and background
pixels in the fused map. Experimental results show that the HLT not only
outperforms all established fusion methods but is also on par with current
state-of-the-art hyperspectral target tracking frameworks.
Comment: Accepted at the International Conference on Computer Vision and Pattern Recognition Workshops, 201
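The fusion idea described in the abstract can be illustrated with a minimal sketch: weight each band's likelihood map by how well it separates foreground from background, then take the weighted sum. The separability-based weighting below is an illustrative assumption, not the paper's exact adaptive scheme; `fuse_likelihood_maps` and its arguments are hypothetical names.

```python
import numpy as np

def fuse_likelihood_maps(maps, fg_mask):
    """Fuse per-band likelihood maps into a single map.

    Each band is weighted by the margin between its mean foreground and
    mean background response (illustrative weighting, not the paper's
    exact adaptive fusion method).
    """
    maps = np.asarray(maps, dtype=float)   # shape: (bands, H, W)
    bg_mask = ~fg_mask
    # Per-band margin between mean foreground and mean background pixels.
    margins = np.array([m[fg_mask].mean() - m[bg_mask].mean() for m in maps])
    weights = np.clip(margins, 0.0, None)  # ignore bands worse than chance
    if weights.sum() == 0.0:
        weights = np.ones_like(weights)    # fall back to uniform weights
    weights = weights / weights.sum()
    # Weighted sum over the band axis yields one (H, W) fused map.
    return np.tensordot(weights, maps, axes=1)
```

A discriminative band then dominates the fused map, so the foreground/background margin of the result is at least as large as that of a plain average.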
Learning Multi-step Robotic Manipulation Tasks through Visual Planning
Multi-step manipulation tasks in unstructured environments are extremely challenging for a robot to learn. Such tasks interlace high-level reasoning, which determines the sequence of intermediate states needed to achieve an overall task, with low-level reasoning, which decides what actions will yield those states. A model-free deep reinforcement learning method is proposed to learn multi-step manipulation tasks.
This work introduces a novel Generative Residual Convolutional Neural Network (GR-ConvNet) model that can generate robust antipodal grasps from n-channel image input at real-time speeds (20 ms). The proposed model architecture achieved state-of-the-art accuracy on three standard grasping datasets. The adaptability of the proposed approach is demonstrated by directly transferring the trained model to a 7-DoF robotic manipulator, with grasp success rates of 95.4% and 93.0% on novel household and adversarial objects, respectively.
A novel vision-based Robotic Manipulation Network (RoManNet) is introduced to learn action-value functions and predict manipulation action candidates. A Task Progress based Gaussian (TPG) reward function is defined to compute the reward based on actions that lead to successful motion primitives and progress toward the overall task goal. To balance exploration and exploitation, this research introduces a Loss Adjusted Exploration (LAE) policy that selects actions from the candidates according to the Boltzmann distribution of loss estimates.
The effectiveness of the proposed approach is demonstrated by training RoManNet to learn several challenging multi-step robotic manipulation tasks in both simulation and the real world. Experimental results show that the proposed method outperforms existing methods and achieves state-of-the-art performance in terms of success rate and action efficiency. The ablation studies show that TPG and LAE are especially beneficial for tasks such as stacking multiple blocks.
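The Boltzmann-distribution idea behind the LAE policy can be sketched as follows. This is a generic Boltzmann sampler over loss estimates, not the paper's exact formulation; the function name, the assumption that higher-loss (less well understood) candidates should be sampled more often, and the temperature parameter are all illustrative assumptions.

```python
import numpy as np

def lae_select(loss_estimates, temperature=1.0, rng=None):
    """Sample an action candidate via a Boltzmann distribution over
    loss estimates (sketch of an LAE-style policy, assumed form).

    Higher-loss candidates receive more probability mass, encouraging
    exploration of poorly understood actions; lowering the temperature
    shifts the policy toward exploiting the single highest-loss pick.
    """
    rng = rng or np.random.default_rng(0)
    logits = np.asarray(loss_estimates, dtype=float) / temperature
    logits -= logits.max()          # subtract max for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    idx = int(rng.choice(len(probs), p=probs))
    return idx, probs
```

At a very low temperature the distribution collapses onto the candidate with the largest loss estimate, while a high temperature approaches uniform random exploration.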