Multi-View Picking: Next-best-view Reaching for Improved Grasping in Clutter
Camera viewpoint selection is an important aspect of visual grasp detection,
especially in clutter where many occlusions are present. Where other approaches
use a static camera position or fixed data collection routines, our Multi-View
Picking (MVP) controller uses an active perception approach to choose
informative viewpoints based directly on a distribution of grasp pose estimates
in real time, reducing uncertainty in the grasp poses caused by clutter and
occlusions. In trials of grasping 20 objects from clutter, our MVP controller
achieves 80% grasp success, outperforming a single-viewpoint grasp detector by
12%. We also show that our approach is both more accurate and more efficient
than approaches which consider multiple fixed viewpoints.
Comment: ICRA 2019. Video: https://youtu.be/Vn3vSPKlaEk Code: https://github.com/dougsm/mvp_gras
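The core idea described above, choosing informative viewpoints from a distribution of grasp pose estimates, can be sketched as an entropy-minimising view selector. This is a minimal illustration only, not the authors' MVP implementation: the candidate views and the toy grasp-quality model are assumptions made for the example.

```python
# Sketch of informative-viewpoint selection: score each candidate view by the
# Shannon entropy of the grasp-quality distribution it is predicted to yield,
# and move to the view whose distribution is most peaked (lowest entropy).
# The views and quality table below are illustrative, not real data.
import math

def entropy(probs):
    """Shannon entropy (nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]

def next_best_view(candidate_views, grasp_quality_fn):
    """Return the view with minimum entropy over predicted grasp qualities,
    i.e. the viewpoint expected to leave the least grasp-pose uncertainty."""
    scored = []
    for view in candidate_views:
        dist = normalise(grasp_quality_fn(view))
        scored.append((entropy(dist), view))
    return min(scored)[1]

# Toy example: view "B" yields a sharply peaked quality distribution,
# so it is selected as the next viewpoint.
qualities = {"A": [1.0, 1.0, 1.0], "B": [9.0, 0.5, 0.5], "C": [2.0, 2.0, 1.0]}
best = next_best_view(["A", "B", "C"], lambda v: qualities[v])
```

Selecting the minimum-entropy view mirrors the abstract's stated goal of reducing uncertainty in the grasp poses caused by clutter and occlusions.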
Recovering 6D Object Pose and Predicting Next-Best-View in the Crowd
Object detection and 6D pose estimation in the crowd (scenes with multiple
object instances, severe foreground occlusions, and background distractors)
have become an important problem in many rapidly evolving technological areas
such as robotics and augmented reality. Single-shot 6D pose estimators with
manually designed features are still unable to tackle the above challenges,
motivating the research towards unsupervised feature learning and
next-best-view estimation. In this work, we present a complete framework for
both single-shot 6D object pose estimation and next-best-view prediction
based on Hough Forests, a state-of-the-art object pose estimator that
performs classification and regression jointly. Rather than using manually
designed features, we (a) propose an unsupervised feature learnt from
depth-invariant patches using a Sparse Autoencoder, and (b) offer an extensive
evaluation of various state-of-the-art features. Furthermore, taking advantage
of the clustering performed in the leaf nodes of Hough Forests, we learn to
estimate the reduction of uncertainty in other views, formulating the problem
of selecting the next-best-view. To further improve pose estimation, we propose
an improved joint registration and hypotheses verification module as a final
refinement step to reject false detections. We provide two additional
challenging datasets inspired from realistic scenarios to extensively evaluate
the state of the art and our framework. One is related to domestic environments
and the other depicts a bin-picking scenario mostly found in industrial
settings. We show that our framework significantly outperforms the state of
the art on both public datasets and our own.
Comment: CVPR 2016 accepted paper. Project page: http://www.iis.ee.ic.ac.uk/rkouskou/6D_NBV.htm
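The next-best-view formulation above, learning to estimate the reduction of uncertainty obtainable from other views, can be sketched as scoring candidate views by their expected entropy reduction over pose-vote histograms. The vote tables and view names below are illustrative assumptions, not the paper's actual Hough Forest leaf-node statistics.

```python
# Sketch of next-best-view selection by expected uncertainty reduction:
# each candidate view is scored by how much it would reduce the entropy of
# the current pose-hypothesis vote histogram. All numbers are toy values.
import math

def vote_entropy(votes):
    """Entropy (nats) of a normalised pose-vote histogram."""
    total = sum(votes)
    probs = [v / total for v in votes if v > 0]
    return -sum(p * math.log(p) for p in probs)

def select_next_view(current_votes, predicted_votes_per_view):
    """Return the view with the greatest expected entropy reduction
    relative to the current pose-vote distribution."""
    h_now = vote_entropy(current_votes)
    gains = {view: h_now - vote_entropy(votes)
             for view, votes in predicted_votes_per_view.items()}
    return max(gains, key=gains.get)

current = [3, 3, 2, 2]                 # ambiguous: votes spread over 4 poses
predicted = {"left":  [5, 3, 1, 1],    # some disambiguation expected
             "top":   [9, 1, 0, 0],    # strong disambiguation expected
             "right": [3, 3, 3, 1]}    # little change expected
```

Calling `select_next_view(current, predicted)` picks `"top"`, the view whose predicted vote histogram collapses most of the pose ambiguity.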
Feature and viewpoint selection for industrial car assembly
Quality assurance programs of today’s car manufacturers show increasing demand for automated visual inspection tasks. A typical example is just-in-time checking of assemblies along production lines. Since high throughput must be achieved, object recognition and pose estimation rely heavily on offline preprocessing stages of available CAD data. In this paper, we propose a complete, universal framework for CAD model feature extraction and entropy-index-based viewpoint selection, developed in cooperation with a major German car manufacturer.
Active vision for dexterous grasping of novel objects
How should a robot direct active vision so as to ensure reliable grasping? We
answer this question for the case of dexterous grasping of unfamiliar objects.
By dexterous grasping we simply mean grasping by any hand with more than two
fingers, such that the robot has some choice about where to place each finger.
Such grasps typically fail in one of two ways: either unmodeled objects in the
scene cause collisions, or object reconstruction is insufficient to ensure that
the grasp points provide a stable force closure. These problems can be solved
more easily if active sensing is guided by the anticipated actions. Our
approach has three stages. First, we take a single view and generate candidate
grasps from the resulting partial object reconstruction. Second, we drive the
active vision approach to maximise surface reconstruction quality around the
planned contact points. During this phase, the anticipated grasp is continually
refined. Third, we direct gaze to improve the safety of the planned reach to
grasp trajectory. We show, on a dexterous manipulator with a camera on the
wrist, that our approach (80.4% success rate) outperforms a randomised
algorithm (64.3% success rate).
Comment: IROS 2016. Supplementary video: https://youtu.be/uBSOO6tMzw
Localization from semantic observations via the matrix permanent
Most approaches to robot localization rely on low-level geometric features such as points, lines, and planes. In this paper, we use object recognition to obtain semantic information from the robot’s sensors and consider the task of localizing the robot within a prior map of landmarks, which are annotated with semantic labels. As object recognition algorithms miss detections and produce false alarms, correct data association between the detections and the landmarks on the map is central to the semantic localization problem. Instead of the traditional vector-based representation, we propose a sensor model that encodes the semantic observations via random finite sets and enables a unified treatment of missed detections, false alarms, and data association. Our second contribution is to reduce the problem of computing the likelihood of a set-valued observation to the problem of computing a matrix permanent. It is this crucial transformation that allows us to solve the semantic localization problem with a polynomial-time approximation to the set-based Bayes filter. Finally, we address the active semantic localization problem, in which the observer’s trajectory is planned in order to improve the accuracy and efficiency of the localization process. The performance of our approach is demonstrated in simulation and in real environments using deformable-part-model-based object detectors. Robust global localization from semantic observations is demonstrated for a mobile robot, for the Project Tango phone, and on the KITTI visual odometry dataset. Comparisons are made with traditional lidar-based geometric Monte Carlo localization.
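The central reduction in this abstract, computing the likelihood of a set-valued observation as a matrix permanent, relies on being able to evaluate permanents. An exact (exponential-time) evaluation via Ryser's inclusion-exclusion formula, practical for small numbers of detections, is sketched below; the detection/landmark matrix is an illustrative stand-in, not data from the paper.

```python
# Exact matrix permanent via Ryser's inclusion-exclusion formula:
#   perm(A) = (-1)^n * sum over nonempty column subsets S of
#             (-1)^|S| * prod_i (sum_{j in S} a_ij)
# Exponential in n, but exact and simple for small matrices.
from itertools import combinations

def permanent(matrix):
    """Permanent of an n x n matrix (sum over all data associations)."""
    n = len(matrix)
    total = 0.0
    for r in range(1, n + 1):
        for cols in combinations(range(n), r):
            prod = 1.0
            for row in matrix:
                prod *= sum(row[j] for j in cols)
            total += (-1) ** r * prod
    return (-1) ** n * total

# Illustrative use: if entry (i, j) were p(detection i | landmark j), then
# summing the likelihood over all one-to-one data associations is exactly
# the permanent of this matrix.
A = [[1, 2], [3, 4]]
# permanent(A) == 1*4 + 2*3 == 10 (the two possible associations)
```

Unlike the determinant, the permanent has no known polynomial-time exact algorithm, which is why the paper pairs this reduction with a polynomial-time approximation inside the set-based Bayes filter.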