Recovering 6D Object Pose: A Review and Multi-modal Analysis
Many studies analyse object detection and pose estimation at the visual level
in 2D, discussing the effects of challenges such as occlusion, clutter, and
texture on method performance in the RGB modality. Incorporating depth data,
this paper presents a thorough multi-modal analysis. It discusses the above
challenges for full 6D object pose estimation in RGB-D images, comparing the
performance of several 6D detectors in order to answer the following
questions: Where does the computer vision community currently stand on
"automation" in robotic manipulation? What steps should the community take
next to improve "autonomy" in robotic object handling? Our findings include:
(i) reasonably accurate results are obtained on textured objects at varying
viewpoints with cluttered backgrounds; (ii) heavy occlusion and clutter
severely affect the detectors, and similar-looking distractors are the biggest
challenge in recovering instances' 6D poses; (iii) template-based methods and
random-forest-based learning algorithms underlie object detection and 6D pose
estimation, while the recent paradigm is to learn deep discriminative feature
representations with CNNs taking RGB images as input; (iv) given large-scale
6D-annotated depth datasets, feature representations can be learnt on these
datasets and then customized for the 6D problem.
ProgressLabeller: Visual Data Stream Annotation for Training Object-Centric 3D Perception
Visual perception tasks often require vast amounts of labelled data,
including 3D poses and image-space segmentation masks. Creating such training
datasets can be difficult and time-intensive to scale for general use.
Consider the task of pose estimation for rigid objects. Deep-neural-network
based approaches have shown good performance when trained on large, public
datasets. However, adapting these networks to novel objects, or fine-tuning
existing models for different environments, requires significant time
investment to generate newly labelled instances. Towards this end, we propose
ProgressLabeller as a method for more efficiently generating large amounts of
6D pose training data from color image sequences for custom scenes in a
scalable manner. ProgressLabeller is also intended to support transparent or
translucent objects, for which previous methods based on dense depth
reconstruction fail. We demonstrate the effectiveness of ProgressLabeller by
rapidly creating a dataset of over 1M samples, with which we fine-tune a
state-of-the-art pose estimation network to markedly improve downstream
robotic grasp success rates. ProgressLabeller is
open-source at https://github.com/huijieZH/ProgressLabeller.
Comment: IROS 2022 accepted paper; project page:
https://progress.eecs.umich.edu/projects/progress-labeller
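The core idea behind this kind of sequence annotation can be sketched as follows (an illustrative sketch with hypothetical function and variable names, not ProgressLabeller's actual API): once an object's pose is fixed in a reconstructed world frame, a 6D label for every frame in the sequence follows from that frame's camera pose, so the object only needs to be annotated once.

```python
import numpy as np

def propagate_object_pose(T_world_obj, T_world_cam_per_frame):
    """Given one object pose in the world frame and per-frame camera
    poses (4x4 homogeneous transforms), return the object pose in each
    camera frame -- one 6D label per image from a single annotation."""
    labels = []
    for T_world_cam in T_world_cam_per_frame:
        # Object in camera frame: T_cam_obj = inv(T_world_cam) @ T_world_obj
        T_cam_obj = np.linalg.inv(T_world_cam) @ T_world_obj
        labels.append(T_cam_obj)
    return labels
```

Because the labels come from geometry rather than per-image depth sensing, the same propagation works for transparent objects that defeat dense depth reconstruction.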
Efficient Belief Propagation for Perception and Manipulation in Clutter
Autonomous service robots are required to perform tasks in common human indoor environments. To achieve the goals associated with these tasks, a robot should continually perceive and reason about its environment, and plan to manipulate objects, which we term goal-directed manipulation. Perception remains the most challenging of these stages, as common indoor environments typically pose problems in recognizing objects under inherent occlusions and physical interactions among them. Despite recent progress in robot perception, accommodating perceptual uncertainty due to partial observations remains challenging and needs to be addressed to achieve the desired autonomy.
In this dissertation, we address the problem of perception under uncertainty for robot manipulation in cluttered environments using generative inference methods. Specifically, we aim to enable robots to perceive partially observable environments by maintaining an approximate probability distribution as a belief over possible scene hypotheses. This belief representation captures uncertainty resulting from inter-object occlusions and physical interactions, which are inherently present in cluttered indoor environments. The research efforts presented in this thesis are directed towards developing appropriate state representations and inference techniques to generate and maintain such a belief over contextually plausible scene states. We focus on providing the following features to generative inference while addressing the challenges due to occlusions: 1) generating and maintaining plausible scene hypotheses, 2) reducing the inference search space, which typically grows exponentially with the number of objects in a scene, and 3) preserving scene hypotheses over continual observations.
To generate and maintain plausible scene hypotheses, we propose physics-informed scene estimation methods that embed a Newtonian physics engine within a particle-based generative inference framework. The proposed variants of our method, with and without a Monte Carlo step, showed promising results in generating and maintaining plausible hypotheses under complete occlusions. We show that estimating such scenarios would not be possible with the commonly adopted 3D registration methods, which lack the notion of physical context that our method provides.
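A physics-informed particle update of the kind described can be sketched abstractly (a minimal sketch under assumed interfaces; `simulate` and `likelihood` are hypothetical stand-ins, not the dissertation's actual engine or observation model): each particle's hypothesized scene is stepped through physics so that implausible states settle into physically supported ones before being scored against the observation.

```python
def physics_informed_update(particles, weights, simulate, likelihood, observation):
    """One step of a physics-informed particle filter (sketch).
    'simulate' settles a hypothesized scene under gravity and contacts;
    'likelihood' scores the settled scene against the observation."""
    settled = [simulate(p) for p in particles]  # physics projection step
    new_w = [w * likelihood(s, observation) for s, w in zip(settled, weights)]
    total = sum(new_w) or 1.0  # guard against all-zero weights
    return settled, [w / total for w in new_w]
```

The physics projection is what lets fully occluded objects stay plausible: even unobserved, a hypothesis must rest consistently on supporting surfaces.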
To scale context-informed inference to larger numbers of objects, we describe a factorization of the scene state into objects and object parts to perform collaborative particle-based inference. This resulted in the Pull Message Passing for Nonparametric Belief Propagation (PMPNBP) algorithm, which caters to the high-dimensional, multimodal nature of cluttered scenes while remaining computationally tractable. We demonstrate that PMPNBP is orders of magnitude faster than the state-of-the-art Nonparametric Belief Propagation method. Additionally, we show that PMPNBP successfully estimates the poses of articulated objects under various simulated occlusion scenarios.
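The "pull" scheme can be sketched as follows (illustrative only; the function and variable names are assumptions, not the dissertation's code): message particles are drawn from the recipient's current belief, and each one then "pulls" its weight from the sender's particles through the pairwise potential, rather than the sender pushing samples outward.

```python
import random

def pull_message(recipient_particles, sender_particles, sender_weights, pairwise):
    """One PMPNBP-style message update (sketch). Particles for the
    message are resampled from the recipient's belief; each is weighted
    by pulling support from the sender's weighted particles."""
    msg_particles = random.choices(recipient_particles, k=len(recipient_particles))
    msg_weights = []
    for x in msg_particles:
        # Weight = expected pairwise compatibility under the sender's belief.
        w = sum(ws * pairwise(x, xs)
                for xs, ws in zip(sender_particles, sender_weights))
        msg_weights.append(w)
    total = sum(msg_weights) or 1.0
    return msg_particles, [w / total for w in msg_weights]
```

Pulling avoids the expensive product-of-mixtures sampling that makes classical Nonparametric Belief Propagation slow, which is where the reported speedup comes from.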
To extend our PMPNBP algorithm to tracking object states over continuous observations, we explore ways to propose and preserve hypotheses effectively over time. This resulted in an augmentation-selection method, in which hypotheses are drawn from various proposals, followed by the selection, using PMPNBP, of a subset that explains the current state of the objects. We discuss and analyze our augmentation-selection method against its counterparts in the belief propagation literature. Furthermore, we develop an inference pipeline for pose estimation and tracking of articulated objects in clutter. In this pipeline, the message passing module with the augmentation-selection method is informed by segmentation heatmaps from a trained neural network. In our experiments, we show that our proposed pipeline can effectively maintain belief and track articulated objects over a sequence of observations under occlusion.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/163159/1/kdesingh_1.pd
Supervised Remote Robot with Guided Autonomy and Teleoperation (SURROGATE): A Framework for Whole-Body Manipulation
The use of human cognitive capabilities to help guide the autonomy of robotic platforms, in what is typically called "supervised autonomy", is becoming more commonplace in robotics research. The work discussed in this paper presents an approach to a human-in-the-loop mode of robot operation that integrates high-level human cognition and commanding with the intelligence and processing power of autonomous systems. Our framework for a "Supervised Remote Robot with Guided Autonomy and Teleoperation" (SURROGATE) is demonstrated on a robotic platform consisting of a pan-tilt perception head and two 7-DOF arms connected by a single 7-DOF torso, mounted on a tracked-wheel base. We present an architecture that allows high-level supervisory commands and intents to be specified by a user and then interpreted by the robotic system to perform whole-body manipulation tasks autonomously. We use a concept of "behaviors" to chain together sequences of "actions" for the robot to perform, which are then executed in real time.
A framework for utility data integration in the UK
In this paper we investigate various factors which prevent utility knowledge from being
fully exploited and suggest that integration techniques can be applied to improve the
quality of utility records. The paper suggests a framework which supports knowledge
and data integration. The framework supports utility integration at two levels: the
schema and data level. Schema level integration ensures that a single, integrated geospatial
data set is available for utility enquiries. Data level integration improves utility data
quality by reducing inconsistency, duplication and conflicts. Moreover, the framework
is designed to preserve autonomy and distribution of utility data. The ultimate aim of
the research is to produce an integrated representation of underground utility infrastructure
in order to gain more accurate knowledge of the buried services. It is hoped that
this approach will enable us to understand various problems associated with utility data,
and to suggest some potential techniques for resolving them.
Spacecraft Pose Estimation Based on Unsupervised Domain Adaptation and on a 3D-Guided Loss Combination
Spacecraft pose estimation is a key task for enabling space missions in which
two spacecraft must navigate around each other. Current state-of-the-art
algorithms for pose estimation employ data-driven techniques. However, there is
an absence of real training data for spacecraft imaged in space conditions due
to the costs and difficulties associated with the space environment. This has
motivated the introduction of 3D data simulators, solving the issue of data
availability but introducing a large gap between the training (source) and test
(target) domains. We explore a method that incorporates 3D structure into the
spacecraft pose estimation pipeline to provide robustness to intensity domain
shift and we present an algorithm for unsupervised domain adaptation with
robust pseudo-labelling. Our solution ranked second in the two categories of
the 2021 Pose Estimation Challenge organised by the European Space Agency and
Stanford University, achieving the lowest average error over the two
categories.
Comment: Accepted at ECCV 2022 AI4SPACE Workshop
(https://aiforspace.github.io/2022/)
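Pseudo-labelling with a robustness filter of the kind described can be sketched generically (a minimal illustration under assumed names, not the authors' implementation): predictions on the unlabelled target domain are kept as training labels only when the model's confidence clears a threshold, and the survivors feed the next self-training round.

```python
def select_pseudo_labels(predictions, confidences, threshold=0.9):
    """Keep (index, prediction) pairs for target-domain samples the
    model is confident about; these are treated as labelled data in
    the next round of self-training."""
    return [(i, p)
            for i, (p, c) in enumerate(zip(predictions, confidences))
            if c >= threshold]
```

The threshold trades label coverage against label noise: too low and wrong pseudo-labels reinforce the domain gap, too high and too few target samples survive to adapt the model.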