A Multi-body Tracking Framework -- From Rigid Objects to Kinematic Structures
Kinematic structures are very common in the real world. They range from
simple articulated objects to complex mechanical systems. However, despite
their relevance, most model-based 3D tracking methods only consider rigid
objects. To overcome this limitation, we propose a flexible framework that
allows the extension of existing 6DoF algorithms to kinematic structures. Our
approach focuses on methods that employ Newton-like optimization techniques,
which are widely used in object tracking. The framework considers both
tree-like and closed kinematic structures and allows a flexible configuration
of joints and constraints. To project equations from individual rigid bodies to
a multi-body system, Jacobians are used. For closed kinematic chains, a novel
formulation that features Lagrange multipliers is developed. In a detailed
mathematical proof, we show that our constraint formulation leads to an exact
kinematic solution and converges in a single iteration. Based on the proposed
framework, we extend ICG, which is a state-of-the-art rigid object tracking
algorithm, to multi-body tracking. For the evaluation, we create a
highly-realistic synthetic dataset that features a large number of sequences
and various robots. Based on this dataset, we conduct a wide variety of
experiments that demonstrate the excellent performance of the developed
framework and our multi-body tracker.
Comment: Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence
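The projection idea described in the abstract, i.e. mapping equations from individual rigid bodies to a multi-body system via Jacobians, can be illustrated with a minimal sketch. All shapes, names, and the damped Newton step below are illustrative assumptions, not the paper's implementation: per-body gradients g_i and Gauss-Newton Hessians H_i over 6DoF pose variations are pulled back into joint space as g_theta = sum_i J_i^T g_i and H_theta = sum_i J_i^T H_i J_i, after which a Newton-like step updates all joint parameters jointly.

```python
import numpy as np

def project_to_joint_space(gradients, hessians, jacobians):
    """Accumulate per-body 6DoF gradients/Hessians into joint space.

    gradients: list of (6,) arrays, one per rigid body
    hessians:  list of (6, 6) arrays, one per rigid body
    jacobians: list of (6, n_joints) arrays mapping joint velocities
               to each body's 6DoF twist (illustrative shapes)
    """
    n = jacobians[0].shape[1]
    g_theta = np.zeros(n)
    h_theta = np.zeros((n, n))
    for g, h, j in zip(gradients, hessians, jacobians):
        g_theta += j.T @ g      # chain rule: dE/dtheta = J^T dE/dx
        h_theta += j.T @ h @ j  # Gauss-Newton Hessian in joint space
    return g_theta, h_theta

def newton_step(g_theta, h_theta, damping=1e-6):
    """One damped Newton-like update on the joint parameters."""
    n = h_theta.shape[0]
    return -np.linalg.solve(h_theta + damping * np.eye(n), g_theta)
```

With this projection, any per-body Newton-like tracker can contribute its normal equations to a shared joint-space system, which is the gist of extending a rigid 6DoF method to kinematic structures.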
Self-Supervised Object-in-Gripper Segmentation from Robotic Motions
Accurate object segmentation is a crucial task in the context of robotic
manipulation. However, creating sufficient annotated training data for neural
networks is particularly time consuming and often requires manual labeling. To
this end, we propose a simple, yet robust solution for learning to segment
unknown objects grasped by a robot. Specifically, we exploit motion and
temporal cues in RGB video sequences. Using optical flow estimation we first
learn to predict segmentation masks of our given manipulator. Then, these
annotations are used in combination with motion cues to automatically
distinguish between background, manipulator and unknown, grasped object. In
contrast to existing systems our approach is fully self-supervised and
independent of precise camera calibration, 3D models or potentially imperfect
depth data. We perform a thorough comparison with alternative baselines and
approaches from literature. The object masks and views are shown to be suitable
training data for segmentation networks that generalize to novel environments
and also allow for watertight 3D reconstruction.
Comment: 15 pages, 11 figures. Video: https://www.youtube.com/watch?v=srEwuuIIgz
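The motion-cue idea can be sketched roughly as follows. This is a simplified illustration with assumed array shapes and a hypothetical threshold, not the paper's actual pipeline: pixels with significant optical-flow magnitude are treated as moving, and the predicted manipulator mask splits the moving region into manipulator and grasped object.

```python
import numpy as np

def split_motion_masks(flow, manipulator_mask, motion_thresh=1.0):
    """Split a frame into background / manipulator / grasped-object regions.

    flow:             (H, W, 2) optical flow between consecutive RGB frames
    manipulator_mask: (H, W) boolean mask predicted for the manipulator
    Returns an (H, W) label map: 0 = background, 1 = manipulator, 2 = object.
    """
    magnitude = np.linalg.norm(flow, axis=-1)
    moving = magnitude > motion_thresh           # anything that moves
    labels = np.zeros(flow.shape[:2], dtype=np.uint8)
    labels[moving & manipulator_mask] = 1        # moving and on the arm
    labels[moving & ~manipulator_mask] = 2       # moving, not the arm: object
    return labels
```

Labels produced this way can then serve as automatic annotations for training a segmentation network, which is the self-supervision loop the abstract describes.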
"What's This?" -- Learning to Segment Unknown Objects from Manipulation Sequences
We present a novel framework for self-supervised grasped object segmentation
with a robotic manipulator. Our method successively learns an agnostic
foreground segmentation followed by a distinction between manipulator and
object solely by observing the motion between consecutive RGB frames. In
contrast to previous approaches, we propose a single, end-to-end trainable
architecture which jointly incorporates motion cues and semantic knowledge.
Furthermore, while the motion of the manipulator and the object are substantial
cues for our algorithm, we present means to robustly deal with distraction
objects moving in the background, as well as with completely static scenes. Our
method neither depends on any visual registration of a kinematic robot or 3D
object models, nor on precise hand-eye calibration or any additional sensor
data. By extensive experimental evaluation we demonstrate the superiority of
our framework and provide detailed insights on its capability of dealing with
the aforementioned extreme cases of motion. We also show that training a
semantic segmentation network with the automatically labeled data achieves
results on par with manually annotated training data. Code and pretrained
models will be made publicly available.
Comment: 8 pages, 6 figures
Towards Robust Perception of Unknown Objects in the Wild
To be able to interact in dynamic and cluttered environments, detection and instance segmentation of only known objects is often not sufficient. Our recently proposed Instance Stereo Transformer (INSTR) addresses this problem by yielding pixel-wise instance masks of unknown items on dominant horizontal surfaces without requiring potentially noisy depth maps. To further boost the application of INSTR in a robotic domain, we propose two improvements: First, we extend the network to semantically label all non-object pixels, and experimentally validate that the additional explicit semantic information further enhances the object instance predictions. Second, since knowledge about some detected objects is often readily available, we utilize Dropout as an approximation of Bayesian inference to robustly classify the detected instances into known and unknown categories. The overall framework is well suited for various robotic applications, e.g. stone segmentation in planetary environments or unknown object grasping settings.
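The Dropout-based known/unknown classification can be sketched as follows. This is a minimal illustration assuming softmax outputs from T stochastic forward passes (dropout kept active at test time) and a hypothetical entropy threshold; the actual criterion used by the framework may differ.

```python
import numpy as np

def classify_known_unknown(mc_probs, entropy_thresh=0.5):
    """Flag a detection as known or unknown via Monte Carlo Dropout.

    mc_probs: (T, C) softmax outputs from T stochastic forward passes
              for one detected instance.
    Returns (predicted_class, is_known, predictive_entropy).
    """
    mean_probs = mc_probs.mean(axis=0)  # Bayesian model average
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12))
    predicted = int(np.argmax(mean_probs))
    # Low predictive entropy -> the sampled models agree -> treat as known.
    return predicted, bool(entropy < entropy_thresh), float(entropy)
```

When the stochastic passes disagree, the averaged distribution flattens and entropy rises, which is the signal used to route the instance into the "unknown" category.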
ReSyRIS: A Real-Synthetic Rock Instance Segmentation Dataset for Training and Benchmarking
The exploration of our solar system to understand its creation and investigate the potential for life on other celestial bodies is a fundamental drive of humankind. After early telescope-based observation, Apollo 11 was the first space mission able to collect samples on the lunar surface and take them back to Earth for analysis. Especially in recent years this trend has accelerated again, and many successor missions were (or are in the process of being) launched into space for extra-terrestrial sample extraction. Yet, the abundance of potential failures makes these missions extremely challenging. For operations aimed at deeper parts of the solar system, the operational working distance extends even further, and communication delay and limited bandwidth increase complexity. Consequently, sample extraction missions are designed to be more autonomous in order to carry out large parts without human intervention. One specific sub-task particularly suitable for automation is the identification of relevant extraction candidates. While several approaches for rock sample identification exist, there are often limiting factors in the form of a lack of applicable training data and suitable annotations, and unclear performance of the algorithms in extra-terrestrial environments because of inadequate test data. To address these issues, we present ReSyRIS (Real-Synthetic Rock Instance Segmentation Dataset), which consists of real-world images together with their manually created synthetic counterparts. The real-world part was collected in a quasi-extra-terrestrial environment on Mt. Etna in Sicily and focuses on recordings of several rock sample sites. Every scene is re-created in OAISYS, a Blender-based data generation pipeline for unstructured outdoor environments, for which the required meshes and textures are extracted from the volcano site.
This allows not only precise reconstruction of the scenes in a synthetic environment, but also generation of highly realistic training data with automatic annotations in a similar fashion to the real recordings. We finally investigate the generalization capability of a neural network trained on incrementally altered versions of the synthetic data to explore potential sim-to-real gaps. The real-world dataset, together with the OAISYS config files to create its synthetic counterpart, is publicly available at https://rm.dlr.de/resyris_en. With this novel benchmark on extra-terrestrial stone instance segmentation we hope to further push the boundaries of autonomous rock sample extraction.
Autonomous Rock Instance Segmentation for Extra-Terrestrial Robotic Missions
The collection and analysis of extra-terrestrial matter are two of the main motivations for space exploration missions. Due to the inherent risks for participating astronauts during space missions, autonomous robotic systems are often considered as a promising alternative. In recent years, many (inter)national space missions containing rovers to explore celestial bodies have been launched. Hereby, the communication delay as well as limited bandwidth creates a need for highly self-governed agents that require only infrequent interaction with scientists at a ground station. Such a setting is explored in the ARCHES mission, which seeks to investigate different means of collaboration between scientists and autonomous robots in extra-terrestrial environments. The analog mission focuses on a team of heterogeneous agents (two Lightweight Rover Units and ARDEA, a drone), which together perform various complex tasks under strict communication constraints. In this paper, we highlight three of these tasks that were successfully demonstrated during a one-month test mission on Mt. Etna in Sicily, Italy, which was chosen due to its similarity to the Moon in terms of geological structure. All three tasks have in common that they leverage an instance segmentation approach deployed on the rovers to detect rocks within camera imagery. The first application is a mapping scheme that incorporates semantically detected rocks into its environment model to safely navigate to points of interest. Secondly, we present a method for the collection and extraction of in-situ samples with a rover, which uses rock detection to localize relevant candidates to grasp. For the third task, we show the usefulness of stone segmentation to autonomously conduct a spectrometer measurement experiment. We perform a thorough analysis of the presented methods and evaluate our experimental results. The demonstrations on Mt. Etna show that our approaches are well suited for navigation, geological analysis, and sample extraction tasks within autonomous robotic extra-terrestrial missions.
Uncertainty Estimation for Planetary Robotic Terrain Segmentation
Terrain segmentation information is a crucial input for current and future planetary robotic missions. Labeling training data for terrain segmentation is a difficult task and can often cause semantic ambiguity. As a result, a large portion of an image usually remains unlabeled. Therefore, it is difficult to evaluate network performance on such regions. Worse is the problem of using such a network for inference, since the quality of predictions cannot be guaranteed if it is trained as a standard semantic segmentation network. This can be very dangerous for real autonomous robotic missions, since the network could predict any of the classes in a particular region, and the robot does not know how much of the prediction to trust. To overcome this issue, we investigate the benefits of uncertainty estimation for terrain segmentation. Knowing how certain the network is about its prediction is an important element for robust autonomous navigation. In this paper, we present neural networks which not only give a terrain segmentation prediction, but also an uncertainty estimate. We compare the different methods on the publicly released real-world Mars data from the MSL mission.
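One common way to obtain such pixel-wise uncertainty estimates, e.g. with Monte Carlo Dropout or an ensemble, is to average several stochastic forward passes and compute per-pixel predictive entropy. The following sketch (assumed shapes and abstention threshold, not necessarily the paper's exact method) marks high-entropy pixels as untrusted so a navigation stack can avoid relying on them:

```python
import numpy as np

def uncertainty_map(mc_probs, abstain_thresh=1.0):
    """Per-pixel predictive entropy from T stochastic segmentations.

    mc_probs: (T, C, H, W) per-pass class probabilities
    Returns an (H, W) label map, with pixels whose entropy exceeds the
    threshold set to 255 ("don't trust"), plus the entropy map itself.
    """
    mean = mc_probs.mean(axis=0)                            # (C, H, W)
    entropy = -np.sum(mean * np.log(mean + 1e-12), axis=0)  # (H, W)
    labels = mean.argmax(axis=0).astype(np.uint8)
    labels[entropy > abstain_thresh] = 255                  # abstain on uncertain terrain
    return labels, entropy
```

The entropy map gives the robot exactly the missing piece the abstract points at: a per-region measure of how much of the prediction to trust.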
BlenderProc2: A Procedural Pipeline for Photorealistic Rendering
BlenderProc2 is a procedural pipeline that can render realistic images for the training of neural networks. Our pipeline can be employed in various use cases, including segmentation, depth, normal and pose estimation, and many others. A key feature of our Blender extension is the simple-to-use Python API, designed to be easily extendable. Furthermore, many public datasets, such as 3D-FRONT (Fu et al., 2021) or ShapeNet (Chang et al., 2015), are already supported, making it easier to clutter synthetic scenes with additional objects.
Mobile Manipulation of a Laser-induced Breakdown Spectrometer for Planetary Exploration
Laser-induced Breakdown Spectrometry (LIBS) is an established analytical technique to measure the elemental composition of rocks and other matter on the Martian surface. We propose an autonomous in-contact sampling method based on an attachable LIBS instrument, designed to measure the composition of samples on the surface of planets and moons. The spectrometer module is picked up by our Lightweight Rover Unit (LRU) at the landing site and transported to the sampling location, where the manipulator establishes a solid contact between the instrument and the sample. The rover commands the instrument to trigger the measurement, which in turn releases a laser pulse and captures the spectrum of the resulting plasma. The in-contact deployment ensures a suitable focus distance for the spectrometer, without a focusing system that would add to the instrument's volume and weight, and allows for flexible deployment of the instrument. The autonomous software computes all necessary manipulation operations on board the rover and requires almost no supervision from mission control. We tested the LRU and the LIBS instrument at the moon analogue test site on Mt. Etna, Sicily, and successfully demonstrated multiple LIBS measurements, in which the rover automatically deployed the instrument on a rock sample, recorded a measurement, and sent the data to mission control, with sufficient quality to distinguish the major elements of the recorded sample.