Object segmentation in depth maps with one user click and a synthetically trained fully convolutional network
With more and more household objects built on planned obsolescence and
consumed by a fast-growing population, hazardous waste recycling has become a
critical challenge. Given the large variability of household waste, current
recycling platforms mostly rely on human operators to analyze the scene,
typically composed of many object instances piled up in bulk. Helping them by
robotizing the unitary extraction is a key challenge to speed up this tedious
process. Whereas supervised deep learning has proven very efficient for such
object-level scene understanding, e.g., generic object detection and
segmentation in everyday scenes, it requires large sets of per-pixel
labeled images, which are rarely available in many application contexts,
including industrial robotics. We thus propose a step towards a practical
interactive application for generating an object-oriented robotic grasp,
requiring as inputs only one depth map of the scene and one user click on the
next object to extract. More precisely, we address in this paper the
intermediate problem of object segmentation in top views of piles of bulk
objects, given a pixel location, called the seed, provided interactively by a
human operator. We
propose a twofold framework for generating edge-driven instance segments.
First, we repurpose a state-of-the-art fully convolutional object contour
detector for seed-based instance segmentation by introducing the notion of
edge-mask duality with a novel patch-free and contour-oriented loss function.
Second, we train one model using only synthetic scenes, instead of manually
labeled training data. Our experimental results show that considering edge-mask
duality for training an encoder-decoder network, as we suggest, outperforms a
state-of-the-art patch-based network in the present application context.
Comment: This is a pre-print of an article published in Human Friendly
Robotics, 10th International Workshop, Springer Proceedings in Advanced
Robotics, vol 7. The final authenticated version is available online at:
https://doi.org/10.1007/978-3-319-89327-3_16, Springer Proceedings in
Advanced Robotics, Siciliano Bruno, Khatib Oussama, In press, Human Friendly
Robotics, 10th International Workshop
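The link between a contour map and a seed-based instance mask described in the abstract above can be illustrated with a toy example. This is only a minimal sketch, not the authors' network or loss: it assumes a binary contour map is already available (here a hand-written 5x5 grid) and recovers the segment containing the clicked seed with a plain breadth-first flood fill. The function name `segment_from_seed` and the toy data are hypothetical.

```python
from collections import deque

def segment_from_seed(edges, seed):
    """Return the set of (row, col) pixels reachable from the seed
    without crossing a contour pixel (4-connectivity).

    Assumption: `edges` is a binary contour map (1 = object boundary).
    In the paper the contours come from a trained network; here they
    are given by hand for illustration only.
    """
    rows, cols = len(edges), len(edges[0])
    r0, c0 = seed
    if edges[r0][c0]:
        return set()  # the click landed on a contour pixel
    mask = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not edges[nr][nc] and (nr, nc) not in mask:
                mask.add((nr, nc))
                queue.append((nr, nc))
    return mask

# Hypothetical 5x5 contour map: a closed boundary around one pixel.
edges = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
inner = segment_from_seed(edges, (2, 2))  # click inside the object
outer = segment_from_seed(edges, (0, 0))  # click on the background
```

A click inside the closed contour yields only the enclosed region, while a click outside yields the surrounding background, which is the sense in which a clean contour map and a seed together determine an instance mask.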
Geometric Multi-Model Fitting with a Convex Relaxation Algorithm
We propose a novel method to fit and segment multi-structural data via convex
relaxation. Unlike greedy methods, which maximise the number of inliers, this
approach efficiently searches for a soft assignment of points to models by
minimising the energy of the overall classification. Our approach is similar to
state-of-the-art energy minimisation techniques which use a global energy.
However, we address the poor scaling of the original combinatorial problem
(as the number of models increases) by relaxing the solution. This relaxation
brings two advantages. First, by operating in the continuous domain we can
parallelize the calculations. Second, it allows for the use of different
metrics, which results in a more general formulation.
We demonstrate the versatility of our technique on two different problems of
estimating structure from images: plane extraction from RGB-D data and
homography estimation from pairs of images. In both cases, we report accurate
results on publicly available datasets, in most cases outperforming the
state of the art.
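To make the idea of a soft assignment of points to models concrete, here is a minimal sketch. It is not the algorithm from the paper, whose energy and solver are not given in the abstract. It assumes fixed 2D line models and swaps in a simple entropy-regularised relaxation, in which each point's hard label is replaced by weights on the probability simplex computed as a softmax of negative squared residuals. The names `soft_assign` and `temperature` are hypothetical.

```python
import math

def soft_assign(points, lines, temperature=0.1):
    """Return, for each point, non-negative weights over the line
    models that sum to 1.

    Assumption: models are lines y = a*x + b and the per-point cost is
    the squared vertical residual. The softmax is a stand-in relaxation,
    not the method of the paper.
    """
    weights = []
    for x, y in points:
        costs = [(y - (a * x + b)) ** 2 for a, b in lines]
        lowest = min(costs)  # subtract the minimum for numerical stability
        scores = [math.exp(-(c - lowest) / temperature) for c in costs]
        total = sum(scores)
        weights.append([s / total for s in scores])
    return weights

lines = [(1.0, 0.0), (0.0, 2.0)]  # y = x and y = 2
points = [(0.0, 0.0), (5.0, 5.0), (0.0, 2.0), (2.0, 2.0)]
weights = soft_assign(points, lines)
```

Points lying on a single line receive almost all their weight for that model, while the point (2, 2), which lies on both lines, is split evenly. Each point is processed independently, which is the property that makes continuous relaxations of this kind easy to parallelize.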
Deep Projective 3D Semantic Segmentation
Semantic segmentation of 3D point clouds is a challenging problem with
numerous real-world applications. While deep learning has revolutionized the
field of image semantic segmentation, its impact on point cloud data has been
limited so far. Recent attempts, based on 3D deep learning approaches
(3D-CNNs), have achieved below-expected results. Such methods require
voxelizations of the underlying point cloud data, leading to decreased spatial
resolution and increased memory consumption. Additionally, 3D-CNNs greatly
suffer from the limited availability of annotated datasets.
In this paper, we propose an alternative framework that avoids the
limitations of 3D-CNNs. Instead of directly solving the problem in 3D, we first
project the point cloud onto a set of synthetic 2D-images. These images are
then used as input to a 2D-CNN, designed for semantic segmentation. Finally,
the obtained prediction scores are re-projected to the point cloud to obtain
the segmentation results. We further investigate the impact of multiple
modalities, such as color, depth and surface normals, in a multi-stream network
architecture. Experiments are performed on the recent Semantic3D dataset. Our
approach sets a new state of the art by achieving a relative gain of 7.9%
compared to the previous best approach.
Comment: Submitted to CAIP 201
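The project, predict, re-project pipeline described in the abstract above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses a single orthographic top-down view instead of a set of synthetic images, and a hand-written table of per-pixel class scores stands in for the 2D-CNN output. All function names and data are hypothetical.

```python
def project_points(points, cell=1.0):
    """Orthographic top-down projection: map each 3D point (x, y, z)
    to an integer pixel (col, row). Assumption: one view only; the
    paper projects onto a set of synthetic images."""
    return [(int(x // cell), int(y // cell)) for x, y, z in points]

def reproject_scores(pixels, score_image):
    """Give each point the class scores of the pixel it projects to."""
    return [score_image[row][col] for col, row in pixels]

def label_points(points, score_image, cell=1.0):
    """Project, look up scores, and take the highest-scoring class."""
    pixels = project_points(points, cell)
    scores = reproject_scores(pixels, score_image)
    return [max(range(len(s)), key=s.__getitem__) for s in scores]

# Hypothetical 2x2 image of per-pixel scores for two classes,
# standing in for the output of a 2D segmentation network.
score_image = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.6, 0.4], [0.3, 0.7]],
]
points = [(0.2, 0.3, 1.0), (1.5, 0.1, 0.4), (0.7, 1.9, 2.2), (1.1, 1.2, 0.0)]
labels = label_points(points, score_image)
```

With several views, the scores gathered for each point would have to be fused before taking the maximum; how the paper performs that fusion is not stated in the abstract, so it is left out here.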