Object segmentation in depth maps with one user click and a synthetically trained fully convolutional network
With more and more household objects built on planned obsolescence and
consumed by a fast-growing population, hazardous waste recycling has become a
critical challenge. Given the large variability of household waste, current
recycling platforms mostly rely on human operators to analyze the scene,
typically composed of many object instances piled up in bulk. Helping them by
robotizing the unitary extraction is a key challenge to speed up this tedious
process. Although supervised deep learning has proven very effective for such
object-level scene understanding, e.g., generic object detection and
segmentation in everyday scenes, it requires large sets of per-pixel
labeled images, which are hardly available in many application contexts,
including industrial robotics. We thus propose a step towards a practical
interactive application for generating an object-oriented robotic grasp,
requiring as inputs only one depth map of the scene and one user click on the
next object to extract. More precisely, we address in this paper the
intermediate problem of object segmentation in top views of piles of bulk
objects given a pixel location, called the seed, provided interactively by a
human operator. We
propose a twofold framework for generating edge-driven instance segments.
First, we repurpose a state-of-the-art fully convolutional object contour
detector for seed-based instance segmentation by introducing the notion of
edge-mask duality with a novel patch-free and contour-oriented loss function.
Second, we train one model using only synthetic scenes, instead of manually
labeled training data. Our experimental results show that considering edge-mask
duality for training an encoder-decoder network, as we suggest, outperforms a
state-of-the-art patch-based network in the present application context.
Comment: This is a pre-print of an article published in Human Friendly
Robotics, 10th International Workshop, Springer Proceedings in Advanced
Robotics, vol 7 (Siciliano Bruno, Khatib Oussama; in press). The final
authenticated version is available online at:
https://doi.org/10.1007/978-3-319-89327-3_16
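The edge-mask duality mentioned in the abstract above can be illustrated geometrically: a closed contour together with a seed pixel determines a mask, and a mask determines its own contour. The sketch below is only that illustration, assuming a clean binary edge map and 4-connectivity. It is not the authors' network or loss function, and the names `segment_from_seed` and `mask_to_edges` are hypothetical.

```python
from collections import deque

import numpy as np


def segment_from_seed(edges, seed):
    """Flood-fill the region containing `seed`, stopping at edge pixels.

    edges: 2-D boolean array, True where a contour pixel was detected.
    seed:  (row, col) tuple, e.g. the pixel clicked by the operator.
    """
    height, width = edges.shape
    mask = np.zeros((height, width), dtype=bool)
    if edges[seed]:
        return mask  # the seed lies on a contour, so there is no region to fill
    mask[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four direct neighbours (4-connectivity).
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < height and 0 <= nc < width
            if inside and not edges[nr, nc] and not mask[nr, nc]:
                mask[nr, nc] = True
                queue.append((nr, nc))
    return mask


def mask_to_edges(mask):
    """Return the inner boundary of `mask`: mask pixels with a 4-neighbour outside it."""
    padded = np.pad(mask, 1, constant_values=False)
    interior = (
        padded[:-2, 1:-1] & padded[2:, 1:-1] & padded[1:-1, :-2] & padded[1:-1, 2:]
    )
    return mask & ~interior
```

In practice a predicted contour map is rarely perfectly closed, so a plain flood fill would leak through gaps; this is one reason a learned, mask-aware objective is attractive.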
Establishing the behavioural limits for countershaded camouflage
Countershading is a ubiquitous patterning of animals whereby the side that typically faces the highest illumination is darker. When tuned to specific lighting conditions and body orientation with respect to the light field, countershading minimizes the gradient of light the body reflects by counterbalancing shadowing due to illumination, and has therefore classically been thought of as an adaptation for visual camouflage. However, whether and how crypsis degrades when body orientation with respect to the light field is non-optimal has never been studied. We tested the behavioural limits on body orientation for countershading to deliver effective visual camouflage. We asked human participants to detect a countershaded target in a simulated three-dimensional environment. The target was optimally coloured for crypsis in a reference orientation and was displayed at different orientations. Search performance dramatically improved for deviations beyond 15 degrees. Detection time was significantly shorter and accuracy significantly higher than when the target orientation matched the countershading pattern. This work demonstrates the importance of maintaining body orientation appropriate for the displayed camouflage pattern, suggesting a possible selective pressure for animals to orient themselves appropriately to enhance crypsis
Children, Humanoid Robots and Caregivers
This paper presents developmental learning on a humanoid robot from human-robot interactions. We consider in particular teaching humanoids as children during the child's Separation and Individuation developmental phase (Mahler, 1979). Cognitive development during this phase is characterized both by the child's dependence on her mother for learning while becoming aware of her own individuality, and by self-exploration of her physical surroundings. We propose a learning framework for a humanoid robot inspired by this cognitive development
Learning to Divide and Conquer for Online Multi-Target Tracking
Online Multiple Target Tracking (MTT) is often addressed within the
tracking-by-detection paradigm. Detections are first extracted
independently in each frame, and object trajectories are then built by
maximizing specifically designed coherence functions. Nevertheless, ambiguities
arise in the presence of occlusions or detection errors. In this paper we claim
that the ambiguities in tracking could be solved by a selective use of the
features, by working with more reliable features if possible and exploiting a
deeper representation of the target only if necessary. To this end, we propose
an online divide-and-conquer tracker for static camera scenes, which partitions
the assignment problem into local subproblems and solves them by selectively
choosing and combining the best features. The complete framework is cast as a
structural learning task that unifies these phases and learns tracker
parameters from examples. Experiments on two different datasets highlight a
significant improvement in tracking performance (MOTA +10%) over the state of
the art
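The abstract above reports its gain in MOTA without defining the metric. As a reminder, under the usual CLEAR MOT definition (assumed here, not spelled out in the abstract), MOTA is one minus the ratio of all tracking errors (misses, false positives, identity switches) to the total number of ground-truth objects, summed over frames. The helper below is a minimal sketch; the function name and the per-frame tuple layout are hypothetical.

```python
def mota(frames):
    """Compute MOTA from per-frame error counts.

    frames: iterable of (misses, false_positives, id_switches, ground_truth)
            tuples, one per video frame.
    Returns 1 - (total errors / total ground-truth objects).
    """
    total_errors = 0
    total_ground_truth = 0
    for misses, false_positives, id_switches, ground_truth in frames:
        total_errors += misses + false_positives + id_switches
        total_ground_truth += ground_truth
    if total_ground_truth == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    return 1.0 - total_errors / total_ground_truth


# Two frames with four ground-truth objects each:
# one miss in the first frame, one false positive in the second.
score = mota([(1, 0, 0, 4), (0, 1, 0, 4)])
```

Because errors can outnumber ground-truth objects, MOTA can be negative; a perfect tracker scores 1.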
Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene Understanding
Recent trends in image understanding have pushed for holistic scene
understanding models that jointly reason about various tasks such as object
detection, scene recognition, shape analysis, contextual reasoning, and local
appearance based classifiers. In this work, we are interested in understanding
the roles of these different tasks in improved scene understanding, in
particular semantic segmentation, object detection and scene recognition.
Towards this goal, we "plug in" human subjects for each of the various
components in a state-of-the-art conditional random field model. Comparisons
among various hybrid human-machine CRFs give us indications of how much "head
room" there is to improve scene understanding by focusing research efforts on
various individual tasks