129 research outputs found
Enhancing semantic segmentation with detection priors and iterated graph cuts for robotics
To foster human–robot interaction, autonomous robots need to understand the environment in which they operate. In this context, one of the main challenges is semantic segmentation, together with the recognition of important objects, which can aid robots during exploration, as well as when planning new actions and interacting with the environment. In this study, we extend a multi-view semantic segmentation system based on 3D Entangled Forests (3DEF) by integrating and refining two object detectors, Mask R-CNN and You Only Look Once (YOLO), with Bayesian fusion and iterated graph cuts. The new system combines the strengths of its components, successfully exploiting both 2D and 3D data. Our experiments show that our approach is competitive with the state of the art and leads to accurate semantic segmentations.
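The Bayesian fusion step described above can be illustrated with a minimal naive-Bayes sketch: per-pixel class probabilities from several detectors are combined with a prior by taking their product and renormalizing. The function name `bayes_fuse` and the conditional-independence assumption are illustrative; this is not the paper's actual pipeline.

```python
import numpy as np

def bayes_fuse(prior, likelihoods):
    """Fuse per-pixel class probabilities from several detectors with a
    prior, assuming conditional independence (naive-Bayes sketch).

    prior:       array of shape (..., n_classes)
    likelihoods: list of arrays, each of shape (..., n_classes)
    """
    post = prior.astype(float).copy()
    for lk in likelihoods:
        post *= lk                                 # product of evidence
    post /= post.sum(axis=-1, keepdims=True)       # renormalize per pixel
    return post
```

In the actual system the fused probabilities would then serve as unary terms for the iterated graph cuts; here the sketch stops at the posterior.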
Real-time tracking of 3D elastic objects with an RGB-D sensor
This paper presents a method to track, in real time, a textureless 3D object undergoing rigid motions and large deformations, such as elastic ones, using the point cloud data provided by an RGB-D sensor. This solution is expected to be useful for enhanced manipulation by humanoid robotic systems. Our framework relies on a prior visual segmentation of the object in the image. The segmented point cloud is registered first in a rigid manner and then by non-rigidly fitting the mesh, based on the Finite Element Method to model elasticity, and on geometrical point-to-point correspondences to compute the external forces exerted on the mesh. The real-time performance of the system is demonstrated on synthetic and real data involving challenging deformations and motions.
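The external-force term derived from point-to-point correspondences can be sketched as a linear spring pulling each mesh vertex toward its matched point-cloud point, followed by an explicit-Euler update. The names `external_forces` and `step`, and the spring model itself, are illustrative assumptions; the paper's FEM elasticity is not modeled here.

```python
import numpy as np

def external_forces(vertices, targets, stiffness=1.0):
    """Spring-like correspondence forces: each mesh vertex (N, 3) is
    pulled toward its matched point-cloud point (N, 3)."""
    return stiffness * (targets - vertices)

def step(vertices, targets, stiffness=1.0, dt=0.1):
    """One explicit-Euler update moving vertices along the forces
    (internal elastic forces from the FEM are omitted in this sketch)."""
    return vertices + dt * external_forces(vertices, targets, stiffness)
```

In the full method these external forces would be balanced against the FEM's internal elastic forces rather than integrated directly.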
DepthCut: Improved Depth Edge Estimation Using Multiple Unreliable Channels
In the context of scene understanding, a variety of methods exists to
estimate different information channels from mono or stereo images, including
disparity, depth, and normals. Although several advances have been reported in
recent years for these tasks, the estimated information is often imprecise,
particularly near depth discontinuities or creases. Studies have shown, however,
that precisely such depth edges carry critical cues for the perception of
shape, and play important roles in tasks like depth-based segmentation or
foreground selection. Unfortunately, the currently extracted channels often
carry conflicting signals, making it difficult for subsequent applications to
effectively use them. In this paper, we focus on the problem of obtaining
high-precision depth edges (i.e., depth contours and creases) by jointly
analyzing such unreliable information channels. We propose DepthCut, a
data-driven fusion of the channels using a convolutional neural network trained
on a large dataset with known depth. The resulting depth edges can be used for
segmentation, decomposing a scene into depth layers with relatively flat depth,
or improving the accuracy of the depth estimate near depth edges by
constraining its gradients to agree with these edges. Quantitatively, we
compare against 15 variants of baselines and demonstrate that our depth edges
result in an improved segmentation performance and an improved depth estimate
near depth edges compared to data-agnostic channel fusion. Qualitatively, we
demonstrate that the depth edges result in superior segmentation and depth
orderings.
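The refinement step described above, constraining the depth gradients to agree with the detected edges, can be sketched as edge-aware iterative smoothing: each pixel is repeatedly replaced by the mean of its four neighbors except where a depth edge was detected. This is a simplified stand-in for the paper's actual optimization, and `refine_depth` is a hypothetical name.

```python
import numpy as np

def refine_depth(depth, edge_mask, iters=50):
    """Edge-aware smoothing sketch: average each pixel with its four
    neighbors, but keep pixels on detected depth edges fixed, so depth
    gradients concentrate at the edges.

    depth:     (H, W) float array
    edge_mask: (H, W) bool array, True where a depth edge was detected
    """
    d = depth.astype(float).copy()
    for _ in range(iters):
        avg = (np.roll(d, -1, axis=0) + np.roll(d, 1, axis=0) +
               np.roll(d, -1, axis=1) + np.roll(d, 1, axis=1)) / 4.0
        d = np.where(edge_mask, d, avg)  # edge pixels stay untouched
    return d
```

Note that `np.roll` wraps around the image border, which a production implementation would replace with proper boundary handling.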
Interactive Segmentation of Radiance Fields
Radiance Fields (RF) are a popular representation of casually captured scenes
for novel view generation, and they have been used for applications beyond it.
Understanding and manipulating scenes represented as RFs naturally follows,
to facilitate mixed reality in personal spaces. Semantic segmentation of
objects in the 3D scene is an important step for that. Prior segmentation
efforts using feature distillation show promise but don't scale to complex
objects with diverse appearance. We present a framework to interactively
segment objects with fine structure. Nearest neighbor feature matching
identifies high-confidence regions of the objects using distilled features.
Bilateral filtering in a joint spatio-semantic space grows the region to
recover accurate segmentation. We show state-of-the-art results of segmenting
objects from RFs and compositing them to another scene, changing appearance,
etc., moving closer to rich scene manipulation and understanding.
Project Page: https://rahul-goel.github.io/isrf/
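The high-confidence seeding step described above can be sketched with a brute-force nearest-neighbour match: a point is flagged when its distilled feature lies within a threshold of any user-selected query feature. The name `seed_region` and the plain Euclidean distance with a fixed threshold are assumptions for illustration.

```python
import numpy as np

def seed_region(features, query_feats, thresh=0.5):
    """Nearest-neighbour feature matching sketch.

    features:    (N, D) distilled features of scene points
    query_feats: (Q, D) features from user-selected seeds
    Returns a boolean mask over the N points marking the
    high-confidence region.
    """
    # Pairwise distances between every point and every query, shape (N, Q).
    d = np.linalg.norm(features[:, None, :] - query_feats[None, :, :], axis=-1)
    return d.min(axis=1) < thresh
```

In the full method this seed mask would then be grown by bilateral filtering in the joint spatio-semantic space; the sketch stops at the seeds.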
Analysis of the hands in egocentric vision: A survey
Egocentric vision (a.k.a. first-person vision - FPV) applications have
thrived over the past few years, thanks to the availability of affordable
wearable cameras and large annotated datasets. The position of the wearable
camera (usually mounted on the head) allows recording exactly what the camera
wearers have in front of them, in particular hands and manipulated objects.
This intrinsic advantage enables the study of the hands from multiple
perspectives: localizing hands and their parts within the images; understanding
what actions and activities the hands are involved in; and developing
human-computer interfaces that rely on hand gestures. In this survey, we review
the literature that focuses on the hands using egocentric vision, categorizing
the existing approaches into: localization (where are the hands or parts of
them?); interpretation (what are the hands doing?); and application (e.g.,
systems that used egocentric hand cues for solving a specific problem).
Moreover, a list of the most prominent datasets with hand-based annotations is
provided.
Visual Perception of Garments for their Robotic Manipulation
The presented work addresses the visual perception of garments applied to their robotic manipulation. Various types of garments are considered in typical perception and manipulation tasks, including their classification, folding or unfolding. Our work is motivated by the possibility of having humanoid household robots perform these tasks for us in the future, as well as by industrial applications. The main challenge is the high deformability of garments, which can be posed in infinitely many configurations with a significantly varying appearance.
- …