Hierarchical Salient Object Detection for Assisted Grasping
Visual scene decomposition into semantic entities is one of the major
challenges when creating a reliable object grasping system. Recently, we
introduced a bottom-up hierarchical clustering approach which is able to
segment objects and parts in a scene. In this paper, we introduce a transform
from such a segmentation into a corresponding, hierarchical saliency function.
In comprehensive experiments we demonstrate its ability to detect salient
objects in a scene. Furthermore, this hierarchical saliency defines a most
salient corresponding region (scale) for every point in an image. Based on
this, an easy-to-use pick-and-place manipulation system was developed and
tested in example scenarios.
Comment: Accepted for ICRA 201
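A minimal sketch (not the authors' implementation) of the scale-selection step described above: given saliency scores attached to regions from a bottom-up segmentation hierarchy, pick, for every pixel, the most salient region (scale) that contains it. The region representation, the score values, and the `most_salient_scale` helper are illustrative assumptions.

```python
# Minimal sketch, assuming regions are (mask, saliency) pairs produced by
# some hierarchical clustering; masks are boolean HxW arrays and the
# saliency scores are hypothetical per-region values.
import numpy as np

def most_salient_scale(regions, shape):
    """For each pixel, return the index of the most salient region (scale)
    containing it, or -1 if no region covers the pixel."""
    best_saliency = np.full(shape, -np.inf)
    best_region = np.full(shape, -1, dtype=int)
    for idx, (mask, saliency) in enumerate(regions):
        update = mask & (saliency > best_saliency)
        best_saliency[update] = saliency
        best_region[update] = idx
    return best_region

# Example with two nested regions (an "object" and one of its "parts"):
h, w = 4, 4
obj = np.zeros((h, w), bool); obj[1:4, 1:4] = True
part = np.zeros((h, w), bool); part[2:4, 2:4] = True
print(most_salient_scale([(obj, 0.4), (part, 0.7)], (h, w)))
```

A grasping pipeline could use the resulting per-pixel scale map to decide whether to pick up an entire object or only one of its parts.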
Scale-Adaptive Neural Dense Features: Learning via Hierarchical Context Aggregation
How do computers and intelligent agents view the world around them? Feature
extraction and representation constitutes one of the basic building blocks towards
answering this question. Traditionally, this has been done with carefully
engineered hand-crafted techniques such as HOG, SIFT or ORB. However, there is
no "one size fits all" approach that satisfies all requirements. In recent
years, the rising popularity of deep learning has resulted in a myriad of
end-to-end solutions to many computer vision problems. These approaches, while
successful, tend to lack scalability and cannot easily exploit information
learned by other systems. Instead, we propose SAND features, a dedicated deep
learning solution to feature extraction capable of providing hierarchical
context information. This is achieved by employing sparse relative labels
indicating relationships of similarity/dissimilarity between image locations.
The nature of these labels results in an almost infinite set of dissimilar
examples to choose from. We demonstrate how the selection of negative examples
during training can be used to modify the feature space and vary its
properties. To demonstrate the generality of this approach, we apply the
proposed features to a multitude of tasks, each requiring different properties.
This includes disparity estimation, semantic segmentation, self-localisation
and SLAM. In all cases, we show that incorporating SAND features yields results
better than or comparable to the baseline, whilst requiring little to no
additional training. Code can be found at:
https://github.com/jspenmar/SAND_features
Comment: CVPR201
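A minimal sketch of training dense per-pixel features from sparse relative labels, assuming a contrastive-style hinge loss over sampled pixel pairs; the `relative_label_loss` helper, the margin value, and the loss form are illustrative assumptions, not the released SAND code.

```python
# Minimal sketch: feats_a / feats_b are (C, H, W) feature maps from any dense
# encoder; `pairs` holds pixel correspondences labelled similar (1) or
# dissimilar (0). All values below are illustrative.
import torch
import torch.nn.functional as F

def relative_label_loss(feats_a, feats_b, pairs, margin=0.5):
    loss = feats_a.new_zeros(())
    for (ya, xa), (yb, xb), similar in pairs:
        fa = F.normalize(feats_a[:, ya, xa], dim=0)
        fb = F.normalize(feats_b[:, yb, xb], dim=0)
        dist = (fa - fb).pow(2).sum().sqrt()
        if similar:
            loss = loss + dist.pow(2)                    # pull similar locations together
        else:
            loss = loss + F.relu(margin - dist).pow(2)   # push dissimilar locations apart
    return loss / max(len(pairs), 1)

# Toy usage: negatives can be drawn from almost anywhere in the image, so the
# negative-sampling strategy shapes the learned feature space.
feats_a = torch.randn(32, 64, 64, requires_grad=True)
feats_b = torch.randn(32, 64, 64)
pairs = [((10, 12), (11, 13), 1), ((10, 12), (50, 3), 0)]
relative_label_loss(feats_a, feats_b, pairs).backward()
```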
Background Prompting for Improved Object Depth
Estimating the depth of objects from a single image is a valuable task for
many vision, robotics, and graphics applications. However, current methods
often fail to produce accurate depth for objects in diverse scenes. In this
work, we propose a simple yet effective Background Prompting strategy that
adapts the input object image with a learned background. We learn the
background prompts using only small-scale synthetic object datasets. To infer
object depth on a real image, we place the segmented object into the learned
background prompt and run off-the-shelf depth networks. Background Prompting
helps the depth networks focus on the foreground object, as they are made
invariant to background variations. Moreover, Background Prompting minimizes
the domain gap between synthetic and real object images, leading to better
sim2real generalization than simple finetuning. Results on multiple synthetic
and real datasets demonstrate consistent improvements in real object depths for
a variety of existing depth networks. Code and optimized background prompts can
be found at: https://mbaradad.github.io/depth_prompt
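A minimal sketch of the compositing idea, assuming a learnable background image that a segmented object is pasted onto before running a frozen depth network; `BackgroundPrompt`, the stand-in `depth_net`, and the foreground-only loss are illustrative assumptions, not the paper's released code.

```python
# Minimal sketch: optimize a background "prompt" on synthetic data while the
# depth network stays frozen; at test time, real objects are composited onto
# the same learned background.
import torch

class BackgroundPrompt(torch.nn.Module):
    def __init__(self, height=256, width=256):
        super().__init__()
        # Learnable background image (the "prompt").
        self.prompt = torch.nn.Parameter(torch.rand(1, 3, height, width))

    def forward(self, object_rgb, object_mask):
        # Paste the segmented object over the learned background.
        return object_mask * object_rgb + (1 - object_mask) * self.prompt

def training_step(depth_net, bg_prompt, object_rgb, object_mask, gt_depth, opt):
    pred = depth_net(bg_prompt(object_rgb, object_mask))
    # Supervise foreground pixels only; gradients reach the prompt through
    # the frozen depth network.
    loss = ((pred - gt_depth).abs() * object_mask).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Trivial stand-in for an off-the-shelf depth network (assumption).
depth_net = torch.nn.Conv2d(3, 1, 3, padding=1)
for p in depth_net.parameters():
    p.requires_grad_(False)

bg_prompt = BackgroundPrompt()
opt = torch.optim.Adam(bg_prompt.parameters(), lr=1e-3)
rgb, mask = torch.rand(1, 3, 256, 256), (torch.rand(1, 1, 256, 256) > 0.5).float()
training_step(depth_net, bg_prompt, rgb, mask, torch.rand(1, 1, 256, 256), opt)
```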