An Iterative Co-Saliency Framework for RGBD Images
As a newly emerging and significant topic in the computer vision community,
co-saliency detection aims at discovering the common salient objects in
multiple related images. The existing methods often generate the co-saliency
map through a direct forward pipeline which is based on the designed cues or
initialization, but lack the refinement-cycle scheme. Moreover, they mainly
focus on RGB images and ignore the depth information in RGBD images. In this
paper, we propose an iterative RGBD co-saliency framework, which utilizes the
existing single saliency maps as the initialization, and generates the final
RGBD co-saliency map by using a refinement-cycle model. Three schemes are
employed in the proposed RGBD co-saliency framework, which include the addition
scheme, deletion scheme, and iteration scheme. The addition scheme is used to
highlight the salient regions based on intra-image depth propagation and
saliency propagation, while the deletion scheme filters the saliency regions
and removes the non-common salient regions based on an inter-image constraint. The
iteration scheme is proposed to obtain a more homogeneous and consistent
co-saliency map. Furthermore, a novel descriptor, named depth shape prior, is
proposed in the addition scheme to introduce the depth information to enhance
identification of co-salient objects. The proposed method can effectively
exploit any existing 2D saliency model to work well in RGBD co-saliency
scenarios. Experiments on two RGBD co-saliency datasets demonstrate the
effectiveness of our proposed framework.
Comment: 13 pages, 13 figures, Accepted by IEEE Transactions on Cybernetics
2017. Project URL: https://rmcong.github.io/proj_RGBD_cosal_tcyb.htm
RGB-D Salient Object Detection: A Survey
Salient object detection (SOD), which simulates the human visual perception
system to locate the most attractive object(s) in a scene, has been widely
applied to various computer vision tasks. With the advent of depth
sensors, depth maps carrying rich spatial information, which can help
boost the performance of SOD, can now be captured easily. Although various RGB-D
based SOD models with promising performance have been proposed over the past
several years, an in-depth understanding of these models and challenges in this
topic remains lacking. In this paper, we provide a comprehensive survey of
RGB-D based SOD models from various perspectives, and review related benchmark
datasets in detail. Further, considering that the light field can also provide
depth maps, we review SOD models and popular benchmark datasets from this
domain as well. Moreover, to investigate the SOD ability of existing models, we
carry out a comprehensive evaluation, as well as attribute-based evaluation of
several representative RGB-D based SOD models. Finally, we discuss several
challenges and open directions of RGB-D based SOD for future research. All
collected models, benchmark datasets, source code links, datasets constructed
for attribute-based evaluation, and codes for evaluation will be made publicly
available at https://github.com/taozh2017/RGBDSODsurvey
Comment: 24 pages, 12 figures. Has been accepted by Computational Visual Medi
CIR-Net: Cross-modality Interaction and Refinement for RGB-D Salient Object Detection
Focusing on the issue of how to effectively capture and utilize
cross-modality information in the RGB-D salient object detection (SOD) task, we
present a convolutional neural network (CNN) model, named CIR-Net, based on
novel cross-modality interaction and refinement. For the cross-modality
interaction, 1) a progressive attention guided integration unit is proposed to
sufficiently integrate RGB-D feature representations in the encoder stage, and
2) a convergence aggregation structure is proposed, which feeds the RGB and
depth decoding features into the corresponding RGB-D decoding streams via an
importance gated fusion unit in the decoder stage. For the cross-modality
refinement, we insert a refinement middleware structure between the encoder and
the decoder, in which the RGB, depth, and RGB-D encoder features are further
refined by successively using a self-modality attention refinement unit and a
cross-modality weighting refinement unit. Finally, with the gradually refined
features, we predict the saliency map in the decoder stage. Extensive
experiments on six popular RGB-D SOD benchmarks demonstrate that our network
outperforms the state-of-the-art saliency detectors both qualitatively and
quantitatively.
Comment: Accepted by IEEE Transactions on Image Processing 2022, 16 pages, 11
figure
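The abstract above names an importance gated fusion unit but does not specify it. As a rough illustration of the general idea only, the sketch below blends two feature vectors with a sigmoid gate, so that each fused value is a convex combination of the RGB and depth values. The function name `gated_fusion` and the parameters `w_rgb`, `w_depth`, and `bias` are hypothetical and are not taken from CIR-Net.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(rgb_feat, depth_feat, w_rgb=1.0, w_depth=-1.0, bias=0.0):
    """Blend two equal-length feature vectors with a per-element sigmoid gate.

    Hypothetical illustration: the gate weights would normally be learned;
    here they are fixed scalars chosen only for demonstration.
    """
    fused = []
    for r, d in zip(rgb_feat, depth_feat):
        # Gate near 1 favours the RGB value, near 0 favours the depth value.
        gate = sigmoid(w_rgb * r + w_depth * d + bias)
        fused.append(gate * r + (1.0 - gate) * d)
    return fused


rgb = [0.9, 0.1, 0.5]
depth = [0.2, 0.8, 0.5]
fused = gated_fusion(rgb, depth)
```

Because the gate lies strictly between 0 and 1, every fused value stays between the corresponding RGB and depth inputs, which is the property that makes this kind of gating a soft selection rather than a hard switch.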
RGBD Salient Object Detection via Deep Fusion
Numerous efforts have been made to design different low-level saliency cues
for RGBD saliency detection, such as color or depth contrast features,
background and color compactness priors. However, how these saliency cues
interact with each other and how to incorporate these low-level saliency cues
effectively to generate a master saliency map remains a challenging problem. In
this paper, we design a new convolutional neural network (CNN) to fuse
different low-level saliency cues into hierarchical features for automatically
detecting salient objects in RGBD images. In contrast to existing works
that directly feed raw image pixels to the CNN, the proposed method takes
advantage of the knowledge in traditional saliency detection by adopting
various meaningful and well-designed saliency feature vectors as input. This
can guide the training of the CNN towards detecting salient objects more effectively
due to the reduced learning ambiguity. We then integrate a Laplacian
propagation framework with the learned CNN to extract a spatially consistent
saliency map by exploiting the intrinsic structure of the input image.
Extensive quantitative and qualitative experimental evaluations on three
datasets demonstrate that the proposed method consistently outperforms
state-of-the-art methods.
Comment: This paper has been submitted to IEEE Transactions on Image
Processin
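The abstract above mentions Laplacian propagation for spatial consistency without giving its formulation. The sketch below is a generic graph-Laplacian smoothing step, not the paper's method: it solves (I + lam * L) s = s0 by Jacobi iteration, where L = D - W is the Laplacian of a region-adjacency graph. The function name `propagate_saliency`, the edge-list format, and the parameters `lam` and `iters` are assumptions made for illustration.

```python
def propagate_saliency(initial, edges, lam=1.0, iters=200):
    """Smooth per-region saliency scores over a weighted undirected graph.

    Jacobi iterations for (I + lam * L) s = initial, with L = D - W.
    `edges` is a list of (i, j, weight) tuples with non-negative weights.
    """
    n = len(initial)
    neighbors = [[] for _ in range(n)]
    for i, j, w in edges:
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))

    s = list(initial)
    for _ in range(iters):
        # Each score becomes a weighted average of its own initial value
        # and the current scores of its neighbours.
        s = [
            (initial[i] + lam * sum(w * s[j] for j, w in neighbors[i]))
            / (1.0 + lam * sum(w for _, w in neighbors[i]))
            for i in range(n)
        ]
    return s


# Toy example: a chain of four regions with alternating initial saliency.
initial = [1.0, 0.0, 1.0, 0.0]
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
smoothed = propagate_saliency(initial, edges, lam=1.0)
```

Larger `lam` pulls neighbouring regions closer together, while `lam = 0` returns the initial scores unchanged. In practice the edge weights would come from an affinity between regions, such as color or depth similarity.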
Recurrent Attentional Networks for Saliency Detection
Convolutional-deconvolution networks can be adopted to perform end-to-end
saliency detection. However, they do not work well with objects of multiple scales.
To overcome such a limitation, in this work, we propose a recurrent attentional
convolutional-deconvolution network (RACDNN). Using spatial transformer and
recurrent network units, RACDNN is able to iteratively attend to selected image
sub-regions to perform saliency refinement progressively. Besides tackling the
scale problem, RACDNN can also learn context-aware features from past
iterations to enhance saliency refinement in future iterations. Experiments on
several challenging saliency detection datasets validate the effectiveness of
RACDNN, and show that RACDNN outperforms state-of-the-art saliency detection
methods.
Comment: CVPR 201