4,194 research outputs found
An Iterative Co-Saliency Framework for RGBD Images
As a newly emerging and significant topic in computer vision community,
co-saliency detection aims at discovering the common salient objects in
multiple related images. The existing methods often generate the co-saliency
map through a direct forward pipeline which is based on the designed cues or
initialization, but lack the refinement-cycle scheme. Moreover, they mainly
focus on RGB image and ignore the depth information for RGBD images. In this
paper, we propose an iterative RGBD co-saliency framework, which utilizes the
existing single saliency maps as the initialization, and generates the final
RGBD cosaliency map by using a refinement-cycle model. Three schemes are
employed in the proposed RGBD co-saliency framework, which include the addition
scheme, deletion scheme, and iteration scheme. The addition scheme is used to
highlight the salient regions based on intra-image depth propagation and
saliency propagation, while the deletion scheme filters the saliency regions
and removes the non-common salient regions based on interimage constraint. The
iteration scheme is proposed to obtain more homogeneous and consistent
co-saliency map. Furthermore, a novel descriptor, named depth shape prior, is
proposed in the addition scheme to introduce the depth information to enhance
identification of co-salient objects. The proposed method can effectively
exploit any existing 2D saliency model to work well in RGBD co-saliency
scenarios. The experiments on two RGBD cosaliency datasets demonstrate the
effectiveness of our proposed framework.Comment: 13 pages, 13 figures, Accepted by IEEE Transactions on Cybernetics
2017. Project URL: https://rmcong.github.io/proj_RGBD_cosal_tcyb.htm
Recurrent Attentional Networks for Saliency Detection
Convolutional-deconvolution networks can be adopted to perform end-to-end
saliency detection. But, they do not work well with objects of multiple scales.
To overcome such a limitation, in this work, we propose a recurrent attentional
convolutional-deconvolution network (RACDNN). Using spatial transformer and
recurrent network units, RACDNN is able to iteratively attend to selected image
sub-regions to perform saliency refinement progressively. Besides tackling the
scale problem, RACDNN can also learn context-aware features from past
iterations to enhance saliency refinement in future iterations. Experiments on
several challenging saliency detection datasets validate the effectiveness of
RACDNN, and show that RACDNN outperforms state-of-the-art saliency detection
methods.Comment: CVPR 201
Saliency Prediction for Mobile User Interfaces
We introduce models for saliency prediction for mobile user interfaces. A
mobile interface may include elements like buttons, text, etc. in addition to
natural images which enable performing a variety of tasks. Saliency in natural
images is a well studied area. However, given the difference in what
constitutes a mobile interface, and the usage context of these devices, we
postulate that saliency prediction for mobile interface images requires a fresh
approach. Mobile interface design involves operating on elements, the building
blocks of the interface. We first collected eye-gaze data from mobile devices
for free viewing task. Using this data, we develop a novel autoencoder based
multi-scale deep learning model that provides saliency prediction at the mobile
interface element level. Compared to saliency prediction approaches developed
for natural images, we show that our approach performs significantly better on
a range of established metrics.Comment: Paper accepted at WACV 201
Backtracking Spatial Pyramid Pooling (SPP)-based Image Classifier for Weakly Supervised Top-down Salient Object Detection
Top-down saliency models produce a probability map that peaks at target
locations specified by a task/goal such as object detection. They are usually
trained in a fully supervised setting involving pixel-level annotations of
objects. We propose a weakly supervised top-down saliency framework using only
binary labels that indicate the presence/absence of an object in an image.
First, the probabilistic contribution of each image region to the confidence of
a CNN-based image classifier is computed through a backtracking strategy to
produce top-down saliency. From a set of saliency maps of an image produced by
fast bottom-up saliency approaches, we select the best saliency map suitable
for the top-down task. The selected bottom-up saliency map is combined with the
top-down saliency map. Features having high combined saliency are used to train
a linear SVM classifier to estimate feature saliency. This is integrated with
combined saliency and further refined through a multi-scale
superpixel-averaging of saliency map. We evaluate the performance of the
proposed weakly supervised topdown saliency and achieve comparable performance
with fully supervised approaches. Experiments are carried out on seven
challenging datasets and quantitative results are compared with 40 closely
related approaches across 4 different applications.Comment: 14 pages, 7 figure
- …