A Reverse Hierarchy Model for Predicting Eye Fixations
A body of psychological and physiological evidence suggests that early
visual attention works in a coarse-to-fine way, which lays a basis for the
reverse hierarchy theory (RHT). This theory states that attention propagates
from the top level of the visual hierarchy that processes gist and abstract
information of input, to the bottom level that processes local details.
Inspired by the theory, we develop a computational model for saliency detection
in images. First, the original image is downsampled to different scales to
constitute a pyramid. Then, saliency on each layer is obtained by image
super-resolution reconstruction from the layer above and is defined as the
unpredictability of this coarse-to-fine reconstruction. Finally, saliency on
each layer of the pyramid is fused into stochastic fixations through a
probabilistic model, where attention initiates from the top layer and
propagates downward through the pyramid. Extensive experiments on two standard
eye-tracking datasets show that the proposed method can achieve competitive
results with state-of-the-art models.
Comment: CVPR 2014, 27th IEEE Conference on Computer Vision and Pattern
Recognition (CVPR). CVPR 201
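The coarse-to-fine idea in this abstract can be illustrated with a minimal sketch. It is not the authors' model: nearest-neighbour upsampling stands in for the super-resolution reconstruction, the probabilistic fusion into fixations is omitted, and the function names (`downsample`, `upsample`, `reconstruction_saliency`, `build_pyramid`) are hypothetical. Saliency at a layer is taken here as the absolute error when that layer is predicted from the coarser layer above it.

```python
def downsample(img):
    """Halve resolution by averaging non-overlapping 2x2 blocks."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)] for r in range(h)]


def upsample(img):
    """Nearest-neighbour 2x upsampling (assumed stand-in for super-resolution)."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def build_pyramid(img, levels):
    """Return [finest, ..., coarsest]; each level halves the previous one."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def reconstruction_saliency(fine):
    """Saliency = how poorly the fine layer is predicted from the coarse one."""
    predicted = upsample(downsample(fine))
    return [[abs(f - p) for f, p in zip(f_row, p_row)]
            for f_row, p_row in zip(fine, predicted)]


# A flat 4x4 image with one bright pixel: only its 2x2 block is unpredictable.
image = [[0.0] * 4 for _ in range(4)]
image[0][0] = 4.0
saliency = reconstruction_saliency(image)
```

In this toy input the bright pixel receives the largest saliency value, while regions that the coarse layer predicts exactly receive zero.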
Improvised Salient Object Detection and Manipulation
In salient subject recognition, computer algorithms have relied heavily on
systematically scanning images from top-left to bottom-right and applying
brute force when attempting to locate objects of interest. Thus, the
process turns out to be quite time-consuming. Here, a novel approach and a
simple solution to the above problem are discussed. In this paper, we implement
an approach to object manipulation and detection through a segmentation map,
which helps to desaturate or, in other words, wash out the background of
the image. Performance is evaluated using the Jaccard
index against the well-known ground-truth target box technique.
Comment: 7 page
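For reference, the Jaccard index between a predicted box and a ground-truth box is the area of their intersection divided by the area of their union. The sketch below assumes axis-aligned boxes given as `(x1, y1, x2, y2)` corner tuples; that representation and the function name `jaccard_index` are illustrative choices, not taken from the paper.

```python
def jaccard_index(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap width and height are clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes sharing a 1x1 corner: intersection 1, union 7.
score = jaccard_index((0, 0, 2, 2), (1, 1, 3, 3))
```

A score of 1 means the boxes coincide and 0 means they do not overlap.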
Backtracking Spatial Pyramid Pooling (SPP)-based Image Classifier for Weakly Supervised Top-down Salient Object Detection
Top-down saliency models produce a probability map that peaks at target
locations specified by a task/goal such as object detection. They are usually
trained in a fully supervised setting involving pixel-level annotations of
objects. We propose a weakly supervised top-down saliency framework using only
binary labels that indicate the presence/absence of an object in an image.
First, the probabilistic contribution of each image region to the confidence of
a CNN-based image classifier is computed through a backtracking strategy to
produce top-down saliency. From a set of saliency maps of an image produced by
fast bottom-up saliency approaches, we select the best saliency map suitable
for the top-down task. The selected bottom-up saliency map is combined with the
top-down saliency map. Features having high combined saliency are used to train
a linear SVM classifier to estimate feature saliency. This is integrated with
combined saliency and further refined through a multi-scale
superpixel-averaging of the saliency map. We evaluate the performance of the
proposed weakly supervised top-down saliency and achieve performance comparable
to fully supervised approaches. Experiments are carried out on seven
challenging datasets and quantitative results are compared with 40 closely
related approaches across 4 different applications.
Comment: 14 pages, 7 figure
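The spatial pyramid pooling (SPP) named in the title can be sketched generically: a feature map of any size is max-pooled over 1x1, 2x2 and 4x4 grids, and the results are concatenated into a fixed-length vector. This is only an illustration of SPP itself, not of the paper's backtracking strategy; the bin-boundary rule (floor for the start, ceiling for the end) and the name `spp_max` are assumptions.

```python
import math


def spp_max(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D map over n x n grids and concatenate the results."""
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Row range of bin i; floor/ceil keeps every bin non-empty.
            r0, r1 = (i * h) // n, math.ceil((i + 1) * h / n)
            for j in range(n):
                c0, c1 = (j * w) // n, math.ceil((j + 1) * w / n)
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


# A 4x4 map holding 0..15 in row-major order, pooled at levels 1 and 2.
fmap = [[4 * r + c for c in range(4)] for r in range(4)]
vector = spp_max(fmap, levels=(1, 2))
```

The output length depends only on the chosen levels (the sum of n squared), which is what lets a classifier accept inputs of varying size.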
Salient Object Detection Combining a Self-attention Module and a Feature Pyramid Network
Salient object detection has achieved great improvement by using the Fully
Convolutional Network (FCN). However, the FCN-based U-shape architecture may
dilute the high-level semantic information during the
up-sampling operations in the top-down pathway. This can weaken the ability
to localize salient objects and produce degraded boundaries. To
overcome this limitation, we propose a novel pyramid self-attention
module (PSAM) and adopt an independent feature-complementing
strategy. In PSAM, self-attention layers are placed after multi-scale pyramid
features to capture richer high-level features and bring larger receptive
fields to the model. In addition, a channel-wise attention module is
employed to reduce the redundant features of the FPN and provide refined
results. Experimental analysis shows that the proposed PSAM effectively
contributes to the whole model so that it outperforms state-of-the-art results
over five challenging datasets. Finally, quantitative results show that PSAM
generates clear and integral saliency maps, which can provide further help to
other computer vision tasks, such as object detection and semantic
segmentation
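As background for the self-attention layers mentioned in this abstract, the sketch below shows plain scaled dot-product self-attention over a list of feature vectors. It is a generic illustration rather than the PSAM design: the learned query, key and value projections are assumed to be the identity, and the function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in tokens]
        weights = softmax(scores)
        # Each output is a weighted average of all tokens.
        out.append([sum(wt * v[i] for wt, v in zip(weights, tokens))
                    for i in range(d)])
    return out


attended = self_attention([[1.0, 0.0], [0.0, 1.0]])
```

Because every output position mixes information from all input positions, a single layer of this kind gives each feature a receptive field covering the whole input, which is the property the abstract appeals to.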