SE2Net: Siamese Edge-Enhancement Network for Salient Object Detection
Deep convolutional neural networks have significantly boosted the capability
of salient object detection in handling large variations of scenes and object
appearances. However, convolution operations seek to generate strong responses
on individual pixels while lacking the ability to maintain the spatial structure
of objects. Moreover, the down-sampling operations, such as pooling and
striding, lose spatial details of the salient objects. In this paper, we
propose a simple yet effective Siamese Edge-Enhancement Network (SE2Net) to
preserve the edge structure for salient object detection. Specifically, a novel
multi-stage siamese network is built to aggregate low-level and high-level
features and to estimate the salient maps of edges and regions in parallel. As a
result, the predicted regions become more accurate by enhancing the responses
at edges, and the predicted edges become more semantic by suppressing false
positives in the background. After the refined salient maps of edges and regions
are produced by the SE2Net, an edge-guided inference algorithm is designed to
further improve the resulting salient masks along the predicted edges.
Extensive experiments conducted on several benchmark datasets show that our
method is superior to the state-of-the-art approaches.
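The edge-guided refinement idea above can be illustrated with a minimal numpy sketch: region responses are boosted where the predicted edge map is confident. The function name `refine_with_edges` and the `strength` parameter are illustrative assumptions, not the paper's actual inference algorithm.

```python
import numpy as np

def refine_with_edges(region_map, edge_map, strength=0.5):
    """Boost region-map responses where the edge map is confident.
    (Hypothetical helper; the paper's edge-guided inference is more involved.)

    region_map, edge_map: 2-D arrays of probabilities in [0, 1].
    Returns a refined map, clipped back to [0, 1].
    """
    refined = region_map + strength * edge_map * region_map
    return np.clip(refined, 0.0, 1.0)

# Toy 3x3 example: a weak region response sitting on a confident edge pixel
region = np.array([[0.2, 0.6, 0.2],
                   [0.6, 0.9, 0.6],
                   [0.2, 0.6, 0.2]])
edges = np.zeros((3, 3))
edges[0, 1] = 1.0          # one confident edge pixel
out = refine_with_edges(region, edges)
```

Only the response under the edge pixel is amplified; interior pixels with no edge evidence are left unchanged.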
Edge-guided Non-local Fully Convolutional Network for Salient Object Detection
Fully Convolutional Neural Network (FCN) has been widely applied to salient
object detection recently by virtue of high-level semantic feature extraction,
but existing FCN-based methods still suffer from continuous striding and
pooling operations, which lead to loss of spatial structure and blurred edges. To
maintain the clear edge structure of salient objects, we propose a novel
Edge-guided Non-local FCN (ENFNet) to perform edge guided feature learning for
accurate salient object detection. Specifically, we extract hierarchical
global and local information in FCN to incorporate non-local features for
effective feature representations. To preserve good boundaries of salient
objects, we propose a guidance block to embed edge prior knowledge into
hierarchical feature maps. The guidance block not only performs feature-wise
manipulation but also spatial-wise transformation for effective edge
embeddings. Our model is trained on the MSRA-B dataset and tested on five
popular benchmark datasets. Compared with the state-of-the-art methods, the
proposed method achieves the best performance on all datasets.
Comment: 10 pages, 6 figures
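The guidance block's "feature-wise manipulation plus spatial-wise transformation" can be sketched as a FiLM-style modulation: a per-channel scale applied to the features plus an edge-conditioned spatial shift. This is a minimal analogue under assumed shapes; the names `guidance_block`, `gamma`, and `beta` are illustrative, and the paper's block learns these transformations rather than taking them as constants.

```python
import numpy as np

def guidance_block(features, edge_prior, gamma, beta):
    """FiLM-style modulation sketch: per-channel scaling (feature-wise
    manipulation) plus an edge-conditioned spatial shift (spatial-wise
    transformation).

    features:   (C, H, W) feature map
    edge_prior: (H, W) edge probability map in [0, 1]
    gamma:      (C,) per-channel scales (learned in the paper; constants here)
    beta:       scalar weight for the spatial shift
    """
    scaled = features * gamma[:, None, None]           # feature-wise manipulation
    shifted = scaled + beta * edge_prior[None, :, :]   # spatial-wise transformation
    return shifted

feats = np.ones((2, 2, 2))                 # 2 channels, 2x2 spatial grid
edge = np.array([[1.0, 0.0],
                 [0.0, 0.0]])              # edge prior fires on one pixel
out = guidance_block(feats, edge, gamma=np.array([2.0, 0.5]), beta=1.0)
```

Each channel is rescaled independently, and only pixels with edge evidence receive the additive shift.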
OGNet: Salient Object Detection with Output-guided Attention Module
Attention mechanisms are widely used in salient object detection models based
on deep learning, which can effectively promote the extraction and utilization
of useful information by neural networks. However, most of the existing
attention modules used in salient object detection are input with the processed
feature map itself, which easily leads to the problem of 'blind
overconfidence'. In this paper, instead of applying the widely used
self-attention module, we present an output-guided attention module built with
multi-scale outputs to overcome this problem. We also
construct a new loss function, the intractable area F-measure loss function,
which is based on the F-measure of the hard-to-handle area to improve the
detection effect of the model in the edge areas and confusing areas of an
image. Extensive experiments and abundant ablation studies are conducted to
evaluate the effect of our methods and to explore the most suitable structure
for the model. Tests on several datasets show that our model performs very
well, even though it is very lightweight.
Comment: Submitted to IEEE Transactions on Circuits and Systems for Video
Technology
Deep Edge-Aware Saliency Detection
There has been profound progress in visual saliency thanks to deep learning
architectures; however, three major challenges still hinder detection
performance for scenes with complex compositions, multiple salient objects, and
salient objects of diverse scales. In particular, the output maps of existing
methods remain low in spatial resolution, causing blurred edges due to stride
and pooling operations; networks often neglect descriptive statistical and
handcrafted priors that could complement saliency detection results; and deep
features at different layers remain largely unexploited, waiting to be
effectively fused to handle multi-scale salient objects. In this paper, we
tackle these issues with a new fully
convolutional neural network that jointly learns salient edges and saliency
labels in an end-to-end fashion. Our framework first employs convolutional
layers that reformulate the detection task as a dense labeling problem, then
integrates handcrafted saliency features in a hierarchical manner into lower
and higher levels of the deep network to leverage available information for
multi-scale response, and finally refines the saliency map through dilated
convolutions by imposing context. In this way, the salient edge priors are
efficiently incorporated and the output resolution is significantly improved
while keeping the memory requirements low, leading to cleaner and sharper
object boundaries. Extensive experimental analyses on ten benchmarks
demonstrate that our framework achieves consistently superior performance and
attains robustness for complex scenes in comparison to the very recent
state-of-the-art approaches.
Comment: 13 pages, 11 figures
Salient Object Detection in the Deep Learning Era: An In-Depth Survey
As an essential problem in computer vision, salient object detection (SOD)
has attracted an increasing amount of research attention over the years. Recent
advances in SOD are predominantly led by deep learning-based solutions (named
deep SOD). To enable in-depth understanding of deep SOD, in this paper, we
provide a comprehensive survey covering various aspects, ranging from algorithm
taxonomy to unsolved issues. In particular, we first review deep SOD algorithms
from different perspectives, including network architecture, level of
supervision, learning paradigm, and object-/instance-level detection. Following
that, we summarize and analyze existing SOD datasets and evaluation metrics.
Then, we benchmark a large group of representative SOD models, and provide
detailed analyses of the comparison results. Moreover, we study the performance
of SOD algorithms under different attribute settings, which has not been
thoroughly explored previously, by constructing a novel SOD dataset with rich
attribute annotations covering various salient object types, challenging
factors, and scene categories. We further analyze, for the first time in the
field, the robustness of SOD models to random input perturbations and
adversarial attacks. We also look into the generalization and difficulty of
existing SOD datasets. Finally, we discuss several open issues of SOD and
outline future research directions.
Comment: Published in IEEE TPAMI. All the saliency prediction maps, our
constructed dataset with annotations, and codes for evaluation are publicly
available at https://github.com/wenguanwang/SODsurvey
A novel graph structure for salient object detection based on divergence background and compact foreground
In this paper, we propose an efficient and discriminative model for salient
object detection. Our method is carried out in a stepwise mechanism based on
both divergence background and compact foreground cues. In order to effectively
enhance the distinction between nodes along object boundaries and the
similarity among object regions, a graph is constructed by introducing the
concept of a virtual node. To remove incorrect outputs, we introduce a scheme
for selecting background seeds and a method for generating compact foreground
regions. Different from prior methods, we calculate the
saliency value of each node based on the relationship between the corresponding
node and the virtual node. In order to achieve significant performance
improvement consistently, we propose an Extended Manifold Ranking (EMR)
algorithm, which subtly combines suppressed / active nodes and mid-level
information. Extensive experimental results demonstrate that the proposed
algorithm performs favorably against the state-of-the-art saliency detection
methods in terms of different evaluation metrics on several benchmark datasets.
Comment: 22 pages, 16 figures, 2 tables
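The Extended Manifold Ranking above builds on the standard graph-based manifold ranking used in saliency detection, whose closed form is f* = (D − αW)⁻¹ y for an affinity matrix W with degree matrix D and query vector y. The sketch below shows only this base formulation on a toy graph; the paper's extension (virtual node, suppressed/active nodes, mid-level cues) is not reproduced here.

```python
import numpy as np

def manifold_ranking(W, y, alpha=0.99):
    """Closed-form manifold ranking on a graph: f* = (D - alpha*W)^{-1} y,
    where W is a symmetric affinity matrix and D its degree matrix.
    This is the base formulation the paper's EMR algorithm extends."""
    D = np.diag(W.sum(axis=1))
    return np.linalg.solve(D - alpha * W, y)

# Tiny 3-node chain graph (0 - 1 - 2); query the first node
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
y = np.array([1.0, 0.0, 0.0])
scores = manifold_ranking(W, y)
```

Ranking scores decay with graph distance from the query node, which is what lets seed nodes (background or foreground) propagate saliency over the graph.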
Selectivity or Invariance: Boundary-aware Salient Object Detection
Typically, a salient object detection (SOD) model faces opposite requirements
in processing object interiors and boundaries. The features of interiors should
be invariant to strong appearance change so as to pop-out the salient object as
a whole, while the features of boundaries should be selective to slight
appearance change to distinguish salient objects and background. To address
this selectivity-invariance dilemma, we propose a novel boundary-aware network
with successive dilation for image-based SOD. In this network, the feature
selectivity at boundaries is enhanced by incorporating a boundary localization
stream, while the feature invariance at interiors is guaranteed with a complex
interior perception stream. Moreover, a transition compensation stream is
adopted to amend the probable failures in transitional regions between
interiors and boundaries. In particular, an integrated successive dilation
module is proposed to enhance the feature invariance at interiors and
transitional regions. Extensive experiments on six datasets show that the
proposed approach outperforms 16 state-of-the-art methods.
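A successive/parallel dilation module of the kind named above can be sketched in one dimension: several branches convolve the input with the same kernel at increasing dilation rates, and their outputs are fused so each position sees progressively larger context. The fusion by simple averaging and the 3-tap kernel are assumptions; the paper's integrated module is a learned 2-D design.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    """Same-length 1-D convolution with a 3-tap kernel at the given
    dilation rate, using zero padding at the borders."""
    pad = dilation
    xp = np.pad(x, pad)
    out = np.zeros(len(x))
    for i in range(len(x)):
        c = i + pad                       # centre index in padded signal
        out[i] = (kernel[0] * xp[c - dilation]
                  + kernel[1] * xp[c]
                  + kernel[2] * xp[c + dilation])
    return out

def successive_dilation(x, kernel, rates=(1, 2, 4)):
    """Fuse parallel branches with increasing dilation rates by averaging
    (the paper's module fuses them with learned weights)."""
    branches = [dilated_conv1d(x, kernel, d) for d in rates]
    return np.mean(branches, axis=0)

x = np.zeros(9)
x[4] = 1.0                                # unit impulse
k = np.array([1.0, 1.0, 1.0])
y = successive_dilation(x, k)
```

The impulse response shows the enlarged receptive field: positions at distances 1, 2, and 4 from the centre all receive context, without any down-sampling.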
Review of Visual Saliency Detection with Comprehensive Information
Visual saliency detection model simulates the human visual system to perceive
the scene, and has been widely used in many vision tasks. With the development
of acquisition technology, more comprehensive information, such as depth cues,
inter-image correspondence, or temporal relationships, has become available to extend
image saliency detection to RGBD saliency detection, co-saliency detection, or
video saliency detection. RGBD saliency detection model focuses on extracting
the salient regions from RGBD images by combining the depth information.
Co-saliency detection model introduces the inter-image correspondence
constraint to discover the common salient object in an image group. The goal of
video saliency detection model is to locate the motion-related salient object
in video sequences, which considers the motion cue and spatiotemporal
constraint jointly. In this paper, we review different types of saliency
detection algorithms, summarize the important issues of the existing methods,
and discuss existing problems and future work. Moreover, the evaluation
datasets and quantitative measurements are briefly introduced, and experimental
analysis and discussion are conducted to provide a holistic
overview of different saliency detection methods.
Comment: 18 pages, 11 figures, 7 tables. Accepted by IEEE Transactions on
Circuits and Systems for Video Technology 2018, https://rmcong.github.io
Saliency-Guided Perceptual Grouping Using Motion Cues in Region-Based Artificial Visual Attention
Region-based artificial attention constitutes a framework for bio-inspired
attentional processes on an intermediate abstraction level for use in
computer vision and mobile robotics. Segmentation algorithms produce regions of
coherently colored pixels. These serve as proto-objects on which the
attentional processes determine image portions of relevance. A single
region, which does not necessarily represent a full object, constitutes the
focus of attention. For many post-attentional tasks, however, such as identifying or
tracking objects, single segments are not sufficient. Here, we present a
saliency-guided approach that groups regions that potentially belong to the
same object based on proximity and similarity of motion. We compare our results
to object selection by thresholding saliency maps and to a further
attention-guided strategy.
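Grouping proto-object regions by proximity and motion similarity can be sketched as a transitive merge: two regions join the same group when their centroids are close and their motion vectors agree. The union-find implementation, thresholds, and function name `group_regions` are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def group_regions(centers, motions, max_dist=50.0, max_motion_diff=2.0):
    """Union-find grouping sketch: merge two proto-object regions when
    their centroids are close AND their motion vectors are similar.

    centers, motions: (N, 2) arrays of region centroids / motion vectors.
    Returns one group label per region (transitively closed)."""
    n = len(centers)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            near = np.linalg.norm(centers[i] - centers[j]) <= max_dist
            similar = np.linalg.norm(motions[i] - motions[j]) <= max_motion_diff
            if near and similar:
                parent[find(i)] = find(j)

    return [find(i) for i in range(n)]

# Two nearby regions moving alike, plus one distant region
centers = np.array([[0.0, 0.0], [10.0, 0.0], [200.0, 0.0]])
motions = np.array([[1.0, 0.0], [1.5, 0.0], [1.0, 0.0]])
labels = group_regions(centers, motions)
```

Transitive merging is what lets several coherently moving segments form one object hypothesis even when no single pair spans the whole object.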
SAC-Net: Spatial Attenuation Context for Salient Object Detection
This paper presents a new deep neural network design for salient object
detection by maximizing the integration of local and global image context
within, around, and beyond the salient objects. Our key idea is to adaptively
propagate and aggregate the image context features with variable attenuation
over the entire feature maps. To achieve this, we design the spatial
attenuation context (SAC) module to recurrently translate and aggregate the
context features independently with different attenuation factors and then to
attentively learn the weights to adaptively integrate the aggregated context
features. By further embedding the module to process individual layers in a
deep network, namely SAC-Net, we can train the network end-to-end and optimize
the context features for detecting salient objects. Compared with 29
state-of-the-art methods, experimental results show that our method performs
favorably over all the others on six common benchmark datasets, both
quantitatively and visually.
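The "propagate with variable attenuation" idea can be sketched with a one-dimensional recurrent scan, h[i] = x[i] + a·h[i−1], run for several attenuation factors a and fused. Averaging the branches and the function names are assumptions; the SAC module runs such scans over 2-D feature maps in several directions and learns attention weights for the fusion.

```python
import numpy as np

def attenuated_scan(x, a):
    """Left-to-right recurrent propagation with attenuation factor a:
    h[i] = x[i] + a * h[i-1]. Context decays geometrically with distance."""
    h = np.zeros(len(x))
    acc = 0.0
    for i, v in enumerate(x):
        acc = v + a * acc
        h[i] = acc
    return h

def sac_aggregate(x, factors=(0.9, 0.5, 0.1)):
    """Aggregate context propagated with several attenuation factors.
    The paper learns attention weights for this fusion; we average."""
    return np.mean([attenuated_scan(x, a) for a in factors], axis=0)

x = np.zeros(5)
x[0] = 1.0                  # single activation at the left edge
y = sac_aggregate(x)
```

A large factor carries context far across the map while a small one keeps it local, so fusing several factors gives each position a mixture of short- and long-range context.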