A Review of Co-saliency Detection Technique: Fundamentals, Applications, and Challenges
Co-saliency detection is a newly emerging and rapidly growing research area
in the computer vision community. As a novel branch of visual saliency, co-saliency
detection refers to the discovery of common and salient foregrounds from two or
more relevant images, and can be widely used in many computer vision tasks. The
existing co-saliency detection algorithms mainly consist of three components:
extracting effective features to represent the image regions, exploring the
informative cues or factors to characterize co-saliency, and designing
effective computational frameworks to formulate co-saliency. Although numerous
methods have been developed, the literature is still lacking a deep review and
evaluation of co-saliency detection techniques. In this paper, we aim at
providing a comprehensive review of the fundamentals, challenges, and
applications of co-saliency detection. Specifically, we provide an overview of
some related computer vision works, review the history of co-saliency
detection, summarize and categorize the major algorithms in this research area,
discuss some open issues in this area, present the potential applications of
co-saliency detection, and finally point out some unsolved challenges and
promising future works. We expect this review to be beneficial to both fresh
and senior researchers in this field, and give insights to researchers in other
related areas regarding the utility of co-saliency detection algorithms.
Comment: 28 pages, 12 figures, 3 tables
Hierarchical Saliency Detection on Extended CSSD
Complex structures commonly exist in natural images. When an image contains
small-scale high-contrast patterns either in the background or foreground,
saliency detection could be adversely affected, resulting in erroneous and
non-uniform saliency assignment. The issue forms a fundamental challenge for
prior methods. We tackle it from a scale point of view and propose a
multi-layer approach to analyze saliency cues. Different from varying patch
sizes or downsizing images, we measure region-based scales. The final saliency
values are inferred by optimally combining all the saliency cues in different
scales using hierarchical inference. Through our inference model, single-scale
information is selected to obtain a saliency map. Our method improves detection
quality on many images that cannot be handled well traditionally. We also
construct an extended Complex Scene Saliency Dataset (ECSSD) to include complex
but general natural images.
Comment: 14 pages, 15 figures
Region-Based Multiscale Spatiotemporal Saliency for Video
Detecting salient objects from a video requires exploiting both spatial and
temporal knowledge included in the video. We propose a novel region-based
multiscale spatiotemporal saliency detection method for videos, where static
features and dynamic features computed from the low and middle levels are
combined together. Our method utilizes such combined features spatially over
each frame and, at the same time, temporally across frames using consistency
between consecutive frames. Saliency cues in our method are analyzed through a
multiscale segmentation model and fused across scale levels, so that regions
are explored efficiently. An adaptive temporal window using motion
information is also developed to combine saliency values of consecutive frames
in order to keep temporal consistency across frames. Performance evaluation on
several popular benchmark datasets validates that our method outperforms
existing state-of-the-art methods.
Review of Visual Saliency Detection with Comprehensive Information
Visual saliency detection models simulate how the human visual system perceives
a scene, and have been widely used in many vision tasks. With the development of
acquisition technology, more comprehensive information, such as depth cues,
inter-image correspondence, or temporal relationships, is available to extend
image saliency detection to RGBD saliency detection, co-saliency detection, or
video saliency detection. RGBD saliency detection models focus on extracting
the salient regions from RGBD images by incorporating depth information.
Co-saliency detection models introduce the inter-image correspondence
constraint to discover the common salient object in an image group. The goal of
video saliency detection models is to locate the motion-related salient object
in video sequences, considering the motion cue and spatiotemporal
constraint jointly. In this paper, we review different types of saliency
detection algorithms, summarize the important issues of the existing methods,
and discuss the existing problems and future work. Moreover, the evaluation
datasets and quantitative measurements are briefly introduced, and the
experimental analysis and discussion are conducted to provide a holistic
overview of different saliency detection methods.
Comment: 18 pages, 11 figures, 7 tables, Accepted by IEEE Transactions on
Circuits and Systems for Video Technology 2018, https://rmcong.github.io
Co-salient Object Detection Based on Deep Saliency Networks and Seed Propagation over an Integrated Graph
This paper presents a co-salient object detection method to find common
salient regions in a set of images. We utilize deep saliency networks to
transfer co-saliency prior knowledge and better capture high-level semantic
information, and the resulting initial co-saliency maps are enhanced by seed
propagation steps over an integrated graph. The deep saliency networks are
trained in a supervised manner to avoid online weakly supervised learning, and
we exploit them not only to extract high-level features but also to produce both
intra- and inter-image saliency maps. Through a refinement step, the initial
co-saliency maps can uniformly highlight co-salient regions and locate accurate
object boundaries. To handle input image groups inconsistent in size, we
propose to pool multi-regional descriptors including both within-segment and
within-group information. In addition, the integrated multilayer graph is
constructed to find the regions that the previous steps may not detect by seed
propagation with low-level descriptors. In this work, we combine the
complementary strengths of high- and low-level information with several
learning-based steps. Our experiments have demonstrated that the proposed
approach outperforms comparable co-saliency detection methods on widely used
public databases and can also be directly applied to co-segmentation tasks.
Comment: 13 pages, 10 figures, 3 tables
MSDNN: Multi-Scale Deep Neural Network for Salient Object Detection
Salient object detection is a fundamental problem and has received a
great deal of attention in computer vision. Recently, deep learning models
have become a powerful tool for image feature extraction. In this paper, we propose
a multi-scale deep neural network (MSDNN) for salient object detection. The
proposed model first extracts global high-level features and context
information over the whole source image with recurrent convolutional neural
network (RCNN). Then several stacked deconvolutional layers are adopted to get
the multi-scale feature representation and obtain a series of saliency maps.
Finally, we investigate a fusion convolution module (FCM) to build a final
pixel level saliency map. The proposed model is extensively evaluated on four
salient object detection benchmark datasets. Results show that our deep model
significantly outperforms 12 other state-of-the-art approaches.
Comment: 10 pages, 12 figures
Spatiotemporal Knowledge Distillation for Efficient Estimation of Aerial Video Saliency
The performance of video saliency estimation techniques has achieved
significant advances along with the rapid development of Convolutional Neural
Networks (CNNs). However, devices like cameras and drones may have limited
computational capability and storage space so that the direct deployment of
complex deep saliency models becomes infeasible. To address this problem, this
paper proposes a dynamic saliency estimation approach for aerial videos via
spatiotemporal knowledge distillation. In this approach, five components are
involved, including two teachers, two students and the desired spatiotemporal
model. The knowledge of spatial and temporal saliency is first separately
transferred from the two complex and redundant teachers to their simple and
compact students, and the input scenes are also degraded from high-resolution
to low-resolution to remove the probable data redundancy so as to greatly speed
up the feature extraction process. After that, the desired spatiotemporal model
is further trained by distilling and encoding the spatial and temporal saliency
knowledge of two students into a unified network. In this manner, the
inter-model redundancy can be further removed for the effective estimation of
dynamic saliency on aerial videos. Experimental results show that the proposed
approach outperforms ten state-of-the-art models in estimating visual saliency
on aerial videos, while its speed reaches up to 28,738 FPS on the GPU platform.
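The abstract above does not state the training objective. As a minimal sketch of how a compact student could be trained to imitate a larger teacher, the following assumes a loss that blends the student's distance to the teacher's saliency map with its distance to the annotated map. The function names, the mean-squared-error choice, and the `alpha` weight are all illustrative assumptions, not details taken from the paper.

```python
def mse(a, b):
    """Mean squared error between two equally sized flat lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(student, teacher, target, alpha=0.5):
    """Blend imitation of the teacher with fidelity to the ground truth.

    student, teacher, target : flattened saliency maps with values in [0, 1]
    alpha : assumed mixing weight; 1.0 trains purely on the teacher's output,
            0.0 purely on the annotated saliency map
    """
    return alpha * mse(student, teacher) + (1 - alpha) * mse(student, target)


# Toy 2-pixel maps: the student matches the annotation but not the teacher.
loss = distillation_loss([0.5, 0.5], [1.0, 0.0], [0.5, 0.5], alpha=0.5)
```

In this toy case only the teacher term contributes to the loss, which shows how `alpha` controls how strongly the student is pulled toward the teacher's prediction.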
Salient Object Detection: A Distinctive Feature Integration Model
We propose a novel method for salient object detection in different images.
Our method integrates spatial features for efficient and robust representation
to capture meaningful information about the salient objects. We then train a
conditional random field (CRF) using the integrated features. The trained CRF
model is then used to detect salient objects during the online testing stage.
We perform experiments on two standard datasets and compare the performance of
our method with different reference methods. Our experiments show that our
method outperforms the compared methods in terms of precision, recall, and
F-Measure.
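For readers unfamiliar with the three metrics named above, here is a small sketch of how they can be computed for one saliency map against a binary ground-truth mask. The fixed threshold of 0.5 and the weighting `beta_sq = 0.3` (a value often used in the salient object detection literature) are assumptions; the abstract does not say which settings the authors used.

```python
def f_measure(pred, truth, threshold=0.5, beta_sq=0.3):
    """Return (precision, recall, F) for a saliency map against a binary mask.

    pred  : 2-D list of saliency values in [0, 1]
    truth : 2-D list of 0/1 ground-truth labels, same shape as pred
    threshold, beta_sq : assumed settings, not taken from the paper
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            detected = p >= threshold
            if detected and t:
                tp += 1          # salient pixel correctly detected
            elif detected:
                fp += 1          # background pixel marked as salient
            elif t:
                fn += 1          # salient pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta_sq * precision + recall
    f = (1 + beta_sq) * precision * recall / denom if denom else 0.0
    return precision, recall, f


precision, recall, f = f_measure([[0.9, 0.8], [0.2, 0.1]], [[1, 0], [1, 0]])
```

With `beta_sq` below 1 the score weights precision more heavily than recall.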
Hierarchical Cellular Automata for Visual Saliency
Saliency detection, finding the most important parts of an image, has become
increasingly popular in computer vision. In this paper, we introduce
Hierarchical Cellular Automata (HCA) -- a temporally evolving model to
intelligently detect salient objects. HCA consists of two main components:
Single-layer Cellular Automata (SCA) and Cuboid Cellular Automata (CCA). As an
unsupervised propagation mechanism, Single-layer Cellular Automata can exploit
the intrinsic relevance of similar regions through interactions with neighbors.
Low-level image features as well as high-level semantic information extracted
from deep neural networks are incorporated into the SCA to measure the
correlation between different image patches. With these hierarchical deep
features, an impact factor matrix and a coherence matrix are constructed to
balance the influences on each cell's next state. The saliency values of all
cells are iteratively updated according to a well-defined update rule.
Furthermore, we propose CCA to integrate multiple saliency maps generated by
SCA at different scales in a Bayesian framework. Therefore, single-layer
propagation and multi-layer integration are jointly modeled in our unified HCA.
Surprisingly, we find that the SCA can improve all existing methods that we
applied it to, resulting in a similar precision level regardless of the
original results. The CCA can act as an efficient pixel-wise aggregation
algorithm that can integrate state-of-the-art methods, resulting in even better
results. Extensive experiments on four challenging datasets demonstrate that
the proposed algorithm outperforms state-of-the-art conventional methods and is
competitive with deep learning based approaches.
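The abstract mentions an impact factor matrix, a coherence matrix, and an iterative update rule without giving the rule itself. The sketch below assumes one plausible form: each cell's next state is a convex combination of its own current saliency and an impact-weighted vote from the other cells, with the coherence value deciding the balance. The function names, the matrix layout, and the fixed number of steps are illustrative assumptions.

```python
def sca_step(saliency, impact, coherence):
    """One synchronous update of all cells (assumed update form).

    saliency  : list of n saliency values in [0, 1]
    impact    : n x n row-normalised matrix; impact[i][j] is the influence
                of cell j on cell i (diagonal entries are 0)
    coherence : list of n values in [0, 1]; a high value means the cell
                mostly keeps its own current state
    """
    n = len(saliency)
    updated = []
    for i in range(n):
        neighbour_vote = sum(impact[i][j] * saliency[j] for j in range(n))
        updated.append(coherence[i] * saliency[i]
                       + (1 - coherence[i]) * neighbour_vote)
    return updated


def propagate(saliency, impact, coherence, steps=20):
    """Apply the update repeatedly; the step count is an assumption."""
    for _ in range(steps):
        saliency = sca_step(saliency, impact, coherence)
    return saliency


# Two mutually similar cells pull each other toward a common value.
result = propagate([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0]], [0.5, 0.5])
```

Because every new value is computed from the previous state of all cells, the update is synchronous, which matches the usual cellular automaton convention.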
Salient Object Detection: A Discriminative Regional Feature Integration Approach
Salient object detection has been attracting a lot of interest, and recently
various heuristic computational models have been designed. In this paper, we
formulate saliency map computation as a regression problem. Our method, which
is based on multi-level image segmentation, utilizes the supervised learning
approach to map the regional feature vector to a saliency score. Saliency
scores across multiple levels are finally fused to produce the saliency map.
The contributions are two-fold. One is that we propose a discriminative
regional feature integration approach for salient object detection. Compared
with existing heuristic models, our proposed method is able to automatically
integrate high-dimensional regional saliency features and choose discriminative
ones. The other is that by investigating standard generic region properties as
well as two widely studied concepts for salient object detection, i.e.,
regional contrast and backgroundness, our approach significantly outperforms
state-of-the-art methods on six benchmark datasets. Meanwhile, we demonstrate
that our method runs as fast as most existing algorithms.
- …