816 research outputs found
Automatic Salient Object Detection for Panoramic Images Using Region Growing and Fixation Prediction Model
Almost all previous works on saliency detection have been dedicated to
conventional images, however, with the outbreak of panoramic images due to the
rapid development of VR or AR technology, it is becoming more challenging,
meanwhile valuable for extracting salient contents in panoramic images.
In this paper, we propose a novel bottom-up salient object detection
framework for panoramic images. First, we employ a spatial density estimation
method to roughly extract object proposal regions, with the help of region
growing algorithm. Meanwhile, an eye fixation model is utilized to predict
visually attractive parts in the image from the perspective of the human visual
search mechanism. Then, the previous results are combined by the maxima
normalization to get the coarse saliency map. Finally, a refinement step based
on geodesic distance is utilized for post-processing to derive the final
saliency map.
To fairly evaluate the performance of the proposed approach, we propose a
high-quality dataset of panoramic images (SalPan). Extensive evaluations
demonstrate the effectiveness of our proposed method on panoramic images and
the superiority of the proposed method against other methods.Comment: Previous Project website: https://github.com/ChunbiaoZhu/DCC-201
Co-salient Object Detection Based on Deep Saliency Networks and Seed Propagation over an Integrated Graph
This paper presents a co-salient object detection method to find common
salient regions in a set of images. We utilize deep saliency networks to
transfer co-saliency prior knowledge and better capture high-level semantic
information, and the resulting initial co-saliency maps are enhanced by seed
propagation steps over an integrated graph. The deep saliency networks are
trained in a supervised manner to avoid online weakly supervised learning and
exploit them not only to extract high-level features but also to produce both
intra- and inter-image saliency maps. Through a refinement step, the initial
co-saliency maps can uniformly highlight co-salient regions and locate accurate
object boundaries. To handle input image groups inconsistent in size, we
propose to pool multi-regional descriptors including both within-segment and
within-group information. In addition, the integrated multilayer graph is
constructed to find the regions that the previous steps may not detect by seed
propagation with low-level descriptors. In this work, we utilize the useful
complementary components of high-, low-level information, and several
learning-based steps. Our experiments have demonstrated that the proposed
approach outperforms comparable co-saliency detection methods on widely used
public databases and can also be directly applied to co-segmentation tasks.Comment: 13 pages, 10 figures, 3 table
HSCS: Hierarchical Sparsity Based Co-saliency Detection for RGBD Images
Co-saliency detection aims to discover common and salient objects in an image
group containing more than two relevant images. Moreover, depth information has
been demonstrated to be effective for many computer vision tasks. In this
paper, we propose a novel co-saliency detection method for RGBD images based on
hierarchical sparsity reconstruction and energy function refinement. With the
assistance of the intra saliency map, the inter-image correspondence is
formulated as a hierarchical sparsity reconstruction framework. The global
sparsity reconstruction model with a ranking scheme focuses on capturing the
global characteristics among the whole image group through a common foreground
dictionary. The pairwise sparsity reconstruction model aims to explore the
corresponding relationship between pairwise images through a set of pairwise
dictionaries. In order to improve the intra-image smoothness and inter-image
consistency, an energy function refinement model is proposed, which includes
the unary data term, spatial smooth term, and holistic consistency term.
Experiments on two RGBD co-saliency detection benchmarks demonstrate that the
proposed method outperforms the state-of-the-art algorithms both qualitatively
and quantitatively.Comment: 11 pages, 5 figures, Accepted by IEEE Transactions on Multimedia,
https://rmcong.github.io
Multi-interactive Dual-decoder for RGB-thermal Salient Object Detection
RGB-thermal salient object detection (SOD) aims to segment the common
prominent regions of visible image and corresponding thermal infrared image
that we call it RGBT SOD. Existing methods don't fully explore and exploit the
potentials of complementarity of different modalities and multi-type cues of
image contents, which play a vital role in achieving accurate results. In this
paper, we propose a multi-interactive dual-decoder to mine and model the
multi-type interactions for accurate RGBT SOD. In specific, we first encode two
modalities into multi-level multi-modal feature representations. Then, we
design a novel dual-decoder to conduct the interactions of multi-level
features, two modalities and global contexts. With these interactions, our
method works well in diversely challenging scenarios even in the presence of
invalid modality. Finally, we carry out extensive experiments on public RGBT
and RGBD SOD datasets, and the results show that the proposed method achieves
the outstanding performance against state-of-the-art algorithms. The source
code has been released
at:https://github.com/lz118/Multi-interactive-Dual-decoder.Comment: Accepted by IEEE TI
DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection
A key problem in salient object detection is how to effectively model the
semantic properties of salient objects in a data-driven manner. In this paper,
we propose a multi-task deep saliency model based on a fully convolutional
neural network (FCNN) with global input (whole raw images) and global output
(whole saliency maps). In principle, the proposed saliency model takes a
data-driven strategy for encoding the underlying saliency prior information,
and then sets up a multi-task learning scheme for exploring the intrinsic
correlations between saliency detection and semantic image segmentation.
Through collaborative feature learning from such two correlated tasks, the
shared fully convolutional layers produce effective features for object
perception. Moreover, it is capable of capturing the semantic information on
salient objects across different levels using the fully convolutional layers,
which investigate the feature-sharing properties of salient object detection
with great feature redundancy reduction. Finally, we present a graph Laplacian
regularized nonlinear regression model for saliency refinement. Experimental
results demonstrate the effectiveness of our approach in comparison with the
state-of-the-art approaches.Comment: To appear in IEEE Transactions on Image Processing (TIP), Project
Website: http://www.zhaoliming.net/research/deepsalienc
Review of Visual Saliency Detection with Comprehensive Information
Visual saliency detection model simulates the human visual system to perceive
the scene, and has been widely used in many vision tasks. With the acquisition
technology development, more comprehensive information, such as depth cue,
inter-image correspondence, or temporal relationship, is available to extend
image saliency detection to RGBD saliency detection, co-saliency detection, or
video saliency detection. RGBD saliency detection model focuses on extracting
the salient regions from RGBD images by combining the depth information.
Co-saliency detection model introduces the inter-image correspondence
constraint to discover the common salient object in an image group. The goal of
video saliency detection model is to locate the motion-related salient object
in video sequences, which considers the motion cue and spatiotemporal
constraint jointly. In this paper, we review different types of saliency
detection algorithms, summarize the important issues of the existing methods,
and discuss the existent problems and future works. Moreover, the evaluation
datasets and quantitative measurements are briefly introduced, and the
experimental analysis and discission are conducted to provide a holistic
overview of different saliency detection methods.Comment: 18 pages, 11 figures, 7 tables, Accepted by IEEE Transactions on
Circuits and Systems for Video Technology 2018, https://rmcong.github.io
Coarse-to-Fine Salient Object Detection with Low-Rank Matrix Recovery
Low-Rank Matrix Recovery (LRMR) has recently been applied to saliency
detection by decomposing image features into a low-rank component associated
with background and a sparse component associated with visual salient regions.
Despite its great potential, existing LRMR-based saliency detection methods
seldom consider the inter-relationship among elements within these two
components, thus are prone to generating scattered or incomplete saliency maps.
In this paper, we introduce a novel and efficient LRMR-based saliency detection
model under a coarse-to-fine framework to circumvent this limitation. First, we
roughly measure the saliency of image regions with a baseline LRMR model that
integrates a -norm sparsity constraint and a Laplacian regularization
smooth term. Given samples from the coarse saliency map, we then learn a
projection that maps image features to refined saliency values, to
significantly sharpen the object boundaries and to preserve the object
entirety. We evaluate our framework against existing LRMR-based methods on
three benchmark datasets. Experimental results validate the superiority of our
method as well as the effectiveness of our suggested coarse-to-fine framework,
especially for images containing multiple objects.Comment: Manuscript accepted by Neurocomputing, matlab code is available from
https://github.com/qizhust/HLRSalienc
Relative Saliency and Ranking: Models, Metrics, Data, and Benchmarks
Salient object detection is a problem that has been considered in detail and
\textcolor{black}{many solutions have been proposed}. In this paper, we argue
that work to date has addressed a problem that is relatively ill-posed.
Specifically, there is not universal agreement about what constitutes a salient
object when multiple observers are queried. This implies that some objects are
more likely to be judged salient than others, and implies a relative rank
exists on salient objects. Initially, we present a novel deep learning solution
based on a hierarchical representation of relative saliency and stage-wise
refinement. Further to this, we present data, analysis and baseline benchmark
results towards addressing the problem of salient object ranking. Methods for
deriving suitable ranked salient object instances are presented, along with
metrics suitable to measuring algorithm performance. In addition, we show how a
derived dataset can be successively refined to provide cleaned results that
correlate well with pristine ground truth in its characteristics and value for
training and testing models. Finally, we provide a comparison among prevailing
algorithms that address salient object ranking or detection to establish
initial baselines providing a basis for comparison with future efforts
addressing this problem. \textcolor{black}{The source code and data are
publicly available via our project page:}
\textrm{\href{https://ryersonvisionlab.github.io/cocosalrank.html}{ryersonvisionlab.github.io/cocosalrank}}Comment: Accepted to Transaction on Pattern Analysis and Machine Intelligence.
arXiv admin note: substantial text overlap with arXiv:1803.0508
Deep Reasoning with Multi-Scale Context for Salient Object Detection
To detect salient objects accurately, existing methods usually design complex
backbone network architectures to learn and fuse powerful features. However,
the saliency inference module that performs saliency prediction from the fused
features receives much less attention on its architecture design and typically
adopts only a few fully convolutional layers. In this paper, we find the
limited capacity of the saliency inference module indeed makes a fundamental
performance bottleneck, and enhancing its capacity is critical for obtaining
better saliency prediction. Correspondingly, we propose a deep yet light-weight
saliency inference module that adopts a multi-dilated depth-wise convolution
architecture. Such a deep inference module, though with simple architecture,
can directly perform reasoning about salient objects from the multi-scale
convolutional features fast, and give superior salient object detection
performance with less computational cost. To our best knowledge, we are the
first to reveal the importance of the inference module for salient object
detection, and present a novel architecture design with attractive efficiency
and accuracy. Extensive experimental evaluations demonstrate that our simple
framework performs favorably compared with the state-of-the-art methods with
complex backbone design.Comment: 10 pages, 8 figures, 3 tabl
Ro-SOS: Metric Expression Network (MEnet) for Robust Salient Object Segmentation
Although deep CNNs have brought significant improvement to image saliency
detection, most CNN based models are sensitive to distortion such as
compression and noise. In this paper, we propose an end-to-end generic salient
object segmentation model called Metric Expression Network (MEnet) to deal with
saliency detection with the tolerance of distortion. Within MEnet, a new
topological metric space is constructed, whose implicit metric is determined by
the deep network. As a result, we manage to group all the pixels in the
observed image semantically within this latent space into two regions: a
salient region and a non-salient region. With this architecture, all feature
extractions are carried out at the pixel level, enabling fine granularity of
output boundaries of the salient objects. What's more, we try to give a general
analysis for the noise robustness of the network in the sense of Lipschitz and
Jacobian literature. Experiments demonstrate that robust salient maps
facilitating object segmentation can be generated by the proposed metric. Tests
on several public benchmarks show that MEnet has achieved desirable
performance. Furthermore, by direct computation and measuring the robustness,
the proposed method outperforms previous CNN-based methods on distorted inputs.Comment: This version: 11 pages (12 with reference), 12 figures, 5 table;
Version 1: 7 pages,7 figures, 4 tables; The paper for version 1 has been
accepted by International Joint Conference on Artificial Intelligence
(IJCAI),201
- …