HSCS: Hierarchical Sparsity Based Co-saliency Detection for RGBD Images
Co-saliency detection aims to discover common and salient objects in an image
group containing more than two relevant images. Moreover, depth information has
been demonstrated to be effective for many computer vision tasks. In this
paper, we propose a novel co-saliency detection method for RGBD images based on
hierarchical sparsity reconstruction and energy function refinement. With the
assistance of the intra saliency map, the inter-image correspondence is
formulated as a hierarchical sparsity reconstruction framework. The global
sparsity reconstruction model with a ranking scheme focuses on capturing the
global characteristics among the whole image group through a common foreground
dictionary. The pairwise sparsity reconstruction model aims to explore the
corresponding relationship between pairwise images through a set of pairwise
dictionaries. In order to improve the intra-image smoothness and inter-image
consistency, an energy function refinement model is proposed, which includes
the unary data term, spatial smooth term, and holistic consistency term.
Experiments on two RGBD co-saliency detection benchmarks demonstrate that the
proposed method outperforms the state-of-the-art algorithms both qualitatively
and quantitatively.
Comment: 11 pages, 5 figures, Accepted by IEEE Transactions on Multimedia, https://rmcong.github.io
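The core signal in this approach is reconstruction error against a common foreground dictionary: regions that the dictionary reconstructs well are likely co-salient. Below is a minimal sketch of that single idea, not the authors' full hierarchical pipeline; the dictionary `D`, region features `X`, and the lambda/iteration values are illustrative assumptions, and the L1-regularized coding is solved with plain ISTA.

```python
import numpy as np

def sparse_codes_ista(D, X, lam=0.1, n_iter=200):
    """Solve min_A 0.5*||X - D A||_F^2 + lam*||A||_1 column-wise via ISTA.

    D: (d, k) foreground dictionary, X: (d, n) region features.
    Returns A: (k, n) sparse codes. A toy solver, not the paper's.
    """
    L = np.linalg.norm(D, 2) ** 2        # Lipschitz constant of the smooth term
    A = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = D.T @ (D @ A - X)                                # gradient step
        A = A - grad / L
        A = np.sign(A) * np.maximum(np.abs(A) - lam / L, 0.0)   # soft threshold
    return A

def reconstruction_saliency(D, X, lam=0.1):
    """Regions well reconstructed by the foreground dictionary get high saliency."""
    A = sparse_codes_ista(D, X, lam)
    err = np.linalg.norm(X - D @ A, axis=0)   # per-region reconstruction error
    sal = np.exp(-err)                        # small error -> high foreground likelihood
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-12)

# Toy usage: 10 dictionary atoms, 50 regions, 64-D features.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 10))
X = rng.standard_normal((64, 50))
print(reconstruction_saliency(D, X).shape)  # (50,)
```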
DISC: Deep Image Saliency Computing via Progressive Representation Learning
Salient object detection increasingly receives attention as an important
component or step in several pattern recognition and image processing tasks.
Although a variety of powerful saliency models have been proposed,
they usually involve heavy feature (or model) engineering based on priors (or
assumptions) about the properties of objects and backgrounds. Inspired by the
effectiveness of recently developed feature learning, we provide a novel Deep
Image Saliency Computing (DISC) framework for fine-grained image saliency
computing. In particular, we model the image saliency from both the coarse- and
fine-level observations, and utilize the deep convolutional neural network
(CNN) to learn the saliency representation in a progressive manner.
Specifically, our saliency model is built upon two stacked CNNs. The first CNN
generates a coarse-level saliency map by taking the overall image as the input,
roughly identifying saliency regions in the global context. Furthermore, we
integrate superpixel-based local context information in the first CNN to refine
the coarse-level saliency map. Guided by the coarse saliency map, the second
CNN focuses on the local context to produce fine-grained and accurate saliency
map while preserving object details. For a testing image, the two CNNs
collaboratively conduct the saliency computing in one shot. Our DISC framework
is capable of uniformly highlighting objects of interest against complex
backgrounds while preserving object details well. Extensive experiments on
several standard benchmarks suggest that DISC outperforms other
state-of-the-art methods and also generalizes well across datasets without
additional training. The executable version of DISC is available online:
http://vision.sysu.edu.cn/projects/DISC.
Comment: This manuscript is the accepted version for IEEE Transactions on Neural Networks and Learning Systems (T-NNLS), 201
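The coarse-to-fine stacking is easy to state in code: a first network predicts a coarse map from the whole image, and a second network receives the image concatenated with that map and predicts the refined map. The sketch below is a deliberately tiny stand-in, not the DISC architecture (the paper uses much deeper CNNs plus superpixel-based local context); layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TinySaliencyNet(nn.Module):
    """A small fully convolutional net mapping in_ch channels to a 1-channel map."""
    def __init__(self, in_ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1), nn.Sigmoid(),   # per-pixel saliency in [0, 1]
        )
    def forward(self, x):
        return self.body(x)

coarse_net = TinySaliencyNet(in_ch=3)   # stage 1: RGB -> coarse map
fine_net = TinySaliencyNet(in_ch=4)     # stage 2: RGB + coarse map -> fine map

img = torch.rand(1, 3, 224, 224)
coarse = coarse_net(img)                          # global context, rough localization
fine = fine_net(torch.cat([img, coarse], dim=1))  # coarse map guides local refinement
print(coarse.shape, fine.shape)  # both (1, 1, 224, 224)
```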
Backtracking Spatial Pyramid Pooling (SPP)-based Image Classifier for Weakly Supervised Top-down Salient Object Detection
Top-down saliency models produce a probability map that peaks at target
locations specified by a task/goal such as object detection. They are usually
trained in a fully supervised setting involving pixel-level annotations of
objects. We propose a weakly supervised top-down saliency framework using only
binary labels that indicate the presence/absence of an object in an image.
First, the probabilistic contribution of each image region to the confidence of
a CNN-based image classifier is computed through a backtracking strategy to
produce top-down saliency. From a set of saliency maps of an image produced by
fast bottom-up saliency approaches, we select the best saliency map suitable
for the top-down task. The selected bottom-up saliency map is combined with the
top-down saliency map. Features having high combined saliency are used to train
a linear SVM classifier to estimate feature saliency. This is integrated with
combined saliency and further refined through a multi-scale
superpixel-averaging of saliency map. We evaluate the performance of the
proposed weakly supervised top-down saliency and achieve comparable performance
with fully supervised approaches. Experiments are carried out on seven
challenging datasets and quantitative results are compared with 40 closely
related approaches across 4 different applications.
Comment: 14 pages, 7 figures
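The first step, estimating each region's probabilistic contribution to the classifier's confidence, can be approximated in a few lines with occlusion probing: mask a region, rerun the classifier, and treat the confidence drop as that region's top-down saliency. This is a generic occlusion-sensitivity stand-in, not the paper's backtracking strategy; `classifier` is an assumed callable returning the target-class probability.

```python
import numpy as np

def occlusion_saliency(image, classifier, patch=16):
    """Top-down saliency via confidence drop when each patch is occluded.

    image: (H, W, 3) float array; classifier: callable (H, W, 3) -> P(target class).
    A generic occlusion probe, standing in for the paper's backtracking step.
    """
    base = classifier(image)
    H, W = image.shape[:2]
    sal = np.zeros((H, W))
    for y in range(0, H, patch):
        for x in range(0, W, patch):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = image.mean(axis=(0, 1))  # gray out patch
            drop = base - classifier(occluded)   # confidence lost without this patch
            sal[y:y + patch, x:x + patch] = max(drop, 0.0)
    return sal / (sal.max() + 1e-12)
```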
A Classifier-guided Approach for Top-down Salient Object Detection
We propose a framework for top-down salient object detection that
incorporates a tightly coupled image classification module. The classifier is
trained on novel category-aware sparse codes computed on object dictionaries
used for saliency modeling. A misclassification indicates that the
corresponding saliency model is inaccurate. Hence, the classifier selects
images for which the saliency models need to be updated. The category-aware
sparse coding produces better image classification accuracy as compared to
conventional sparse coding with a reduced computational complexity. A
saliency-weighted max-pooling is proposed to improve image classification,
which is further used to refine the saliency maps. Experimental results on
Graz-02 and PASCAL VOC-07 datasets demonstrate the effectiveness of salient
object detection. Although the role of the classifier is to support salient
object detection, we evaluate its performance in image classification and also
illustrate the utility of thresholded saliency maps for image segmentation.
Comment: To appear in Signal Processing: Image Communication, Elsevier. Available online from April 201
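Saliency-weighted max-pooling is the most self-contained piece of this pipeline: instead of max-pooling raw sparse codes over the image, each location's code is scaled by its saliency before the max, so background activations rarely win. A minimal sketch under that reading of the abstract; the feature and map shapes are illustrative assumptions.

```python
import numpy as np

def saliency_weighted_max_pool(codes, saliency):
    """Pool per-location sparse codes into one image-level descriptor.

    codes: (n_locations, k) sparse codes; saliency: (n_locations,) in [0, 1].
    Each location's code is down-weighted by its saliency before the
    per-dimension max, suppressing background activations.
    """
    weighted = codes * saliency[:, None]
    return weighted.max(axis=0)   # (k,) descriptor for the downstream classifier

rng = np.random.default_rng(0)
codes = np.abs(rng.standard_normal((500, 128)))   # toy sparse codes
sal = rng.random(500)                             # toy per-location saliency
print(saliency_weighted_max_pool(codes, sal).shape)  # (128,)
```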
Deep Saliency with Encoded Low level Distance Map and High Level Features
Recent advances in saliency detection have utilized deep learning to obtain
high level features to detect salient regions in a scene. These advances have
demonstrated superior results over previous works that utilize hand-crafted low
level features for saliency detection. In this paper, we demonstrate that
hand-crafted features can provide complementary information to enhance
performance of saliency detection that utilizes only high level features. Our
method utilizes both high level and low level features for saliency detection
under a unified deep learning framework. The high level features are extracted
using the VGG-net, and the low level features are compared with other parts of
an image to form a low level distance map. The low level distance map is then
encoded using a convolutional neural network (CNN) with multiple 1×1
convolutional and ReLU layers. We concatenate the encoded low level distance
map and the high level features, and connect them to a fully connected neural
network classifier to evaluate the saliency of a query region. Our experiments
show that our method can further improve the performance of state-of-the-art
deep learning-based saliency detection methods.
Comment: Accepted by IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016. Project page: https://github.com/gylee1103/SaliencyEL
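The low-level distance map itself is straightforward to compute: describe every region with hand-crafted features (here, just mean color) and record each region's feature distance to the query region as a spatial map; the paper then encodes this map with 1×1 conv and ReLU layers before fusing it with VGG features. The sketch below covers only the map construction, on a grid instead of the paper's regions, which is a simplifying assumption.

```python
import numpy as np

def low_level_distance_map(image, qy, qx, cell=16):
    """Distance of every grid cell's mean color to the query cell's mean color.

    image: (H, W, 3) float array; (qy, qx): top-left corner of the query cell.
    Grid cells stand in for the regions used in the paper.
    """
    H, W = image.shape[:2]
    query = image[qy:qy + cell, qx:qx + cell].mean(axis=(0, 1))
    dist = np.zeros((H, W))
    for y in range(0, H, cell):
        for x in range(0, W, cell):
            mean_color = image[y:y + cell, x:x + cell].mean(axis=(0, 1))
            dist[y:y + cell, x:x + cell] = np.linalg.norm(mean_color - query)
    return dist / (dist.max() + 1e-12)   # normalized map, ready for 1x1-conv encoding
```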
Adapted and Oversegmenting Graphs: Application to Geometric Deep Learning
We propose a novel iterative method to adapt a graph to d-dimensional image
data. The method drives the nodes of the graph towards image features. The
adaptation process naturally lends itself to a measure of feature saliency
which can then be used to retain meaningful nodes and edges in the graph. From
the adapted graph, we also propose the computation of a dual graph, which
inherits the saliency measure from the adapted graph, and whose edges run along
image features, hence producing an oversegmenting graph. The proposed method is
computationally efficient and fully parallelisable. We propose two distance
measures to find image saliency along graph edges, and evaluate the performance
on synthetic images and on natural images from publicly available databases. In
both cases, the most salient nodes of the graph achieve average boundary recall
over 90%. We also apply our method to image classification on the MNIST
hand-written digit dataset, using a recently proposed Deep Geometric Learning
architecture, and achieving state-of-the-art classification accuracy, for a
graph-based method, of 97.86%.
Comment: Submitted to CVI
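The adaptation step can be sketched directly: place nodes on a grid and repeatedly move each node to the strongest image-gradient pixel in a small window around it, taking the gradient magnitude a node settles on as its saliency. This is a simplified reading of the abstract, not the authors' exact update rule or their two distance measures.

```python
import numpy as np

def adapt_nodes(gradient_mag, step=8, window=3, n_iter=10):
    """Drive grid nodes toward image features (gradient maxima).

    gradient_mag: (H, W) edge-strength map. Returns node coordinates and a
    per-node saliency (the gradient magnitude at the settled position).
    """
    H, W = gradient_mag.shape
    ys, xs = np.mgrid[step // 2:H:step, step // 2:W:step]
    nodes = np.stack([ys.ravel(), xs.ravel()], axis=1)
    for _ in range(n_iter):
        for i, (y, x) in enumerate(nodes):
            y0, y1 = max(y - window, 0), min(y + window + 1, H)
            x0, x1 = max(x - window, 0), min(x + window + 1, W)
            patch = gradient_mag[y0:y1, x0:x1]
            dy, dx = np.unravel_index(patch.argmax(), patch.shape)
            nodes[i] = (y0 + dy, x0 + dx)   # hop to the strongest local feature
    saliency = gradient_mag[nodes[:, 0], nodes[:, 1]]
    return nodes, saliency
```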
Learning RGB-D Salient Object Detection using background enclosure, depth contrast, and top-down features
Recently, deep Convolutional Neural Networks (CNN) have demonstrated strong
performance on RGB salient object detection. Although depth information can
help improve detection results, the exploration of CNNs for RGB-D salient
object detection remains limited. Here we propose a novel deep CNN architecture
for RGB-D salient object detection that exploits high-level, mid-level, and
low-level features. Further, we present novel depth features that capture the ideas
of background enclosure and depth contrast that are suitable for a learned
approach. We show improved results compared to state-of-the-art RGB-D salient
object detection methods. We also show that the low-level and mid-level depth
features both contribute to improvements in the results. In particular, the
F-score of our method is 0.848 on the RGBD1000 dataset, 10.7% better than the
second-best method.
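Of the two depth cues, depth contrast is the simpler to illustrate: compare each patch's mean depth to the mean depth of its surroundings, so that objects standing in front of their background score high. The sketch below implements only this cue on a grid; background enclosure (the fraction of directions in which deeper background surrounds the patch) and the CNN that consumes these features are beyond a few lines. Patch and ring sizes are illustrative assumptions.

```python
import numpy as np

def depth_contrast(depth, patch=8, ring=8):
    """Per-patch depth contrast: surrounding mean depth minus patch mean depth.

    depth: (H, W) depth map (larger = farther). Patches closer than their
    surroundings (positive contrast) are more likely salient foreground.
    """
    H, W = depth.shape
    contrast = np.zeros((H, W))
    for y in range(0, H, patch):
        for x in range(0, W, patch):
            inner = depth[y:y + patch, x:x + patch].mean()
            y0, y1 = max(y - ring, 0), min(y + patch + ring, H)
            x0, x1 = max(x - ring, 0), min(x + patch + ring, W)
            outer = depth[y0:y1, x0:x1].mean()   # includes the patch; a simplification
            contrast[y:y + patch, x:x + patch] = max(outer - inner, 0.0)
    return contrast / (contrast.max() + 1e-12)
```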
A novel graph structure for salient object detection based on divergence background and compact foreground
In this paper, we propose an efficient and discriminative model for salient
object detection. Our method is carried out in a stepwise mechanism based on
both divergence background and compact foreground cues. In order to effectively
enhance the distinction between nodes along object boundaries and the
similarity among object regions, a graph is constructed by introducing the
concept of virtual node. To remove incorrect outputs, a scheme for selecting
background seeds and a method for generating compact foreground regions are
introduced, respectively. Different from prior methods, we calculate the
saliency value of each node based on the relationship between the corresponding
node and the virtual node. In order to achieve significant performance
improvement consistently, we propose an Extended Manifold Ranking (EMR)
algorithm, which subtly combines suppressed/active nodes and mid-level
information. Extensive experimental results demonstrate that the proposed
algorithm performs favorably against the state-of-the-art saliency detection
methods in terms of different evaluation metrics on several benchmark datasets.
Comment: 22 pages, 16 figures, 2 tables
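Manifold ranking, the machinery the Extended Manifold Ranking builds on, has a closed form: given an affinity matrix W and a seed indicator y, the ranking is f = (I − αS)⁻¹y with S the symmetrically normalized affinity. The sketch below shows plain manifold ranking with a virtual node appended as an extra row and column of the affinity matrix; the paper's suppressed/active node weighting and mid-level cues are not reproduced, and the virtual-node affinities here are illustrative assumptions.

```python
import numpy as np

def manifold_ranking(W, y, alpha=0.99):
    """Closed-form manifold ranking: f = (I - alpha * S)^-1 y.

    W: (n, n) symmetric affinity matrix; y: (n,) seed indicator vector.
    """
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-12))
    S = D_inv_sqrt @ W @ D_inv_sqrt              # normalized affinity
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S, y)

def add_virtual_node(W, virtual_affinity):
    """Append a virtual node connected to every real node.

    virtual_affinity: (n,) affinities between each node and the virtual node
    (illustrative; the paper derives these from boundary/foreground cues).
    """
    n = W.shape[0]
    W_ext = np.zeros((n + 1, n + 1))
    W_ext[:n, :n] = W
    W_ext[:n, n] = W_ext[n, :n] = virtual_affinity
    return W_ext
```

Ranking the extended graph with the virtual node as the seed then yields a per-node score that reflects each node's relationship to the virtual node, in the spirit of the abstract.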
NeRD: a Neural Response Divergence Approach to Visual Salience Detection
In this paper, a novel approach to visual salience detection via Neural
Response Divergence (NeRD) is proposed, where synaptic portions of deep neural
networks, previously trained for complex object recognition, are leveraged to
compute low-level cues that can be used to measure image region
distinctiveness. Based on this concept, an efficient visual salience detection
framework is proposed using deep convolutional StochasticNets. Experimental
results using CSSD and MSRA10k natural image datasets show that the proposed
NeRD approach can achieve improved performance when compared to
state-of-the-art image saliency approaches, while attaining the low
computational complexity necessary for near-real-time computer vision
applications.
Comment: 5 pages
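The distinctiveness cue can be approximated as the divergence between one region's deep response distribution and those of the other regions: normalize each region's response vector into a distribution and score each region by its average KL divergence from the rest. This is a generic divergence measure under one reading of the abstract, not NeRD's specific formulation or its StochasticNet features.

```python
import numpy as np

def response_divergence_saliency(features, eps=1e-8):
    """Score each region by its mean KL divergence from all other regions.

    features: (n_regions, d) non-negative deep responses (e.g., post-ReLU).
    Regions whose response distribution diverges from the rest score high.
    """
    P = features + eps
    P = P / P.sum(axis=1, keepdims=True)          # each row becomes a distribution
    logP = np.log(P)
    # KL(P_i || P_j) = sum_k P_ik (log P_ik - log P_jk), for all pairs (i, j).
    kl = (P * logP).sum(axis=1)[:, None] - P @ logP.T
    np.fill_diagonal(kl, 0.0)
    sal = kl.sum(axis=1) / (len(P) - 1)           # mean divergence from the others
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-12)
```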
Saliency Detection with Spaces of Background-based Distribution
In this letter, an effective image saliency detection method is proposed by
constructing some novel spaces to model the background and redefine the
distance of the salient patches away from the background. Concretely, given the
backgroundness prior, eigendecomposition is utilized to create four spaces of
background-based distribution (SBD) to model the background, in which a more
appropriate metric (the Mahalanobis distance) is adopted to measure the
saliency of every image patch away from the background. After that, a coarse
saliency map is obtained by integrating the four adjusted Mahalanobis distance
maps, each of which is formed by the distances between all the patches and
background in the corresponding SBD. To be more discriminative, the coarse
saliency map is further enhanced into a posterior probability map within a
Bayesian perspective. Finally, the final saliency map is generated by properly
refining the posterior probability map with geodesic distance. Experimental
results on two widely used datasets show that the proposed method is effective
compared with the state-of-the-art algorithms.
Comment: 5 pages, 6 figures, Accepted by IEEE Signal Processing Letters in March 201
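The measurement step maps directly to a few lines: fit a mean and covariance to background patch features (e.g., patches along the image boundary, per the backgroundness prior) and score every patch by its Mahalanobis distance from that background model. The sketch below covers this single step; the four eigendecomposition-based SBD spaces, the Bayesian enhancement, and the geodesic refinement are not reproduced.

```python
import numpy as np

def mahalanobis_saliency(patch_feats, background_feats):
    """Mahalanobis distance of each patch from a background distribution.

    patch_feats: (n, d) features of all patches;
    background_feats: (m, d) features of assumed-background patches.
    """
    mu = background_feats.mean(axis=0)
    cov = np.cov(background_feats, rowvar=False)
    cov_inv = np.linalg.pinv(cov)                 # pseudo-inverse for stability
    diff = patch_feats - mu
    d2 = np.einsum('nd,de,ne->n', diff, cov_inv, diff)  # squared distances
    sal = np.sqrt(np.maximum(d2, 0.0))
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-12)
```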