62 research outputs found
Automatic Salient Object Detection for Panoramic Images Using Region Growing and Fixation Prediction Model
Almost all previous work on saliency detection has been dedicated to
conventional images. However, with the rapid spread of panoramic images driven
by the development of VR and AR technology, extracting salient content from
panoramic images is becoming both more challenging and more valuable.
In this paper, we propose a novel bottom-up salient object detection
framework for panoramic images. First, we employ a spatial density estimation
method to roughly extract object proposal regions with the help of a region
growing algorithm. Meanwhile, an eye fixation model is utilized to predict
visually attractive parts of the image from the perspective of the human visual
search mechanism. Then, the two results are combined by maxima normalization to
obtain the coarse saliency map. Finally, a refinement step based
on geodesic distance is utilized for post-processing to derive the final
saliency map.
To fairly evaluate the performance of the proposed approach, we propose a
high-quality dataset of panoramic images (SalPan). Extensive evaluations
demonstrate the effectiveness of our proposed method on panoramic images and
the superiority of the proposed method against other methods.
Comment: Previous Project website: https://github.com/ChunbiaoZhu/DCC-201
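A minimal sketch of the coarse-map fusion step described above, assuming two precomputed maps: one from spatial density estimation with region growing and one from an eye fixation model. The rescale-and-average fusion shown here is only illustrative; the paper's exact maxima normalization and the geodesic-distance refinement are not reproduced.

```python
# Illustrative fusion of two saliency cues; names and the fusion rule are assumptions.
import numpy as np

def maxima_normalize(smap, target_max=1.0):
    """Scale a saliency map so its global maximum equals target_max."""
    smap = smap.astype(np.float64)
    peak = smap.max()
    return smap * (target_max / peak) if peak > 0 else smap

def fuse_coarse_saliency(proposal_map, fixation_map):
    """Combine the region-proposal cue and the fixation cue (simple average here)."""
    p = maxima_normalize(proposal_map)
    f = maxima_normalize(fixation_map)
    return (p + f) / 2.0

# Example with random stand-in maps for a 256x512 equirectangular image.
rng = np.random.default_rng(0)
coarse = fuse_coarse_saliency(rng.random((256, 512)), rng.random((256, 512)))
print(coarse.shape, float(coarse.min()), float(coarse.max()))
```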
Exploiting the Value of the Center-dark Channel Prior for Salient Object Detection
Saliency detection aims to detect the most attractive objects in images and
is widely used as a foundation for various applications. In this paper, we
propose a novel salient object detection algorithm for RGB-D images using
center-dark channel priors. First, we generate an initial saliency map based on
a color saliency map and a depth saliency map of a given RGB-D image. Then, we
generate a center-dark channel map based on center saliency and dark channel
priors. Finally, we fuse the initial saliency map with the center-dark channel
map to generate the final saliency map. Extensive evaluations over four
benchmark datasets demonstrate that our proposed method performs favorably
against most of the state-of-the-art approaches. Besides, we further discuss
the application of the proposed algorithm in small target detection and
demonstrate the universal value of center-dark channel priors in the field of
object detection.
Comment: Project website: https://chunbiaozhu.github.io/ACVR2017
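A rough sketch of the center-dark channel idea for an RGB image in [0, 1]: the dark channel is the per-pixel minimum over the color channels and a local window, and the center prior is a Gaussian centered on the image. How the dark channel is mapped to saliency and how the two cues are combined are assumptions here, not the paper's exact formulation.

```python
# Illustrative center-dark channel map; the sign convention and fusion are guesses.
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(rgb, window=15):
    """Minimum over color channels, then minimum over a local window."""
    per_pixel_min = rgb.min(axis=2)
    return minimum_filter(per_pixel_min, size=window)

def center_prior(h, w, sigma_frac=0.33):
    """Gaussian weighting that favors the image center."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    sigma = sigma_frac * min(h, w)
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

def center_dark_channel_map(rgb):
    # How the dark channel maps to saliency is an assumption: we simply invert it
    # and weight it by the center prior.
    inv_dark = 1.0 - dark_channel(rgb)
    return inv_dark * center_prior(*rgb.shape[:2])

rgb = np.random.default_rng(1).random((120, 160, 3))
print(center_dark_channel_map(rgb).shape)
```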
CascadePSP: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement
State-of-the-art semantic segmentation methods were almost exclusively
trained on images within a fixed resolution range. These segmentations are
inaccurate for very high-resolution images since using bicubic upsampling of
low-resolution segmentation does not adequately capture high-resolution details
along object boundaries. In this paper, we propose a novel approach to address
the high-resolution segmentation problem without using any high-resolution
training data. The key insight is our CascadePSP network which refines and
corrects local boundaries whenever possible. Although our network is trained
with low-resolution segmentation data, our method is applicable to any
resolution, even for very high-resolution images larger than 4K. We present
quantitative and qualitative studies on different datasets to show that
CascadePSP can reveal pixel-accurate segmentation boundaries using our novel
refinement module without any finetuning. Thus, our method can be regarded as
class-agnostic. Finally, we demonstrate the application of our model to scene
parsing in multi-class segmentation.
Comment: Accepted to CVPR 2020. Project page:
https://github.com/hkchengrex/CascadePS
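A schematic sketch of global-then-local refinement in the spirit of the cascade described above. `RefineNet` is a stand-in module, not the released CascadePSP network, and the patch tiling is simplified (no boundary-aware crop selection or overlap blending).

```python
# Illustrative coarse-to-fine refinement of a high-resolution mask; all modules are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RefineNet(nn.Module):
    """Placeholder refiner: takes image + coarse mask, outputs a refined mask."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid())
    def forward(self, image, mask):
        return self.net(torch.cat([image, mask], dim=1))

def refine_high_res(image, coarse_mask, refiner, low_size=512, patch=512, stride=384):
    # Global step: refine the whole image at a manageable resolution.
    h, w = image.shape[-2:]
    small_img = F.interpolate(image, size=(low_size, low_size), mode='bilinear', align_corners=False)
    small_mask = F.interpolate(coarse_mask, size=(low_size, low_size), mode='bilinear', align_corners=False)
    global_mask = F.interpolate(refiner(small_img, small_mask), size=(h, w),
                                mode='bilinear', align_corners=False)
    # Local step: re-refine overlapping full-resolution patches (tiling simplified).
    out = global_mask.clone()
    for y in range(0, max(h - patch, 0) + 1, stride):
        for x in range(0, max(w - patch, 0) + 1, stride):
            img_p = image[..., y:y + patch, x:x + patch]
            msk_p = out[..., y:y + patch, x:x + patch]
            out[..., y:y + patch, x:x + patch] = refiner(img_p, msk_p)
    return out

img = torch.rand(1, 3, 1024, 1024)
mask = torch.rand(1, 1, 1024, 1024)
print(refine_high_res(img, mask, RefineNet()).shape)
```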
Fixation Data Analysis for High Resolution Satellite Images
The presented study is an eye tracking experiment for high-resolution
satellite (HRS) images. The reported experiment explores the Area Of Interest
(AOI) based analysis of eye fixation data for complex HRS images. The study
highlights the need for reference data for bottom-up saliency-based
segmentation and the difficulty of analyzing eye tracking data for complex
satellite images. The fixation data analysis is aimed at creating reference
data for bottom-up saliency-based segmentation of high-resolution satellite
images. The outcome of this experimental study provides a solution for
AOI-based analysis of fixation data in the complex environment of satellite
images, together with recommendations for reference data construction, which
is already an ongoing effort.
Comment: Extended version is submitted to the SPIE-2018 conference
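A minimal sketch of AOI-based fixation analysis of the kind discussed above, assuming fixations are given as (x, y, duration) tuples and AOIs as axis-aligned rectangles; the field names and AOI labels are illustrative.

```python
# Illustrative per-AOI fixation statistics; data layout is an assumption.
from collections import defaultdict

def aoi_statistics(fixations, aois):
    """Count fixations and total dwell time per area of interest."""
    stats = defaultdict(lambda: {"count": 0, "dwell_ms": 0.0})
    for x, y, duration_ms in fixations:
        for name, (x0, y0, x1, y1) in aois.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                stats[name]["count"] += 1
                stats[name]["dwell_ms"] += duration_ms
    return dict(stats)

aois = {"runway": (100, 200, 400, 260), "terminal": (420, 150, 600, 300)}
fixations = [(150, 230, 180.0), (430, 200, 240.0), (50, 50, 90.0)]
print(aoi_statistics(fixations, aois))
```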
Richer and Deeper Supervision Network for Salient Object Detection
Recent Salient Object Detection (SOD) systems are mostly based on
Convolutional Neural Networks (CNNs). In particular, the Deeply Supervised
Saliency (DSS) system has shown that adding short connections to the network
and supervising the side outputs is very useful. In this work, we propose a new
SOD system that aims at a more efficient and effective way to pass back global
information. Richer and Deeper Supervision (RDS) is applied to better combine
features from each side output without demanding much extra computational
space. Meanwhile, the backbone network used for SOD is normally pre-trained on
the object classification dataset, ImageNet. However, the pre-trained model has
been trained on cropped images and therefore focuses only on distinguishing
features within the region of the object, while the ignored background
information is also significant for SOD. We address this problem by introducing
training data designed for object detection: coarse global information is
learned from entire images together with their bounding boxes before training
on the SOD dataset. This large-scale set of object images slightly improves the
performance of SOD. Our experiments show that the proposed RDS network achieves
state-of-the-art results on five public SOD datasets.
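A generic sketch of side-output (deep) supervision of the kind DSS and RDS build on: every side output is resized to the ground-truth resolution and supervised with binary cross-entropy, and the losses are summed with the fused-output loss. This is not the RDS architecture itself, only the supervision pattern.

```python
# Illustrative deeply supervised loss over multiple side outputs.
import torch
import torch.nn.functional as F

def deeply_supervised_loss(side_outputs, fused_output, target):
    """side_outputs: list of logits at various scales; target: 1-channel binary mask."""
    loss = F.binary_cross_entropy_with_logits(fused_output, target)
    for logits in side_outputs:
        logits = F.interpolate(logits, size=target.shape[-2:],
                               mode='bilinear', align_corners=False)
        loss = loss + F.binary_cross_entropy_with_logits(logits, target)
    return loss

target = torch.randint(0, 2, (2, 1, 224, 224)).float()
sides = [torch.randn(2, 1, s, s) for s in (28, 56, 112, 224)]
fused = torch.randn(2, 1, 224, 224)
print(deeply_supervised_loss(sides, fused, target).item())
```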
Reverse Attention for Salient Object Detection
Benefiting from the rapid development of deep learning techniques, salient
object detection has achieved remarkable progress recently. However, two major
challenges still hinder its application on embedded devices: low-resolution
output and heavy model weight. To this end, this paper presents an accurate yet
compact deep network for efficient salient object detection. More specifically,
given a coarse saliency prediction in the deepest layer, we first employ
residual learning to learn side-output residual features for saliency
refinement, which can be achieved with very limited convolutional parameters
while keeping accuracy. Secondly, we further propose reverse attention to guide
such side-output residual learning in a top-down manner. By erasing the current
predicted salient regions from side-output features, the network can eventually
explore the missing object parts and details, which results in high resolution
and accuracy. Experiments on six benchmark datasets demonstrate that the
proposed approach compares favorably against state-of-the-art methods, with
advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).
Comment: ECCV 201
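A small sketch of the reverse-attention refinement described above: the coarse prediction from a deeper layer is upsampled, inverted through a sigmoid, and used to weight the side-output features so that a residual head focuses on the not-yet-detected regions. Channel sizes and the residual head are illustrative.

```python
# Illustrative reverse-attention residual refinement step.
import torch
import torch.nn.functional as F

def reverse_attention_refine(side_feat, deeper_pred_logits, residual_head):
    """side_feat: (B, C, H, W) features; deeper_pred_logits: (B, 1, h, w) coarse logits."""
    up_pred = F.interpolate(deeper_pred_logits, size=side_feat.shape[-2:],
                            mode='bilinear', align_corners=False)
    reverse_att = 1.0 - torch.sigmoid(up_pred)   # erase already-detected salient regions
    attended = side_feat * reverse_att            # focus on missing object parts
    residual = residual_head(attended)            # predict a correction
    return up_pred + residual                     # refined prediction logits

residual_head = torch.nn.Conv2d(64, 1, kernel_size=3, padding=1)
side_feat = torch.randn(1, 64, 88, 88)
coarse = torch.randn(1, 1, 22, 22)
print(reverse_attention_refine(side_feat, coarse, residual_head).shape)
```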
Fast User-Guided Video Object Segmentation by Interaction-and-Propagation Networks
We present a deep learning method for interactive video object segmentation.
Our method is built upon two core operations, interaction and
propagation, and each operation is conducted by Convolutional Neural Networks.
The two networks are connected both internally and externally so that the
networks are trained jointly and interact with each other to solve the complex
video object segmentation problem. We propose a new multi-round training scheme
for the interactive video object segmentation so that the networks can learn
how to understand the user's intention and update incorrect estimations during
the training. At test time, our method produces high-quality results and
runs fast enough to work with users interactively. We evaluated the proposed
method quantitatively on the interactive track benchmark of the DAVIS Challenge
2018, outperforming other competing methods by a significant margin in both
speed and accuracy. We also demonstrated that our method works well with real
user interactions.
Comment: CVPR 201
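A schematic sketch of the interaction/propagation control flow described above, with placeholder callables standing in for the two CNNs; it illustrates the multi-round loop only, not the actual networks or their joint training.

```python
# Illustrative interactive segmentation loop; all callables are stand-ins.
def interactive_vos(frames, interaction_net, propagation_net, get_user_scribble, rounds=3):
    masks = [None] * len(frames)
    for _ in range(rounds):
        # The user annotates one frame (e.g., the one with the worst current result).
        t, scribble = get_user_scribble(frames, masks)
        masks[t] = interaction_net(frames[t], masks[t], scribble)
        # Propagate the corrected mask forward and backward through the video.
        for i in range(t + 1, len(frames)):
            masks[i] = propagation_net(frames[i], masks[i - 1], masks[i])
        for i in range(t - 1, -1, -1):
            masks[i] = propagation_net(frames[i], masks[i + 1], masks[i])
    return masks

# Dummy stand-ins just to exercise the control flow.
frames = list(range(5))
dummy_interact = lambda frame, prev_mask, scribble: f"mask{frame}"
dummy_propagate = lambda frame, neighbor_mask, prev_mask: f"mask{frame}"
pick_first = lambda frames, masks: (0, "scribble")
print(interactive_vos(frames, dummy_interact, dummy_propagate, pick_first, rounds=1))
```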
STC: A Simple to Complex Framework for Weakly-supervised Semantic Segmentation
Recently, significant improvement has been made on semantic object
segmentation due to the development of deep convolutional neural networks
(DCNNs). Training such a DCNN usually relies on a large number of images with
pixel-level segmentation masks, and annotating these images is very costly in
terms of both finance and human effort. In this paper, we propose a simple to
complex (STC) framework in which only image-level annotations are utilized to
learn DCNNs for semantic segmentation. Specifically, we first train an initial
segmentation network called Initial-DCNN with the saliency maps of simple
images (i.e., those with a single category of major object(s) and clean
background). These saliency maps can be automatically obtained by existing
bottom-up salient object detection techniques, where no supervision information
is needed. Then, a better network called Enhanced-DCNN is learned with
supervision from the predicted segmentation masks of simple images based on the
Initial-DCNN as well as the image-level annotations. Finally, more pixel-level
segmentation masks of complex images (two or more categories of objects with
cluttered background), which are inferred by using Enhanced-DCNN and
image-level annotations, are utilized as the supervision information to learn
the Powerful-DCNN for semantic segmentation. Our method utilizes K simple
images from Flickr.com and 10K complex images from PASCAL VOC to boost the
segmentation network step by step. Extensive experimental results on the PASCAL
VOC 2012 segmentation benchmark demonstrate the superiority of the proposed
STC framework compared with other state-of-the-art methods.
Comment: To appear in IEEE Transactions on Pattern Analysis and Machine
Intelligence
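A high-level sketch of the simple-to-complex training schedule, with placeholder callables for saliency detection, network training, and mask inference; the helper names are illustrative, and the label-based filtering inside `infer_masks` is left abstract.

```python
# Illustrative three-stage simple-to-complex schedule; all helpers are stand-ins.
def train_stc(simple_images, simple_labels, complex_images, complex_labels,
              saliency_detector, train_segnet, infer_masks):
    # Stage 1: Initial-DCNN from bottom-up saliency maps of simple images.
    saliency_masks = [saliency_detector(img) for img in simple_images]
    initial_dcnn = train_segnet(simple_images, saliency_masks)

    # Stage 2: Enhanced-DCNN from Initial-DCNN predictions on simple images,
    # constrained by their image-level labels.
    refined_masks = infer_masks(initial_dcnn, simple_images, simple_labels)
    enhanced_dcnn = train_segnet(simple_images, refined_masks)

    # Stage 3: Powerful-DCNN from Enhanced-DCNN predictions on complex images.
    complex_masks = infer_masks(enhanced_dcnn, complex_images, complex_labels)
    return train_segnet(complex_images, complex_masks)
```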
PDNet: Prior-model Guided Depth-enhanced Network for Salient Object Detection
Fully convolutional neural networks (FCNs) have shown outstanding performance
in many computer vision tasks, including salient object detection. However,
two issues still need to be addressed in deep learning based saliency
detection. One is the lack of a sufficiently large amount of annotated data to
train a network. The other is the lack of robustness when extracting salient
objects from images containing complex scenes. In this paper, we present a new
architecture, PDNet, a robust prior-model guided depth-enhanced network for
RGB-D salient object detection. In contrast to existing works, in which the
RGB-D values of image pixels are fed directly to a network, the proposed
architecture is composed of a master network for processing RGB values and a
sub-network that makes full use of depth cues and incorporates depth-based
features into the
master network. To overcome the limited size of the labeled RGB-D dataset for
training, we employ a large conventional RGB dataset to pre-train the master
network, which proves to contribute significantly to the final accuracy. Extensive
evaluations over five benchmark datasets demonstrate that our proposed method
performs favorably against the state-of-the-art approaches.
Comment: This paper is under review. Project website:
https://github.com/ChunbiaoZhu/PDNet
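A toy sketch of the two-stream, depth-enhanced design described above: a master branch processes RGB, a lighter sub-network processes depth, and the depth features are injected into the master branch. The channel sizes and the additive fusion are assumptions, and the prior-model guidance is not modeled here.

```python
# Illustrative RGB master branch + depth sub-branch with feature injection.
import torch
import torch.nn as nn

class TwoStreamSaliencyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.rgb_stem = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.depth_stem = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU())
        self.head = nn.Sequential(nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid())

    def forward(self, rgb, depth):
        rgb_feat = self.rgb_stem(rgb)
        depth_feat = self.depth_stem(depth)
        fused = rgb_feat + depth_feat   # inject depth cues into the master branch
        return self.head(fused)

net = TwoStreamSaliencyNet()
out = net(torch.rand(1, 3, 224, 224), torch.rand(1, 1, 224, 224))
print(out.shape)
```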
SAD: Saliency-based Defenses Against Adversarial Examples
With the rise in popularity of machine and deep learning models, there is an
increased focus on their vulnerability to malicious inputs. These adversarial
examples drift model predictions away from the original intent of the network
and are a growing concern in practical security. In order to combat these
attacks, neural networks can leverage traditional image processing approaches
or state-of-the-art defensive models to reduce perturbations in the data.
Defensive approaches that take a global approach to noise reduction are
effective against adversarial attacks; however, their lossy nature often
distorts important data within the image. In this work, we propose a visual
saliency based approach to cleaning data affected by an adversarial attack. Our
model leverages the salient regions of an adversarial image in order to provide
a targeted countermeasure while comparatively reducing loss within the cleaned
images. We measure the accuracy of our model by evaluating the effectiveness of
state-of-the-art saliency methods prior to attack, under attack, and after
application of cleaning methods. We demonstrate the effectiveness of our
proposed approach in comparison with related defenses and against established
adversarial attack methods, across two saliency datasets. Our targeted approach
shows significant improvements in a range of standard statistical and distance
saliency metrics, in comparison with both traditional and state-of-the-art
approaches.
Comment: 9 pages
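A simplified sketch of saliency-targeted cleaning: the image is denoised, and the denoised result is blended back only where a saliency map marks the region as important, leaving the remaining pixels untouched. The Gaussian filter, the thresholded blending, and even where the cleaning is applied are illustrative assumptions, not the defense proposed in the paper.

```python
# Illustrative saliency-guided cleaning of a possibly perturbed image.
import numpy as np
from scipy.ndimage import gaussian_filter

def saliency_targeted_clean(image, saliency, sigma=1.0, threshold=0.5):
    """image: HxWxC float array in [0, 1]; saliency: HxW float array in [0, 1]."""
    denoised = np.stack([gaussian_filter(image[..., c], sigma)
                         for c in range(image.shape[-1])], axis=-1)
    weight = (saliency >= threshold).astype(np.float64)[..., None]
    # Clean only the salient regions; keep the rest of the image untouched.
    return weight * denoised + (1.0 - weight) * image

rng = np.random.default_rng(2)
img = rng.random((64, 64, 3))
sal = rng.random((64, 64))
print(saliency_targeted_clean(img, sal).shape)
```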