Few-Cost Salient Object Detection with Adversarial-Paced Learning
Detecting and segmenting salient objects from given image scenes has received
great attention in recent years. A fundamental challenge in training the
existing deep saliency detection models is the requirement of large amounts of
annotated data. While gathering large quantities of training data becomes cheap
and easy, annotating the data is an expensive process in terms of time, labor
and human expertise. To address this problem, this paper proposes to learn an
effective salient object detection model from manual annotations on only a few
training images, dramatically reducing the human labor needed to train such
models. To this end, we name this task few-cost salient object detection
and propose an adversarial-paced learning (APL)-based framework to facilitate
the few-cost learning scenario. Essentially, APL is derived from the self-paced
learning (SPL) regime but it infers the robust learning pace through the
data-driven adversarial learning mechanism rather than the heuristic design of
the learning regularizer. Comprehensive experiments on four widely-used
benchmark datasets demonstrate that the proposed method can effectively
approach the performance of existing supervised deep salient object detection
models with only 1k human-annotated training images. The project page is available at
https://github.com/hb-stone/FC-SOD
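To make the contrast concrete: classic self-paced learning (SPL) selects "easy" samples with a hand-crafted pace regularizer, which in its simplest hard form reduces to thresholding per-sample losses. The sketch below shows that heuristic rule, which the paper's APL replaces with an adversarially learned pace; the function name and threshold are illustrative, not taken from the paper's code.

```python
import numpy as np

def spl_weights(losses, lam):
    """Hard self-paced weighting: a sample is included (weight 1) only if
    its current loss is below the pace parameter lambda; the pace is then
    grown over training. APL instead infers these weights with a
    data-driven adversarial mechanism rather than this heuristic rule."""
    return (np.asarray(losses) < lam).astype(float)

# Easy samples (low loss) enter training first; hard ones wait.
losses = np.array([0.2, 1.5, 0.4, 3.0])
print(spl_weights(losses, lam=1.0))  # → [1. 0. 1. 0.]
```

As lambda increases across epochs, progressively harder samples are admitted, which is the "pace" that APL learns instead of scheduling by hand.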
LeNo: Adversarial Robust Salient Object Detection Networks with Learnable Noise
Pixel-wise prediction with deep neural networks has become an effective
paradigm for salient object detection (SOD) and has achieved remarkable performance.
However, very few SOD models are robust against adversarial attacks, which are
visually imperceptible to human visual attention. A previous work, robust
salient object detection against adversarial attacks (ROSA), shuffles
pre-segmented superpixels and then refines the coarse saliency map with a
densely connected CRF. Unlike ROSA, which relies on various pre- and
post-processing steps, this paper proposes a light-weight Learnable Noise
(LeNo) to defend SOD models against adversarial attacks. LeNo preserves the accuracy of SOD
models on both adversarial and clean images, as well as inference speed. In
general, LeNo consists of a simple shallow noise and a noise estimation module
embedded in the encoder and decoder of arbitrary SOD networks, respectively.
Inspired by the center prior of human visual attention mechanism, we initialize
the shallow noise with a cross-shaped Gaussian distribution for better defense
against adversarial attacks. Instead of adding additional network components
for post-processing, the proposed noise estimation modifies only one channel of
the decoder. With the deeply-supervised noise-decoupled training on
state-of-the-art RGB and RGB-D SOD networks, LeNo outperforms previous works
not only on adversarial images but also on clean images, contributing stronger
robustness for SOD.
Comment: 8 pages, 5 figures, submitted to AAA
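One way to realize the cross-shaped Gaussian initialization described above is to take the union of a vertical and a horizontal Gaussian band centered in the image, echoing the center prior of human attention. This is a minimal sketch under that assumption; the paper's exact parameterization (sigma, normalization, channel layout) may differ.

```python
import numpy as np

def cross_gaussian_noise(h, w, sigma=0.2):
    """Build an (h, w) noise map that is bright along the central row and
    central column (a cross), falling off as a Gaussian with distance.
    Hypothetical sketch of LeNo's cross-shaped Gaussian initialization."""
    # Normalized pixel offsets from the image center, in [-0.5, 0.5].
    ys = np.arange(h)[:, None] / (h - 1) - 0.5
    xs = np.arange(w)[None, :] / (w - 1) - 0.5
    vert = np.exp(-(xs ** 2) / (2 * sigma ** 2))   # bright central column band
    horiz = np.exp(-(ys ** 2) / (2 * sigma ** 2))  # bright central row band
    # Union of the two bands forms the cross; broadcasting yields (h, w).
    return np.maximum(vert, horiz)

noise = cross_gaussian_noise(64, 64)
print(noise.shape)  # → (64, 64), peaked along the center cross
```

In the full model this map would initialize the learnable noise injected into the encoder, then be refined by training rather than kept fixed.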
How is Gaze Influenced by Image Transformations? Dataset and Model
Data size is the bottleneck for developing deep saliency models, because
collecting eye-movement data is very time consuming and expensive. Most
current studies on human attention and saliency modeling have used
high-quality stereotyped stimuli. In the real world, however, captured images undergo various
types of transformations. Can we use these transformations to augment existing
saliency datasets? Here, we first create a novel saliency dataset including
fixations of 10 observers over 1900 images degraded by 19 types of
transformations. Second, by analyzing eye movements, we find that observers
look at different locations over transformed versus original images. Third, we
utilize the new data over transformed images, called data augmentation
transformation (DAT), to train deep saliency models. We find that label
preserving DATs with negligible impact on human gaze boost saliency prediction,
whereas some other DATs that severely impact human gaze degrade the
performance. These label preserving valid augmentation transformations provide
a solution to enlarge existing saliency datasets. Finally, we introduce a novel
saliency model based on generative adversarial network (dubbed GazeGAN). A
modified UNet is proposed as the generator of the GazeGAN, which combines
classic skip connections with a novel center-surround connection (CSC), in
order to leverage multi-level features. We also propose a histogram loss based
on Alternative Chi Square Distance (ACS HistLoss) to refine the saliency map in
terms of luminance distribution. Extensive experiments and comparisons over 3
datasets indicate that GazeGAN achieves the best performance in terms of
popular saliency evaluation metrics, and is more robust to various
perturbations. Our code and data are available at:
https://github.com/CZHQuality/Sal-CFS-GAN
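The histogram loss above compares the luminance distributions of predicted and ground-truth saliency maps. A minimal sketch, assuming the common alternative chi-square form `2 * sum((h1 - h2)^2 / (h1 + h2))` over normalized histograms; the paper's exact bin count and normalization are not given here, so these are illustrative choices.

```python
import numpy as np

def acs_hist_loss(pred, target, bins=32, eps=1e-8):
    """Alternative Chi-Square distance between the luminance histograms of
    a predicted and a ground-truth saliency map (values assumed in [0, 1]).
    Sketch of the ACS HistLoss idea; exact constants are assumptions."""
    h1, _ = np.histogram(pred, bins=bins, range=(0.0, 1.0))
    h2, _ = np.histogram(target, bins=bins, range=(0.0, 1.0))
    h1 = h1 / max(h1.sum(), 1)  # normalize counts to probability mass
    h2 = h2 / max(h2.sum(), 1)
    # eps avoids division by zero in bins that are empty in both maps.
    return 2.0 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

pred = np.random.rand(64, 64)
print(acs_hist_loss(pred, pred))  # → 0.0 for identical maps
```

Because the loss depends only on the histogram, it penalizes mismatched luminance distributions (e.g. over-dark predictions) without constraining where the salient pixels sit, complementing the per-pixel GAN losses.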