Explaining Classifiers using Adversarial Perturbations on the Perceptual Ball
We present a simple regularization of adversarial perturbations based upon
the perceptual loss. While the resulting perturbations remain imperceptible to
the human eye, they differ from existing adversarial perturbations in that they
are semi-sparse alterations that highlight objects and regions of interest
while leaving the background unaltered. As semantically meaningful adversarial
perturbations, they form a bridge between counterfactual explanations and
adversarial perturbations in the space of images. We evaluate our approach on
several standard explainability benchmarks, namely, weak localization,
insertion-deletion, and the pointing game, demonstrating that perceptually
regularized counterfactuals are an effective explanation for image-based
classifiers. Comment: CVPR 202
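The core idea above, steering an input toward a different class while a perceptual (feature-space) penalty keeps the change near-imperceptible, can be illustrated with a toy sketch. Everything here is an assumption for illustration: a random linear map `W` stands in for the classifier's logits and another linear map `F` for the perceptual feature extractor; the paper's actual method uses deep network features and backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumptions): a linear "classifier" W and a linear
# "feature extractor" F play the roles of the network's logits and
# intermediate perceptual features.
d, k = 16, 3
W = rng.normal(size=(k, d))   # logits = W @ x
F = rng.normal(size=(8, d))   # perceptual features = F @ x

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def objective(x, delta, target, lam):
    """Cross-entropy toward the target class plus a perceptual penalty
    keeping features of x + delta close to those of x."""
    p = softmax(W @ (x + delta))
    ce = -np.log(p[target] + 1e-12)
    perceptual = np.sum((F @ (x + delta) - F @ x) ** 2)
    return ce + lam * perceptual

def perturb(x, target, lam=0.1, lr=0.01, steps=150, eps=1e-4):
    """Gradient descent on the perturbation via central differences
    (a real implementation would backpropagate instead)."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        g = np.zeros_like(delta)
        for i in range(delta.size):
            e = np.zeros_like(delta)
            e[i] = eps
            g[i] = (objective(x, delta + e, target, lam)
                    - objective(x, delta - e, target, lam)) / (2 * eps)
        delta -= lr * g
    return delta

x = rng.normal(size=d)
orig_class = int(np.argmax(W @ x))
target = (orig_class + 1) % k   # push toward some other class
delta = perturb(x, target)
```

Because the objective is convex in the perturbation for these linear stand-ins, plain gradient descent reliably lowers it, and the resulting `delta` concentrates in directions the feature map `F` is least sensitive to, mirroring how the perceptual loss confines changes to semantically relevant regions.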
LeNo: Adversarial Robust Salient Object Detection Networks with Learnable Noise
Pixel-wise prediction with deep neural networks has become an effective
paradigm for salient object detection (SOD) and achieved remarkable performance.
However, very few SOD models are robust against adversarial attacks that are
visually imperceptible to humans. A previous approach, robust salient object
detection against adversarial attacks (ROSA), shuffles pre-segmented
superpixels and then refines the coarse saliency map with a densely connected
CRF. Unlike ROSA, which relies on various pre- and post-processing steps, this
paper proposes a light-weight Learnable Noise (LeNo) to defend SOD models
against adversarial attacks. LeNo preserves the accuracy of SOD
models on both adversarial and clean images, as well as inference speed. In
general, LeNo consists of a simple shallow noise and a noise estimation module
that are embedded in the encoder and decoder of an arbitrary SOD network, respectively.
Inspired by the center prior of human visual attention mechanism, we initialize
the shallow noise with a cross-shaped Gaussian distribution for better defense
against adversarial attacks. Instead of adding additional network components
for post-processing, the proposed noise estimation modifies only one channel of
the decoder. With the deeply-supervised noise-decoupled training on
state-of-the-art RGB and RGB-D SOD networks, LeNo outperforms previous works
not only on adversarial images but also on clean images, contributing
stronger robustness for SOD. Comment: 8 pages, 5 figures, submitted to AAA
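The cross-shaped Gaussian initialization described above can be sketched directly: noise amplitude peaks along the image's central row and column (the center prior of human visual attention) and decays toward the corners. The exact parameterization below (`sigma`, `scale`, distance to the nearest central axis) is an assumption; the abstract only states the cross-shaped Gaussian idea.

```python
import numpy as np

def cross_gaussian_noise(h, w, sigma=0.15, scale=0.1, seed=0):
    """Initialize a shallow-noise map with a cross-shaped Gaussian
    amplitude envelope: strongest along the central row and column,
    decaying toward the corners."""
    ys = np.arange(h) / (h - 1) - 0.5   # normalized coordinates in [-0.5, 0.5]
    xs = np.arange(w) / (w - 1) - 0.5
    dy = np.abs(ys)[:, None]            # distance to the central row
    dx = np.abs(xs)[None, :]            # distance to the central column
    # distance to the NEAREST central axis produces the cross shape
    envelope = np.exp(-np.minimum(dx, dy) ** 2 / (2 * sigma ** 2))
    rng = np.random.default_rng(seed)
    return scale * envelope * rng.standard_normal((h, w))

noise = cross_gaussian_noise(101, 101)
```

With this envelope, samples along the central row carry visibly larger magnitudes than samples along the border rows, which is the center-prior bias the initialization is meant to encode.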
Memory-aided Contrastive Consensus Learning for Co-salient Object Detection
Co-Salient Object Detection (CoSOD) aims at detecting common salient objects
within a group of relevant source images. Most of the latest works employ the
attention mechanism for finding common objects. To achieve accurate CoSOD
results with high-quality maps and high efficiency, we propose a novel
Memory-aided Contrastive Consensus Learning (MCCL) framework, which is capable
of effectively detecting co-salient objects in real time (~150 fps). To learn
better group consensus, we propose the Group Consensus Aggregation Module
(GCAM) to abstract the common features of each image group; meanwhile, to make
the consensus representation more discriminative, we introduce the Memory-based
Contrastive Module (MCM), which saves and updates the consensus of images from
different groups in a queue of memories. Finally, to improve the quality and
integrity of the predicted maps, we develop an Adversarial Integrity Learning
(AIL) strategy to make the segmented regions more likely to be composed of complete
objects with less surrounding noise. Extensive experiments on all the latest
CoSOD benchmarks demonstrate that our lite MCCL outperforms 13 cutting-edge
models, achieving the new state of the art (~5.9% and ~6.2% improvement in
S-measure on CoSOD3k and CoSal2015, respectively). Our source codes, saliency
maps, and online demos are publicly available at
https://github.com/ZhengPeng7/MCCL. Comment: AAAI 202
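A memory-based contrastive module of the kind MCM describes can be sketched as a bounded queue holding the latest consensus embedding per image group, scored with an InfoNCE-style loss that is low when a new consensus matches its own group's stored entry and high when it matches another group's. The class name, interface, and details below are assumptions for illustration, not MCM's exact design.

```python
import numpy as np
from collections import deque

class ConsensusMemory:
    """Toy memory-based contrastive module: a bounded queue of
    (group_id, unit-normalized consensus embedding) pairs."""

    def __init__(self, capacity=8):
        self.queue = deque(maxlen=capacity)

    def update(self, group_id, emb):
        emb = emb / (np.linalg.norm(emb) + 1e-8)
        # drop any stale entry for this group, then enqueue the fresh one
        kept = [(g, e) for g, e in self.queue if g != group_id]
        self.queue = deque(kept, maxlen=self.queue.maxlen)
        self.queue.append((group_id, emb))

    def contrastive_loss(self, group_id, emb, tau=0.1):
        """InfoNCE-style score: entries from the same group are
        positives, everything else in the queue is a negative."""
        emb = emb / (np.linalg.norm(emb) + 1e-8)
        pos, total = 0.0, 0.0
        for g, e in self.queue:
            s = np.exp(float(emb @ e) / tau)
            total += s
            if g == group_id:
                pos += s
        if pos == 0.0 or total == 0.0:
            return 0.0
        return -np.log(pos / total)   # low when emb matches its own group

mem = ConsensusMemory()
mem.update(0, np.array([1.0, 0.0, 0.0, 0.0]))
mem.update(1, np.array([0.0, 1.0, 0.0, 0.0]))
```

Minimizing this loss pulls each group's consensus toward its own stored representation and away from those of other groups, which is what makes the consensus representation more discriminative.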
Attentive Single-Tasking of Multiple Tasks
In this work we address task interference in universal networks by
considering that a network is trained on multiple tasks, but performs one task
at a time, an approach we refer to as "single-tasking multiple tasks". The
network thus modifies its behaviour through task-dependent feature adaptation,
or task attention. This gives the network the ability to accentuate the
features that are adapted to a task, while shunning irrelevant ones. We further
reduce task interference by forcing the task gradients to be statistically
indistinguishable through adversarial training, ensuring that the common
backbone architecture serving all tasks is not dominated by any of the
task-specific gradients. Results in three multi-task dense labelling problems
consistently show: (i) a large reduction in the number of parameters while
preserving, or even improving performance and (ii) a smooth trade-off between
computation and multi-task accuracy. We provide our system's code and
pre-trained models at http://vision.ee.ethz.ch/~kmaninis/astmt/. Comment: CVPR 2019 Camera Read
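Making task gradients statistically indistinguishable via adversarial training is commonly implemented with a gradient-reversal scheme: a discriminator tries to identify which task a backbone gradient came from, and the sign of its input-gradient is flipped before it reaches the backbone. The sketch below assumes a linear discriminator `D` and hand-rolled softmax cross-entropy; it is an illustrative formulation, not the paper's code.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def adversarial_backbone_signal(grad, task_id, D):
    """Gradient-reversal sketch: the discriminator D classifies which
    task the backbone gradient `grad` came from; negating the
    discriminator's input-gradient pushes the backbone toward
    task-indistinguishable gradients."""
    p = softmax(D @ grad)           # discriminator's task posterior
    dlogits = p.copy()
    dlogits[task_id] -= 1.0         # d(cross-entropy)/d(logits) = p - one_hot
    dgrad = D.T @ dlogits           # gradient w.r.t. the input gradient
    return -dgrad                   # the reversal: negate before backprop

# Hypothetical example: 3 tasks, 5-dimensional backbone gradients.
D = np.eye(3, 5)
g = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
signal = adversarial_backbone_signal(g, 0, D)
```

Following the discriminator's gradient would sharpen its task prediction; following the negated signal instead degrades it, so the backbone is driven toward gradients the discriminator cannot tell apart, which is the indistinguishability objective the abstract describes.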
- …