Interpreting Adversarially Trained Convolutional Neural Networks
We attempt to interpret how adversarially trained convolutional neural
networks (AT-CNNs) recognize objects. We design systematic approaches to
interpret AT-CNNs in both qualitative and quantitative ways and compare them
with normally trained models. Surprisingly, we find that adversarial training
alleviates the texture bias of standard CNNs when trained on object recognition
tasks, and helps CNNs learn a more shape-biased representation. We validate our
hypothesis in two ways. First, we compare the saliency maps of AT-CNNs and
standard CNNs on clean images and images under different transformations. The
comparison visually shows that the predictions of the two types of CNNs are
sensitive to dramatically different types of features. Second, for quantitative
verification, we construct additional test datasets that destroy either
textures or shapes, such as style-transferred versions of clean data, saturated
images, and patch-shuffled ones, and then evaluate the classification accuracy
of AT-CNNs and standard CNNs on these datasets. Our findings shed some light on
why AT-CNNs are more robust than normally trained ones and contribute to a
better understanding of adversarial training over CNNs from an interpretation
perspective.

Comment: To appear in ICML 2019
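The patch-shuffling transformation used in the quantitative tests above is easy to reproduce. Below is a minimal NumPy sketch (the grid size k and the helper name are illustrative, not the paper's exact setup): shuffling a k x k grid of patches destroys global shape while largely preserving local texture, so a texture-biased classifier should lose much less accuracy on the shuffled images than a shape-biased one.

```python
import random

import numpy as np

def patch_shuffle(image: np.ndarray, k: int = 4, seed: int = 0) -> np.ndarray:
    """Split an HxWxC image into a k x k grid of patches and shuffle them.

    The result keeps local texture statistics but destroys global shape,
    which is what a shape-vs-texture evaluation needs.
    """
    h, w = image.shape[:2]
    ph, pw = h // k, w // k
    image = image[: ph * k, : pw * k]  # crop so the grid divides evenly
    patches = [
        image[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
        for i in range(k) for j in range(k)
    ]
    random.Random(seed).shuffle(patches)
    rows = [np.concatenate(patches[r * k:(r + 1) * k], axis=1) for r in range(k)]
    return np.concatenate(rows, axis=0)
```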
Robust Semantic Segmentation: Strong Adversarial Attacks and Fast Training of Robust Models
While a large amount of work has focused on designing adversarial attacks
against image classifiers, only a few methods exist to attack semantic
segmentation models. We show that attacking segmentation models presents
task-specific challenges, for which we propose novel solutions. Our final
evaluation protocol outperforms existing methods and shows that they can
overestimate the robustness of the models. Additionally, adversarial training,
the most successful way of obtaining robust image classifiers, has so far not
been successfully applied to semantic segmentation. We argue that this is
because the task to be learned is more challenging, and requires significantly
higher computational effort than for image classification. As a remedy, we show
that by taking advantage of recent advances in robust ImageNet classifiers, one
can train adversarially robust segmentation models at limited computational
cost by fine-tuning robust backbones.
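As a rough illustration of this recipe, the PyTorch sketch below performs one adversarial fine-tuning step on a segmentation model, using a multi-step L-inf PGD attack on the per-pixel cross-entropy. It is a minimal sketch under stated assumptions, not the paper's implementation: the model is assumed to map images in [0, 1] to per-pixel logits of shape (N, classes, H, W), the attack radius, step size, and step count are illustrative defaults, and in the regime described above the backbone would additionally be initialized from a robust ImageNet classifier.

```python
import torch
import torch.nn.functional as F

def pgd_attack_seg(model, images, masks, eps=8/255, alpha=2/255, steps=3):
    """L-inf PGD against the per-pixel cross-entropy of a segmentation model."""
    adv = (images + torch.empty_like(images).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        adv = adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(adv), masks, ignore_index=255)
        grad, = torch.autograd.grad(loss, adv)
        adv = adv.detach() + alpha * grad.sign()  # ascend the loss
        adv = (images + (adv - images).clamp(-eps, eps)).clamp(0, 1)
    return adv.detach()

def adv_finetune_step(model, optimizer, images, masks):
    """One adversarial fine-tuning step on PGD examples."""
    model.eval()  # keep BatchNorm statistics fixed while crafting the attack
    adv = pgd_attack_seg(model, images, masks)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(adv), masks, ignore_index=255)
    loss.backward()
    optimizer.step()
    return loss.item()
```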
Natural & Adversarial Bokeh Rendering via Circle-of-Confusion Predictive Network
The bokeh effect is a natural shallow depth-of-field phenomenon that blurs the
out-of-focus parts of a photograph. In recent years, a series of works has
proposed automatic and realistic bokeh rendering methods for artistic and
aesthetic purposes. They usually employ cutting-edge data-driven deep
generative networks with complex training strategies and network architectures.
However, these works neglect that the bokeh effect, as a real physical
phenomenon, inevitably affects downstream visual intelligence tasks such as
recognition, and their data-driven nature prevents them from studying the
influence of bokeh-related physical parameters (i.e., depth of field) on those
tasks. To fill this gap, we study a new problem, natural &
adversarial bokeh rendering, which consists of two objectives: rendering
realistic and natural bokeh and fooling the visual perception models (i.e.,
bokeh-based adversarial attack). To this end, beyond the pure data-driven
solution, we propose a hybrid alternative by taking the respective advantages
of data-driven and physical-aware methods. Specifically, we propose the
circle-of-confusion predictive network (CoCNet) by taking the all-in-focus
image and depth image as inputs to estimate circle-of-confusion parameters for
each pixel, which are employed to render the final image through a well-known
physical model of bokeh. With this hybrid solution, our method achieves more
realistic rendering results with a naive training strategy and a much lighter
network.

Comment: 11 pages, accepted by TMM
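The physical model driving the final rendering step is not spelled out in the abstract; a standard choice is the thin-lens circle-of-confusion relation, sketched below with illustrative parameter names. It maps a per-pixel depth value to a CoC diameter, which a renderer can then use (after scaling into pixel units) as the blur-disc size for that pixel.

```python
import numpy as np

def coc_diameter(depth, focus_dist, focal_len, aperture):
    """Per-pixel circle-of-confusion diameter from the thin-lens model.

    depth, focus_dist, and focal_len share one unit (e.g. metres);
    aperture is the lens aperture diameter (focal_len / f-number).
    Thin-lens relation: c = A * |d - d_f| / d * f / (d_f - f).
    """
    return (aperture * np.abs(depth - focus_dist) / depth
            * focal_len / (focus_dist - focal_len))
```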