72 research outputs found
Exploring the Space of Adversarial Images
Adversarial examples have raised questions regarding the robustness and
security of deep neural networks. In this work we formalize the problem of
adversarial images given a pretrained classifier, showing that even in the
linear case the resulting optimization problem is nonconvex. We generate
adversarial images using shallow and deep classifiers on the MNIST and ImageNet
datasets. We probe the pixel space of adversarial images using noise of varying
intensity and distribution. We present novel visualizations that showcase the
phenomenon and its high variability. We show that adversarial images appear in
large regions of the pixel space, but that, for the same task, a shallow
classifier seems more robust to adversarial images than a deep convolutional
network.
Comment: Copyright 2016 IEEE. This manuscript was accepted at the IEEE
International Joint Conference on Neural Networks (IJCNN) 2016. We will link
the published version as soon as the DOI is available.
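The abstract notes that adversarial images can be found even for a linear classifier. A minimal sketch of that setting, assuming a fast-gradient-sign-style perturbation against the margin between the true class and its strongest rival (an illustrative stand-in, not the paper's actual optimization):

```python
def predict(weights, biases, x):
    """Return the index of the highest-scoring class under a linear model."""
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
              for w, b in zip(weights, biases)]
    return max(range(len(scores)), key=scores.__getitem__)

def fgsm_linear(weights, biases, x, label, eps):
    """Perturb each pixel by eps in the direction that shrinks the margin
    between the true class and the best competing class.

    For a linear model, the gradient of (true score - rival score) w.r.t. x
    is simply w_true - w_rival, so the sign step is exact.
    """
    others = [k for k in range(len(weights)) if k != label]
    rival = max(others, key=lambda k: sum(w_i * x_i
                                          for w_i, x_i in zip(weights[k], x))
                                      + biases[k])
    grad = [wt - wr for wt, wr in zip(weights[label], weights[rival])]
    # Step against the margin gradient, clamped to the valid pixel range [0, 1].
    sign = lambda g: 1 if g > 0 else -1 if g < 0 else 0
    return [min(1.0, max(0.0, x_i - eps * sign(g)))
            for x_i, g in zip(x, grad)]
```

For a two-class model with `weights = [[1, 0], [0, 1]]` and input `[0.6, 0.4]` (predicted class 0), a step of `eps = 0.2` yields `[0.4, 0.6]`, which flips the prediction to class 1.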
Learning Robust Representations of Text
Deep neural networks have achieved remarkable results across many language
processing tasks; however, these methods are highly sensitive to noise and
adversarial attacks. We present a regularization-based method for limiting
network sensitivity to its inputs, inspired by ideas from computer vision, thus
learning models that are more robust. Empirical evaluation over a range of
sentiment datasets with a convolutional neural network shows that, compared to
a baseline model and the dropout method, our method achieves superior
performance over noisy inputs and out-of-domain data.
Comment: 5 pages with 2 pages of references, 2 tables, 1 figure.
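The regularizer described above limits how strongly a network's output reacts to small input changes. A minimal sketch of that idea, assuming a finite-difference estimate of the squared input-gradient norm added to the task loss (`input_sensitivity_penalty` and `regularized_loss` are hypothetical helpers, not the paper's exact formulation):

```python
def input_sensitivity_penalty(f, x, delta=1e-4):
    """Finite-difference estimate of the sum of squared partial derivatives
    of a scalar-valued model f at input x.

    Penalizing this quantity discourages the model from changing sharply
    under small input perturbations (noise, adversarial attacks).
    """
    base = f(x)
    penalty = 0.0
    for i in range(len(x)):
        bumped = list(x)
        bumped[i] += delta
        penalty += ((f(bumped) - base) / delta) ** 2
    return penalty

def regularized_loss(task_loss, f, x, lam=0.1):
    """Total training objective: task loss plus lambda times sensitivity."""
    return task_loss + lam * input_sensitivity_penalty(f, x)
```

For a linear model `f(x) = 3*x[0] + 4*x[1]`, the penalty is 3² + 4² = 25 regardless of `x`; during training, the gradient of this term pushes the model toward flatter, less input-sensitive functions.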